datatable.update()

Create new or update existing columns within a frame.

This expression is intended to be used in the j position of a DT[i, j] call. It takes an arbitrary number of key/value pairs, each giving a column name and the expression that creates or updates that column.

Examples

    from datatable import dt, f, by, update

    DT = dt.Frame([range(5), [4, 3, 9, 11, -1]], names=("A", "B"))
    DT

       |     A      B
       | int32  int32
    -- + -----  -----
     0 |     0      4
     1 |     1      3
     2 |     2      9
     3 |     3     11
     4 |     4     -1

Create new columns and update existing columns:

    DT[:, update(C = f.A * 2, D = f.B // 3, A = f.A * 4, B = f.B + 1)]
    DT

       |     A      B      C      D
       | int32  int32  int32  int32
    -- + -----  -----  -----  -----
     0 |     0      5      0      1
     1 |     4      4      2      1
     2 |     8     10      4      3
     3 |    12     12      6      3
     4 |    16      0      8     -1

Add a new column with dictionary unpacking. This is handy for adding columns dynamically with a dictionary comprehension, or when the column names are not valid Python identifiers:

    DT[:, update(**{"extra column": f.A + f.B + f.C + f.D})]
    DT

       |     A      B      C      D  extra column
       | int32  int32  int32  int32         int32
    -- + -----  -----  -----  -----  ------------
     0 |     0      5      0      1             6
     1 |     4      4      2      1            11
     2 |     8     10      4      3            25
     3 |    12     12      6      3            33
     4 |    16      0      8     -1            23

You can update a subset of data:

    DT[f.A > 10, update(A = f.A * 5)]
    DT

       |     A      B      C      D  extra column
       | int32  int32  int32  int32         int32
    -- + -----  -----  -----  -----  ------------
     0 |     0      5      0      1             6
     1 |     4      4      2      1            11
     2 |     8     10      4      3            25
     3 |    60     12      6      3            33
     4 |    80      0      8     -1            23

You can also add a new column or update an existing one in a groupby operation, similar to SQL window functions or pandas transform():

    df = dt.Frame("""exporter assets liabilities
                     False      5     1
                     True       10    8
                     False      3     1
                     False      24    20
                     False      40    2
                     True       12    11""")

    # Get the ratio for each row per group
    df[:, update(ratio = dt.sum(f.liabilities) * 100 / dt.sum(f.assets)), by(f.exporter)]
    df

       | exporter  assets  liabilities    ratio
       |    bool8   int32        int32  float64
    -- + --------  ------  -----------  -------
     0 |        0       5            1  33.3333
     1 |        1      10            8  86.3636
     2 |        0       3            1  33.3333
     3 |        0      24           20  33.3333
     4 |        0      40            2  33.3333
     5 |        1      12           11  86.3636
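
To see where the ratio values come from, the grouped update can be reproduced by hand. The following is a plain-Python sketch of the arithmetic only, not of how datatable implements it: the per-group sums are computed first, and each row then receives the value of its own group.

```python
from collections import defaultdict

# (exporter, assets, liabilities) rows, copied from the frame above
rows = [
    (False, 5, 1), (True, 10, 8), (False, 3, 1),
    (False, 24, 20), (False, 40, 2), (True, 12, 11),
]

# Sum assets and liabilities separately for each exporter group.
assets_total = defaultdict(int)
liabilities_total = defaultdict(int)
for exporter, assets, liabilities in rows:
    assets_total[exporter] += assets
    liabilities_total[exporter] += liabilities

# Give every row the ratio of its group, keeping the original row order.
ratios = [liabilities_total[e] * 100 / assets_total[e] for e, _, _ in rows]

print([round(r, 4) for r in ratios])
# [33.3333, 86.3636, 33.3333, 33.3333, 33.3333, 86.3636]
```

The result has one value per input row rather than one per group, which is the behaviour the window-function and transform() comparison refers to.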