datatable.cumsum()

Added in version 1.1.0

For each column from cols, calculate the cumulative sum. Missing values are treated as zero, so they leave the running total unchanged. In the presence of by(), the cumulative sum is computed within each group.

Parameters

cols
FExpr

Input data for cumulative sum calculation.

reverse
bool

If False, computation is done from top to bottom. If True, it is done from bottom to top.

return
FExpr

An f-expression that converts the input columns into columns filled with their respective cumulative sums.

except
TypeError

The exception is raised when one of the columns from cols has a non-numeric type.

Examples

Create a sample datatable frame:

from datatable import dt, f, by

DT = dt.Frame({"A": [2, None, 5, -1, 0],
               "B": [None, None, None, None, None],
               "C": [5.4, 3, 2.2, 4.323, 3],
               "D": ['a', 'a', 'b', 'b', 'b']})

   |     A     B        C  D
   | int32  void  float64  str32
-- + -----  ----  -------  -----
 0 |     2    NA    5.4    a
 1 |    NA    NA    3      a
 2 |     5    NA    2.2    b
 3 |    -1    NA    4.323  b
 4 |     0    NA    3      b

Calculate cumulative sum in a single column:

DT[:, dt.cumsum(f.A)]

   |     A
   | int64
-- + -----
 0 |     2
 1 |     2
 2 |     7
 3 |     6
 4 |     6
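
The handling of missing values in this result can be mirrored in plain Python. The sketch below is only an illustration of the documented semantics, not datatable's implementation; the helper name `cumsum_na_as_zero` is hypothetical and the input list copies column A from the sample frame.

```python
from itertools import accumulate

def cumsum_na_as_zero(values):
    """Running total of `values`, counting None (a missing value) as 0.

    Hypothetical helper for illustration; it is not part of datatable.
    """
    return list(accumulate(0 if v is None else v for v in values))

# Column A from the sample frame: the None in row 1 adds nothing,
# so the running total stays at 2 for that row.
column_a = [2, None, 5, -1, 0]
print(cumsum_na_as_zero(column_a))  # [2, 2, 7, 6, 6]
```

A column made up entirely of missing values therefore yields a column of zeros under this rule.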

Calculate the cumulative sum from bottom to top:

DT[:, dt.cumsum(f.A, reverse=True)]

   |     A
   | int64
-- + -----
 0 |     6
 1 |     4
 2 |     4
 3 |    -1
 4 |     0
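
One way to read the bottom-to-top result is as a running total accumulated from the last row upward, with each total written back to the row it was computed for. The following plain-Python sketch assumes that reading; `cumsum_sketch` is a hypothetical name and the code does not reflect how datatable computes the result internally.

```python
from itertools import accumulate

def cumsum_sketch(values, reverse=False):
    """Cumulative sum with None counted as 0.

    Hypothetical helper for illustration. With reverse=True the total is
    accumulated from the last element to the first, and the results are
    flipped back so each total sits at its original position.
    """
    filled = [0 if v is None else v for v in values]
    if reverse:
        return list(accumulate(reversed(filled)))[::-1]
    return list(accumulate(filled))

column_a = [2, None, 5, -1, 0]
print(cumsum_sketch(column_a, reverse=True))  # [6, 4, 4, -1, 0]
```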

Calculate cumulative sums in multiple columns:

DT[:, dt.cumsum(f[:-1])]

   |     A      B        C
   | int64  int64  float64
-- + -----  -----  -------
 0 |     2      0    5.4
 1 |     2      0    8.4
 2 |     7      0   10.6
 3 |     6      0   14.923
 4 |     6      0   17.923

For a grouped frame calculate cumulative sums within each group:

DT[:, dt.cumsum(f[:]), by('D')]

   | D          A      B        C
   | str32  int64  int64  float64
-- + -----  -----  -----  -------
 0 | a          2      0    5.4
 1 | a          2      0    8.4
 2 | b          5      0    2.2
 3 | b          4      0    6.523
 4 | b          4      0    9.523
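
The grouped result can be thought of as a separate running total kept for each value of the grouping column. The plain-Python sketch below illustrates that idea for column A grouped by D; `grouped_cumsum` is a hypothetical helper, and it keeps rows in their input order, which matches the output above only because the sample rows are already arranged by group.

```python
def grouped_cumsum(keys, values):
    """Cumulative sum of `values` restarted for every distinct key.

    Hypothetical helper for illustration; None is counted as 0.
    """
    totals = {}   # running total per group key
    result = []
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + (0 if value is None else value)
        result.append(totals[key])
    return result

column_d = ['a', 'a', 'b', 'b', 'b']
column_a = [2, None, 5, -1, 0]
print(grouped_cumsum(column_d, column_a))  # [2, 2, 5, 4, 4]
```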