datatable.cumsum()

datatable.cumsum(cols, reverse=False)

Added in version 1.1.0

For each column in cols, calculate the cumulative sum. Missing values are treated as zero. When by() is present, the cumulative sum is computed within each group.
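The handling of missing values can be modelled in plain Python. The sketch below is not datatable's implementation: `cumsum_model` is a hypothetical helper, built on `itertools.accumulate`, that substitutes zero for `None` to mirror the rule stated above.

```python
from itertools import accumulate


def cumsum_model(values):
    """Running total of `values`, counting None (missing) as zero.

    Hypothetical helper for illustration only; not part of datatable.
    """
    return list(accumulate(0 if v is None else v for v in values))


print(cumsum_model([2, None, 5, -1, 0]))  # [2, 2, 7, 6, 6]
```

A missing entry leaves the running total unchanged, so the second row repeats the first.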

Parameters

cols

FExpr

Input data for cumulative sum calculation.

reverse

bool

If False, computation is done from top to bottom. If True, it is done from bottom to top.

return

FExpr

An f-expression that converts the input columns into columns containing the respective cumulative sums.

except

TypeError

The exception is raised when one of the columns from cols has a non-numeric type.
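As a rough illustration of how the reverse flag and the type check interact, the following plain-Python sketch models the documented behaviour on a single list. It is an assumption-laden model, not datatable code: `cumsum_reverse_model` is a hypothetical name, and the check on `numbers.Number` merely stands in for datatable's column-type validation.

```python
from itertools import accumulate
from numbers import Number


def cumsum_reverse_model(values, reverse=False):
    """Hypothetical model of a cumulative sum with an optional direction.

    Not part of datatable; None is counted as zero, and any other
    non-numeric value triggers a TypeError.
    """
    for v in values:
        if v is not None and not isinstance(v, Number):
            raise TypeError(f"non-numeric value: {v!r}")
    filled = [0 if v is None else v for v in values]
    if reverse:
        # Accumulate from the last row upwards, then restore row order.
        return list(accumulate(reversed(filled)))[::-1]
    return list(accumulate(filled))


print(cumsum_reverse_model([2, None, 5, -1, 0]))                # [2, 2, 7, 6, 6]
print(cumsum_reverse_model([2, None, 5, -1, 0], reverse=True))  # [6, 4, 4, -1, 0]
```

With reverse=True the first row holds the total of the whole column and the last row holds only its own value.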

Examples

Create a sample datatable frame:

from datatable import dt, f, by
DT = dt.Frame({"A": [2, None, 5, -1, 0],
               "B": [None, None, None, None, None],
               "C": [5.4, 3, 2.2, 4.323, 3],
               "D": ['a', 'a', 'b', 'b', 'b']})
	A	B	C	D
	int32	void	float64	str32
0	2	NA	5.4	a
1	NA	NA	3	a
2	5	NA	2.2	b
3	-1	NA	4.323	b
4	0	NA	3	b
5 rows × 4 columns

Calculate cumulative sum in a single column:

DT[:, dt.cumsum(f.A)]
	A
	int64
0	2
1	2
2	7
3	6
4	6
5 rows × 1 column

Calculate the cumulative sum from bottom to top:

DT[:, dt.cumsum(f.A, reverse=True)]
	A
	int64
0	6
1	4
2	4
3	-1
4	0
5 rows × 1 column

Calculate cumulative sums in multiple columns:

DT[:, dt.cumsum(f[:-1])]
	A	B	C
	int64	int64	float64
0	2	0	5.4
1	2	0	8.4
2	7	0	10.6
3	6	0	14.923
4	6	0	17.923
5 rows × 3 columns

For a grouped frame calculate cumulative sums within each group:

DT[:, dt.cumsum(f[:]), by('D')]
	D	A	B	C
	str32	int64	int64	float64
0	a	2	0	5.4
1	a	2	0	8.4
2	b	5	0	2.2
3	b	4	0	6.523
4	b	4	0	9.523
5 rows × 4 columns
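The per-group restart can also be modelled without datatable. The sketch below is a plain-Python approximation under stated assumptions: `grouped_cumsum_model` is a hypothetical helper that keeps one running total per key and preserves the input row order. A grouped frame lists rows group by group instead; the two coincide here only because column D is already sorted.

```python
from collections import defaultdict


def grouped_cumsum_model(keys, values):
    """Hypothetical model: running total kept separately for each key.

    Not part of datatable; None is counted as zero.
    """
    totals = defaultdict(int)
    out = []
    for key, value in zip(keys, values):
        totals[key] += 0 if value is None else value
        out.append(totals[key])
    return out


# Columns D and A from the sample frame.
print(grouped_cumsum_model(["a", "a", "b", "b", "b"], [2, None, 5, -1, 0]))
# [2, 2, 5, 4, 4]
```

The total restarts at the first row of group b, which is why that row shows 5 rather than 7.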
