datatable.mean()

datatable.mean(cols)

Calculate the mean value for each column from cols.

Parameters

cols : Expr
    Input columns.

return : Expr
    f-expression having one row, and the same names and number of columns as in cols. The column stypes are float32 for float32 columns, and float64 for all the other numeric types.

except : TypeError
    The exception is raised when one of the columns from cols has a non-numeric type.

Examples

from datatable import dt, f, by

df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})

df
     A      B      C
     int32  int32  int32
0    1      NA     1
1    1      2      2
2    2      3      1
3    1      4      1
4    2      5      2
5 rows × 3 columns

Get the mean from column A:

df[:, dt.mean(f.A)]
     A
     float64
0    1.4
1 row × 1 column

Get the mean of multiple columns:

df[:, dt.mean([f.A, f.B])]
     A        B
     float64  float64
0    1.4      3.5
1 row × 2 columns

Same as above, but applying to a column slice:

df[:, dt.mean(f[:2])]
     A        B
     float64  float64
0    1.4      3.5
1 row × 2 columns
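
The value 3.5 for column B shows that the missing entry is left out of the average: (2 + 3 + 4 + 5) / 4 = 3.5. The plain-Python sketch below reproduces the two numbers by hand using only the standard library. It is an illustration of the arithmetic, not datatable's implementation; the helper name mean_skipping_na is hypothetical, and returning None for a column with no values is an assumption made for this sketch.

```python
from statistics import fmean

# Column data copied from the example frame; None stands in for NA.
A = [1, 1, 2, 1, 2]
B = [None, 2, 3, 4, 5]

def mean_skipping_na(values):
    """Average the non-missing entries (hypothetical helper).

    Returning None when nothing is left is an assumption of this sketch.
    """
    present = [v for v in values if v is not None]
    return fmean(present) if present else None

mean_a = mean_skipping_na(A)  # 7 / 5
mean_b = mean_skipping_na(B)  # 14 / 4, the None is not counted
print(mean_a, mean_b)
```

Note that the divisor for B is 4, not 5: the missing row does not count toward the number of observations.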

You can pass in a dictionary with new column names:

df[:, dt.mean({"A_mean": f.A, "C_avg": f.C})]
     A_mean   C_avg
     float64  float64
0    1.4      1.4
1 row × 2 columns

In the presence of by(), it returns the average of each column per group:

df[:, dt.mean({"A_mean": f.A, "B_mean": f.B}), by("C")]
     C      A_mean   B_mean
     int32  float64  float64
0    1      1.33333  3.5
1    2      1.5      3.5
2 rows × 3 columns
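
To check the grouped result by hand, the following standard-library sketch buckets the rows of the example frame by C and averages A and B within each bucket, again skipping the missing value. It only mirrors the arithmetic and makes no claim about how datatable evaluates by() internally; the names rows, groups, and group_means are introduced here for illustration.

```python
from collections import defaultdict
from statistics import fmean

# Rows of the example frame as (A, B, C); None stands in for NA.
rows = [(1, None, 1), (1, 2, 2), (2, 3, 1), (1, 4, 1), (2, 5, 2)]

# Bucket the A and B values by their C key, dropping missing entries.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in rows:
    if a is not None:
        groups[c]["A"].append(a)
    if b is not None:
        groups[c]["B"].append(b)

# Average each bucket. fmean raises on an empty list, which cannot
# happen with this data because every group has at least one value.
group_means = {
    c: {"A_mean": fmean(cols["A"]), "B_mean": fmean(cols["B"])}
    for c, cols in sorted(groups.items())
}

for c, means in group_means.items():
    print(c, round(means["A_mean"], 5), means["B_mean"])
```

For C = 1 the B average uses only the two non-missing values (3 and 4), which is why both groups end up with a B mean of 3.5.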
