datatable.mean()

Calculate the mean value for each column from cols.

Parameters

cols
Expr

Input columns.

return
Expr

f-expression having one row, and the same names and number of columns as in cols. The column stypes are float32 for float32 columns, and float64 for all the other numeric types.

except
TypeError

The exception is raised when one of the columns from cols has a non-numeric type.

See Also

  • median() – function to calculate median values.

  • sd() – function to calculate standard deviation.

Examples

from datatable import dt, f, by

df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df
   |     A      B      C
   | int32  int32  int32
-- + -----  -----  -----
 0 |     1     NA      1
 1 |     1      2      2
 2 |     2      3      1
 3 |     1      4      1
 4 |     2      5      2

Get the mean from column A:

df[:, dt.mean(f.A)]
   |       A
   | float64
-- + -------
 0 |     1.4

Get the mean of multiple columns:

df[:, dt.mean([f.A, f.B])]
   |       A        B
   | float64  float64
-- + -------  -------
 0 |     1.4      3.5

Same as above, but applying to a column slice:

df[:, dt.mean(f[:2])]
   |       A        B
   | float64  float64
-- + -------  -------
 0 |     1.4      3.5
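
The value 3.5 reported for column B is consistent with the missing entry being skipped rather than counted as zero. The sketch below re-derives both means in plain Python, without datatable. The helper mean_skip_na is a hypothetical name used only for this illustration, and the skip-NA behaviour is an assumption inferred from the output above, not something this page states explicitly.

```python
# Plain-Python cross-check of the column means shown above.
# mean_skip_na is a hypothetical helper, not part of datatable.

def mean_skip_na(values):
    """Average the non-None entries; return None if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)

A = [1, 1, 2, 1, 2]
B = [None, 2, 3, 4, 5]

mean_a = mean_skip_na(A)   # 7 / 5
mean_b = mean_skip_na(B)   # (2 + 3 + 4 + 5) / 4, the None is ignored
print(mean_a, mean_b)
```

If the missing value were treated as zero instead, the mean of B would be 14 / 5 = 2.8, which does not match the output above.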

You can pass in a dictionary with new column names:

df[:, dt.mean({"A_mean": f.A, "C_avg": f.C})]
   |  A_mean    C_avg
   | float64  float64
-- + -------  -------
 0 |     1.4      1.4

In the presence of by(), it returns the average of each column per group:

df[:, dt.mean({"A_mean": f.A, "B_mean": f.B}), by("C")]
   |     C   A_mean   B_mean
   | int32  float64  float64
-- + -----  -------  -------
 0 |     1  1.33333      3.5
 1 |     2      1.5      3.5
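
The grouped result can be checked by hand in the same way. The following is a minimal plain-Python sketch, not datatable code: group_means is a hypothetical helper, and skipping None values within each group is an assumption that matches the B_mean column above.

```python
from collections import defaultdict

A = [1, 1, 2, 1, 2]
B = [None, 2, 3, 4, 5]
C = [1, 2, 1, 1, 2]

def group_means(keys, values):
    """Map each key to the mean of its non-None values (hypothetical helper).

    A key whose values are all None does not appear in the result.
    """
    buckets = defaultdict(list)
    for key, value in zip(keys, values):
        if value is not None:
            buckets[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

a_by_c = group_means(C, A)   # C == 1 -> (1 + 2 + 1) / 3, C == 2 -> (1 + 2) / 2
b_by_c = group_means(C, B)   # C == 1 -> (3 + 4) / 2,     C == 2 -> (2 + 5) / 2
print(a_by_c)
print(b_by_c)
```

For C == 1 the A values are 1, 2, 1, giving 4 / 3, which the table above displays as 1.33333.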