datatable.mean()
Calculate the mean value for each column from cols.
Parameters
cols
Expr
Input columns.
return
Expr
f-expression having one row, and the same names and number of columns as in cols. The column stypes are float32 for float32 columns, and float64 for all the other numeric types.
except
TypeError
The exception is raised when one of the columns from cols has a non-numeric type.
Examples
from datatable import dt, f, by
df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df
| | A | B | C |
|---|---|---|---|
| | int32 | int32 | int32 |
| 0 | 1 | NA | 1 |
| 1 | 1 | 2 | 2 |
| 2 | 2 | 3 | 1 |
| 3 | 1 | 4 | 1 |
| 4 | 2 | 5 | 2 |
Get the mean from column A:
df[:, dt.mean(f.A)]
| | A |
|---|---|
| | float64 |
| 0 | 1.4 |
Get the mean of multiple columns:
df[:, dt.mean([f.A, f.B])]
| | A | B |
|---|---|---|
| | float64 | float64 |
| 0 | 1.4 | 3.5 |
Same as above, but applying to a column slice:
df[:, dt.mean(f[:2])]
| | A | B |
|---|---|---|
| | float64 | float64 |
| 0 | 1.4 | 3.5 |
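The value 3.5 for column B in the outputs above is what you get when the missing entry is left out of both the sum and the count: (2 + 3 + 4 + 5) / 4. The following is a minimal plain-Python sketch of that arithmetic, not datatable's implementation. The helper name mean_skip_na is hypothetical, None stands in for NA, and returning None for a column with no values at all is an assumption made for this sketch.

```python
from statistics import fmean

def mean_skip_na(values):
    """Average the non-missing entries of a column."""
    present = [v for v in values if v is not None]
    # Assumption of this sketch: an all-missing column yields None.
    return fmean(present) if present else None

# Same data as the example frame, with None standing in for NA.
columns = {
    "A": [1, 1, 2, 1, 2],
    "B": [None, 2, 3, 4, 5],
    "C": [1, 2, 1, 1, 2],
}

means = {name: mean_skip_na(vals) for name, vals in columns.items()}
print(means)  # {'A': 1.4, 'B': 3.5, 'C': 1.4}
```

Dividing by the full row count of 5 instead would give 2.8 for B, which does not match the output shown.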
You can pass in a dictionary with new column names:
df[:, dt.mean({"A_mean": f.A, "C_avg": f.C})]
| | A_mean | C_avg |
|---|---|---|
| | float64 | float64 |
| 0 | 1.4 | 1.4 |
In the presence of by(), it returns the average of each column per group:
df[:, dt.mean({"A_mean": f.A, "B_mean": f.B}), by("C")]
| | C | A_mean | B_mean |
|---|---|---|---|
| | int32 | float64 | float64 |
| 0 | 1 | 1.33333 | 3.5 |
| 1 | 2 | 1.5 | 3.5 |
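To see where the per-group numbers come from, the grouped result can be reproduced by hand. This is a plain-Python sketch of the arithmetic only, assuming missing values are skipped within each group. It does not use datatable, and the names mean_skip_na, groups, and grouped_means are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def mean_skip_na(values):
    """Average the non-missing entries; None if nothing is present (sketch assumption)."""
    present = [v for v in values if v is not None]
    return fmean(present) if present else None

# Same data as the example frame, with None standing in for NA.
A = [1, 1, 2, 1, 2]
B = [None, 2, 3, 4, 5]
C = [1, 2, 1, 1, 2]

# Collect the A and B values of each row under that row's C value.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in zip(A, B, C):
    groups[c]["A"].append(a)
    groups[c]["B"].append(b)

# One output row per distinct C value, in sorted key order.
grouped_means = {
    key: {"A_mean": mean_skip_na(g["A"]), "B_mean": mean_skip_na(g["B"])}
    for key, g in sorted(groups.items())
}

for key, row in grouped_means.items():
    print(key, row)
```

For C equal to 1, the A values are 1, 2, 1 (mean 4/3, shown as 1.33333) and the B values are NA, 3, 4 (mean 3.5 once the missing entry is skipped). For C equal to 2, the A values are 1, 2 and the B values are 2, 5.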
The content on this page is licensed under the Creative Commons Attribution 4.0 License (CC BY 4.0), and code samples are licensed under the MIT License.