datatable.median()¶
Calculate the median value for each column from cols.
Parameters¶
cols
Expr
Input columns.
return
Expr
f-expression having one row, and the same names, stypes and number of columns as in cols.
except
TypeError
The exception is raised when one of the columns from cols has a non-numeric type.
Examples¶
from datatable import dt, f, by
df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df
| | A | B | C |
|---|---|---|---|
| | int32 | int32 | int32 |
| 0 | 1 | NA | 1 |
| 1 | 1 | 2 | 2 |
| 2 | 2 | 3 | 1 |
| 3 | 1 | 4 | 1 |
| 4 | 2 | 5 | 2 |
Get the median from column A:
df[:, dt.median(f.A)]
| | A |
|---|---|
| | float64 |
| 0 | 1 |
Get the median of multiple columns:
df[:, dt.median([f.A, f.B])]
| | A | B |
|---|---|---|
| | float64 | float64 |
| 0 | 1 | 3.5 |
Same as above, but more convenient:
df[:, dt.median(f[:2])]
| | A | B |
|---|---|---|
| | float64 | float64 |
| 0 | 1 | 3.5 |
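The value 3.5 for column B is consistent with the missing entry being skipped and the two middle values of the remaining four being averaged. The following is a plain-Python cross-check of that arithmetic using the standard library, not datatable's own implementation; `None` stands in for NA, and the names `b`, `valid` and `b_median` are illustrative only.

```python
from statistics import median

# Column B from the example frame; None stands in for datatable's NA.
b = [None, 2, 3, 4, 5]

# Assumption: mirror the NA-skipping seen in the output above
# by dropping None before taking the median.
valid = [x for x in b if x is not None]

# Four values remain, so the median is the mean of the two middle ones:
# (3 + 4) / 2
b_median = median(valid)
print(b_median)  # 3.5
```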
You can pass in a dictionary with new column names:
df[:, dt.median({"A_median": f.A, "C_mid": f.C})]
| | A_median | C_mid |
|---|---|---|
| | float64 | float64 |
| 0 | 1 | 1 |
In the presence of by(), it returns the median of each column per group:
df[:, dt.median({"A_median": f.A, "B_median": f.B}), by("C")]
| | C | A_median | B_median |
|---|---|---|---|
| | int32 | float64 | float64 |
| 0 | 1 | 1 | 3.5 |
| 1 | 2 | 1.5 | 3.5 |
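To see where the per-group numbers come from, the grouping can be reproduced by hand. This is a minimal standard-library sketch, not how datatable computes the result; `group_medians` is a hypothetical helper, `None` stands in for NA, and returning `None` for a group with no valid values is an assumption made here for illustration.

```python
from collections import defaultdict
from statistics import median

# Columns of the example frame; None stands in for datatable's NA.
A = [1, 1, 2, 1, 2]
B = [None, 2, 3, 4, 5]
C = [1, 2, 1, 1, 2]

def group_medians(keys, values):
    """Hypothetical helper: median of `values` within each distinct key."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        bucket = groups[key]      # register the group even if the value is missing
        if value is not None:     # assumption: missing values are skipped
            bucket.append(value)
    # Groups are reported in sorted key order; an all-missing group yields None.
    return {key: (float(median(vals)) if vals else None)
            for key, vals in sorted(groups.items())}

a_median = group_medians(C, A)
b_median = group_medians(C, B)
print(a_median)  # {1: 1.0, 2: 1.5}
print(b_median)  # {1: 3.5, 2: 3.5}
```

For C = 1 the rows contribute A values 1, 2, 1 and B values NA, 3, 4; for C = 2 they contribute A values 1, 2 and B values 2, 5, which gives the medians shown in the table.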
The content on this page is licensed under the Creative Commons Attribution 4.0 License (CC BY 4.0), and code samples are licensed under the MIT License.