datatable.median()
Calculate the median value for each column from cols.
Parameters
cols
Expr: Input columns.
return
Expr: f-expression having one row, and the same names and number of columns as in cols. The examples on this page show int32 inputs producing float64 medians, so the stype is not always preserved.
except
TypeError: Raised when one of the columns from cols has a non-numeric type.
Examples
from datatable import dt, f, by
df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df
|   | A | B | C |
|---|---|---|---|
|   | int32 | int32 | int32 |
| 0 | 1 | NA | 1 |
| 1 | 1 | 2 | 2 |
| 2 | 2 | 3 | 1 |
| 3 | 1 | 4 | 1 |
| 4 | 2 | 5 | 2 |
Get the median from column A:
df[:, dt.median(f.A)]
|   | A |
|---|---|
|   | float64 |
| 0 | 1 |
Get the median of multiple columns:
df[:, dt.median([f.A, f.B])]
|   | A | B |
|---|---|---|
|   | float64 | float64 |
| 0 | 1 | 3.5 |
Same as above, but more convenient:
df[:, dt.median(f[:2])]
|   | A | B |
|---|---|---|
|   | float64 | float64 |
| 0 | 1 | 3.5 |
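The value 3.5 for column B implies that the NA entry is skipped rather than propagated. As a cross-check, here is a minimal plain-Python sketch that reproduces the two medians with the standard library. It does not use datatable; the helper `column_median` is hypothetical, and returning `None` for a column with no usable values is a choice of this sketch, not documented datatable behavior.

```python
from statistics import median

# Same data as columns A, B and C above; None stands in for NA.
data = {
    "A": [1, 1, 2, 1, 2],
    "B": [None, 2, 3, 4, 5],
    "C": [1, 2, 1, 1, 2],
}

def column_median(values):
    """Median of the non-missing values (hypothetical helper).

    Returns None when every value is missing; that fallback is an
    assumption of this sketch.
    """
    present = [v for v in values if v is not None]
    return float(median(present)) if present else None

# Medians of the first two columns, matching f[:2] in the example above.
medians = {name: column_median(data[name]) for name in ("A", "B")}
print(medians)  # {'A': 1.0, 'B': 3.5}
```

Column B has four non-missing values (2, 3, 4, 5), so its median is the mean of the two middle values, 3 and 4.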
You can pass in a dictionary with new column names:
df[:, dt.median({"A_median": f.A, "C_mid": f.C})]
|   | A_median | C_mid |
|---|---|---|
|   | float64 | float64 |
| 0 | 1 | 1 |
In the presence of by(), it returns the median of each column
per group:
df[:, dt.median({"A_median": f.A, "B_median": f.B}), by("C")]
|   | C | A_median | B_median |
|---|---|---|---|
|   | int32 | float64 | float64 |
| 0 | 1 | 1 | 3.5 |
| 1 | 2 | 1.5 | 3.5 |
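To see where the per-group numbers come from, the grouped result can be recomputed by hand. The following is a rough standard-library sketch, not datatable code: it buckets the rows by C, drops missing values, and takes the median of each bucket. The names `rows`, `groups` and `result` are illustrative only.

```python
from collections import defaultdict
from statistics import median

# Rows of the example frame as (A, B, C) tuples; None stands in for NA.
rows = list(zip([1, 1, 2, 1, 2],
                [None, 2, 3, 4, 5],
                [1, 2, 1, 1, 2]))

# Collect the non-missing A and B values for each distinct value of C.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in rows:
    if a is not None:
        groups[c]["A"].append(a)
    if b is not None:
        groups[c]["B"].append(b)

# One entry per group, ordered by the group key.
result = {
    c: {"A_median": float(median(cols["A"])),
        "B_median": float(median(cols["B"]))}
    for c, cols in sorted(groups.items())
}
print(result)
# {1: {'A_median': 1.0, 'B_median': 3.5}, 2: {'A_median': 1.5, 'B_median': 3.5}}
```

For C = 1 the B values are NA, 3 and 4, which leaves 3 and 4 and a median of 3.5; for C = 2 the A values are 1 and 2, giving 1.5.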
The content on this page is licensed under the Creative Commons Attribution 4.0 License (CC BY 4.0), and code samples are licensed under the MIT License.