datatable.median()

Calculate the median value for each column from cols.

Parameters

cols: Expr
    Input columns.

return: Expr
    f-expression having one row, and the same names, stypes and number of columns as in cols.

except: TypeError
    The exception is raised when one of the columns from cols has a non-numeric type.

See Also

  • mean() – function to calculate mean values.

  • sd() – function to calculate standard deviation.

Examples

from datatable import dt, f, by

df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df

   |     A      B      C
   | int32  int32  int32
-- + -----  -----  -----
 0 |     1     NA      1
 1 |     1      2      2
 2 |     2      3      1
 3 |     1      4      1
 4 |     2      5      2

Get the median from column A:

df[:, dt.median(f.A)]

   |       A
   | float64
-- + -------
 0 |       1

Get the median of multiple columns:

df[:, dt.median([f.A, f.B])]

   |       A        B
   | float64  float64
-- + -------  -------
 0 |       1      3.5

Same as above, but more convenient:

df[:, dt.median(f[:2])]

   |       A        B
   | float64  float64
-- + -------  -------
 0 |       1      3.5
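
The results above can be cross-checked without datatable. The following is a minimal sketch in plain Python, not part of the datatable API: `column_median` is a hypothetical helper, and it assumes that missing values (NA, written as `None` here) are skipped, which is what the 3.5 reported for column B suggests. Returning `None` for a column with no values is a choice made for this sketch only.

```python
from statistics import median

# Same data as the frame `df` above; None stands in for NA.
data = {
    "A": [1, 1, 2, 1, 2],
    "B": [None, 2, 3, 4, 5],
    "C": [1, 2, 1, 1, 2],
}

def column_median(values):
    """Hypothetical helper: median of the non-missing values as a float.

    Returns None when every value is missing (an assumption of this
    sketch, not a statement about datatable's behaviour).
    """
    present = [v for v in values if v is not None]
    return float(median(present)) if present else None

medians = {name: column_median(values) for name, values in data.items()}
print(medians)
```

For column B, the four non-missing values are 2, 3, 4, 5, so the median is the average of the two middle values, 3 and 4.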

You can pass in a dictionary with new column names:

df[:, dt.median({"A_median": f.A, "C_mid": f.C})]

   | A_median    C_mid
   |  float64  float64
-- + --------  -------
 0 |        1        1

In the presence of by(), it returns the median of each column per group:

df[:, dt.median({"A_median": f.A, "B_median": f.B}), by("C")]

   |     C  A_median  B_median
   | int32   float64   float64
-- + -----  --------  --------
 0 |     1         1       3.5
 1 |     2       1.5       3.5
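
To see where the per-group numbers come from, the grouping can be reproduced by hand. This is an illustrative sketch in plain Python under the assumption that NA values are ignored within each group; the names `rows`, `groups` and `grouped_medians` are made up for the example and are not datatable objects.

```python
from collections import defaultdict
from statistics import median

# Rows of the frame `df` above as (A, B, C) tuples; None stands in for NA.
rows = list(zip([1, 1, 2, 1, 2],
                [None, 2, 3, 4, 5],
                [1, 2, 1, 1, 2]))

# Collect the non-missing A and B values for each distinct value of C.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in rows:
    if a is not None:
        groups[c]["A"].append(a)
    if b is not None:
        groups[c]["B"].append(b)

# One (A_median, B_median) pair per group, ordered by the group key.
grouped_medians = {
    c: (float(median(cols["A"])), float(median(cols["B"])))
    for c, cols in sorted(groups.items())
}
print(grouped_medians)
```

The group C = 1 contains rows 0, 2 and 3, so its B values are NA, 3 and 4; with the NA dropped, the median is 3.5, matching the first row of the output above.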