datatable.median()

Calculate the median value of each column in cols.

Parameters

cols
FExpr

Input columns.

return
Expr

f-expression having one row, and the same names and number of columns as in cols. The column stypes are float32 for float32 columns, and float64 for all the other numeric types.

except
TypeError

Raised when any of the columns in cols has a non-numeric type.

Examples

from datatable import dt, f, by

df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df

   |     A      B      C
   | int32  int32  int32
-- + -----  -----  -----
 0 |     1     NA      1
 1 |     1      2      2
 2 |     2      3      1
 3 |     1      4      1
 4 |     2      5      2

Get the median of column A:

df[:, dt.median(f.A)]
   |       A
   | float64
-- + -------
 0 |       1

Get the median of multiple columns:

df[:, dt.median([f.A, f.B])]
   |       A        B
   | float64  float64
-- + -------  -------
 0 |       1      3.5

Same as above, but more convenient:

df[:, dt.median(f[:2])]
   |       A        B
   | float64  float64
-- + -------  -------
 0 |       1      3.5
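Column B contains a missing value, yet its median is reported as 3.5, which is the median of the four remaining values. The pure-Python sketch below reproduces both results with the standard library. It only illustrates what the numbers mean and is not how datatable computes them: column_median is a hypothetical helper, None stands in for NA, and returning None for a column with no values at all is this sketch's own choice.

```python
from statistics import median

# Same data as columns A and B of the example frame; None stands in for NA.
columns = {
    "A": [1, 1, 2, 1, 2],
    "B": [None, 2, 3, 4, 5],
}

def column_median(values):
    """Median of the non-missing values, or None if nothing is left."""
    present = [v for v in values if v is not None]
    return float(median(present)) if present else None

medians = {name: column_median(values) for name, values in columns.items()}
print(medians)  # {'A': 1.0, 'B': 3.5}
```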

You can pass in a dictionary with new column names:

df[:, dt.median({"A_median": f.A, "C_mid": f.C})]
   | A_median    C_mid
   |  float64  float64
-- + --------  -------
 0 |        1        1

When used together with by(), median() returns the median of each column within each group:

df[:, dt.median({"A_median": f.A, "B_median": f.B}), by("C")]
   |     C  A_median  B_median
   | int32   float64   float64
-- + -----  --------  --------
 0 |     1         1       3.5
 1 |     2       1.5       3.5
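To check the grouped result by hand, the sketch below splits the same rows by their C value and takes the median of A and B inside each group, skipping missing values. It uses only the standard library and is not datatable's implementation. The helper median_skipping_missing and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Rows of the example frame as (A, B, C) tuples; None stands in for NA.
rows = list(zip([1, 1, 2, 1, 2], [None, 2, 3, 4, 5], [1, 2, 1, 1, 2]))

# Collect the A and B values that belong to each distinct value of C.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in rows:
    groups[c]["A"].append(a)
    groups[c]["B"].append(b)

def median_skipping_missing(values):
    """Median of the non-missing values, or None if nothing is left."""
    present = [v for v in values if v is not None]
    return float(median(present)) if present else None

# One entry per group, ordered by the group key.
grouped_medians = {
    c: {name + "_median": median_skipping_missing(vals) for name, vals in cols.items()}
    for c, cols in sorted(groups.items())
}
print(grouped_medians)
# {1: {'A_median': 1.0, 'B_median': 3.5}, 2: {'A_median': 1.5, 'B_median': 3.5}}
```

Group C = 1 holds A values 1, 2, 1 and B values NA, 3, 4. Group C = 2 holds A values 1, 2 and B values 2, 5. This accounts for the medians in the table above.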

See Also

  • mean() – function to calculate mean values.

  • sd() – function to calculate standard deviation.