datatable.median()

datatable.median(cols)


Calculate the median value for each column from cols.

Parameters

cols: Expr
    Input columns.

return: Expr
    f-expression having one row, and the same names and number of columns as in cols. In the examples below, int32 input columns produce float64 results.

except: TypeError
    Raised when one of the columns from cols has a non-numeric type.

Examples

from datatable import dt, f, by

df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})

df
	A	B	C
	int32	int32	int32
0	1	NA	1
1	1	2	2
2	2	3	1
3	1	4	1
4	2	5	2
5 rows × 3 columns

Get the median from column A:

df[:, dt.median(f.A)]
	A
	float64
0	1
1 row × 1 column

Get the median of multiple columns:

df[:, dt.median([f.A, f.B])]
	A	B
	float64	float64
0	1	3.5
1 row × 2 columns

Same as above, but more convenient:

df[:, dt.median(f[:2])]
	A	B
	float64	float64
0	1	3.5
1 row × 2 columns
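
The 3.5 reported for column B is what you get when the missing value is skipped and the two middle values of the remaining four are averaged. The following is a plain-Python cross-check using only the standard library, not datatable itself; that NA values are dropped before the median is taken is an assumption inferred from the output above.

```python
from statistics import median

# Same values as column B in the example frame; None stands in for NA.
b = [None, 2, 3, 4, 5]

# Assumption: missing values are skipped before computing the median,
# which is consistent with the 3.5 shown in the output above.
b_valid = [x for x in b if x is not None]

# With an even number of values, the median is the mean of the two middle ones.
b_median = median(b_valid)
print(b_median)  # 3.5
```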

You can pass in a dictionary with new column names:

df[:, dt.median({"A_median": f.A, "C_mid": f.C})]
	A_median	C_mid
	float64	float64
0	1	1
1 row × 2 columns

In the presence of by(), it returns the median of each column per group:

df[:, dt.median({"A_median": f.A, "B_median": f.B}), by("C")]
	C	A_median	B_median
	int32	float64	float64
0	1	1	3.5
1	2	1.5	3.5
2 rows × 3 columns
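
The grouped result can be verified by hand with a short standard-library sketch. It does not use datatable; the grouping logic and the skipping of missing values are assumptions chosen to mirror the output above.

```python
from collections import defaultdict
from statistics import median

# Rows of the example frame as (A, B, C); None stands in for NA.
rows = list(zip([1, 1, 2, 1, 2],
                [None, 2, 3, 4, 5],
                [1, 2, 1, 1, 2]))

# Collect the A and B values belonging to each distinct value of C.
groups = defaultdict(lambda: {"A": [], "B": []})
for a, b, c in rows:
    groups[c]["A"].append(a)
    if b is not None:  # assumption: missing values are skipped
        groups[c]["B"].append(b)

# One (A_median, B_median) pair per group, ordered by the group key.
result = {c: (median(v["A"]), median(v["B"]))
          for c, v in sorted(groups.items())}
print(result)  # {1: (1, 3.5), 2: (1.5, 3.5)}
```

Group C=1 holds A values 1, 2, 1 and B values 3, 4 (the NA is skipped), while group C=2 holds A values 1, 2 and B values 2, 5, which gives the same medians as the table above.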
