datatable.qcut()

Added in version 0.11

Bin all the columns from cols into intervals with approximately equal populations. Thus, the intervals are chosen according to the sample quantiles of the data.

If there are duplicate values in the data, they will all be placed into the same bin. In extreme cases this may cause the bins to be highly unbalanced.
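The effect of duplicates can be illustrated with a small pure-Python sketch. The helper `qcut_sketch` is hypothetical and is not datatable's implementation. It assumes one simple scheme: rank the distinct values, then scale the ranks into `nquantiles` bins. This scheme happens to reproduce the outputs shown in the Examples section.

```python
def qcut_sketch(values, nquantiles):
    """Assign each value a quantile id in [0, nquantiles - 1].

    Hypothetical illustration only: equal values always share an id,
    because ids are derived from the rank of each *distinct* value.
    """
    distinct = sorted(set(values))
    ngroups = len(distinct)
    if ngroups == 1:
        # Assumption: a constant column is placed entirely into bin 0.
        return [0] * len(values)
    rank = {v: i for i, v in enumerate(distinct)}
    # Scale ranks 0..ngroups-1 onto bins 0..nquantiles-1 (top rank is capped).
    return [min(rank[v] * nquantiles // (ngroups - 1), nquantiles - 1)
            for v in values]


# Distinct values spread across the bins fairly evenly.
print(qcut_sketch([3, 14, 15, 92, 6], 3))   # [0, 1, 2, 2, 0]

# Heavy duplication: four of five rows land in the same bin.
print(qcut_sketch([1, 1, 1, 1, 2], 2))      # [0, 0, 0, 0, 1]
```

In the second call the two requested bins end up with populations of 4 and 1, which is the kind of imbalance described above.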

Parameters

cols
FExpr

Input data for quantile binning.

nquantiles
int | List[int]

When a single number is specified, this number of quantiles will be used to bin each column from cols.

When a list or a tuple is provided, each column will be binned by using its own number of quantiles. In the latter case, the list/tuple length must be equal to the number of columns in cols.

return
FExpr

An f-expression that converts the input columns into columns filled with the respective quantile ids.

Examples

Bin a two-column frame using the same number of quantiles for both columns:

from datatable import dt, f, qcut
DT = dt.Frame([range(5), [3, 14, 15, 92, 6]])
DT[:, qcut(f[:], nquantiles=3)]
   |    C0     C1
   | int32  int32
-- + -----  -----
 0 |     0      0
 1 |     0      1
 2 |     1      2
 3 |     2      2
 4 |     2      0

Bin a two-column frame using a column-specific number of quantiles:

from datatable import dt, f, qcut
DT = dt.Frame([range(5), [3, 14, 15, 92, 6]])
DT[:, qcut(f[:], nquantiles=[3, 5])]
   |    C0     C1
   | int32  int32
-- + -----  -----
 0 |     0      0
 1 |     0      2
 2 |     1      3
 3 |     2      4
 4 |     2      1

See also

cut() – function for equal-width interval binning.