datatable.cut()

Added in version 0.11

For each column in cols, bin its values into equal-width intervals when nbins is specified, or into arbitrary-width intervals when the interval edges are provided as bins.

Parameters

cols
FExpr

Input columns to be binned.

nbins
int | List[int]

When a single number is specified, this number of bins will be used to bin each column from cols. When a list or a tuple is provided, each column will be binned by using its own number of bins. In the latter case, the list/tuple length must be equal to the number of columns in cols.

bins
List[Frame]

A list or a tuple of frames for binning the corresponding columns from cols. Each frame must have exactly one column, populated with the edges of the binning intervals in strictly increasing order. The number of frames in bins must be equal to the number of columns in cols.

right_closed
bool

Each binning interval is half-open. This flag indicates whether the closed edge of each interval is the right one (True) or the left one (False).

return
FExpr

f-expression that converts input columns into the columns filled with the respective bin ids.
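To make the effect of right_closed concrete, here is a minimal pure-Python sketch of half-open interval membership. It is not datatable's implementation: the helper name contains and the sample edges are assumptions introduced only for illustration.

```python
def contains(x, left, right, right_closed=True):
    """Test whether x lies in a half-open interval.

    Hypothetical helper for illustration only; not part of datatable.
    """
    if right_closed:
        return left < x <= right   # interval (left, right]
    return left <= x < right       # interval [left, right)


edges = [0, 1, 2, 3, 4]
# Consecutive pairs of edges: [(0, 1), (1, 2), (2, 3), (3, 4)]
intervals = list(zip(edges, edges[1:]))

for flag in (True, False):
    ids = [i for i, (a, b) in enumerate(intervals) if contains(3, a, b, flag)]
    print(flag, ids)
# True [2]
# False [3]
```

A value that sits exactly on an edge, such as 3 here, moves to the neighbouring interval when the flag is flipped. Values strictly inside an interval are unaffected.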

Examples

Bin one-column frame by specifying a number of bins:

from datatable import dt, f, cut
DT = dt.Frame([1, 3, 5])
DT[:, cut(f[:], nbins = 5)]
   |    C0
   | int32
-- + -----
 0 |     0
 1 |     2
 2 |     4
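The bin ids in the example above can be reproduced with a short pure-Python sketch of equal-width binning. This is an illustration under stated assumptions, not datatable's implementation: the helper equal_width_bins is a hypothetical name, it assumes the bins span the column's minimum to maximum, and it rejects columns whose values are all equal rather than guessing how the library treats them.

```python
import math


def equal_width_bins(values, nbins, right_closed=True):
    """Assign each value to one of nbins equal-width bins spanning [min, max].

    Hypothetical helper for illustration only; not datatable's implementation.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("this sketch requires at least two distinct values")
    ids = []
    for x in values:
        # Position of x on a scale from 0 (minimum) to nbins (maximum)
        pos = (x - lo) * nbins / (hi - lo)
        if right_closed:
            i = math.ceil(pos) - 1   # a value on an edge joins the bin to its left
        else:
            i = math.floor(pos)      # a value on an edge joins the bin to its right
        # Clamp so that the minimum and maximum land in the first and last bin
        ids.append(min(max(i, 0), nbins - 1))
    return ids


print(equal_width_bins([1, 3, 5], nbins=5))    # [0, 2, 4]
print(equal_width_bins([5, 7, 9], nbins=10))   # [0, 4, 9]
```

With right_closed=False the sketch places 7 from the second list in bin 5 instead of bin 4, because 7 sits exactly on an interval edge.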

Bin one-column frame by specifying edges of the binning intervals:

from datatable import dt, f, cut
DT = dt.Frame([1, 3, 5])
BINS = [dt.Frame(range(5))]
DT[:, cut(f[:], bins = BINS)] # Note, "5" goes out of bounds and is binned as "NA"
   |    C0
   | int32
-- + -----
 0 |     0
 1 |     2
 2 |    NA
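Edge-based binning can be sketched in pure Python with the standard bisect module. The helper bin_by_edges is a hypothetical name, and the code is an illustration rather than datatable's implementation. It assumes right-closed intervals, which is consistent with the outputs shown in this section, and it uses None to stand in for NA.

```python
from bisect import bisect_left


def bin_by_edges(values, edges):
    """Map each value to the index i of the interval (edges[i], edges[i+1]].

    Hypothetical helper for illustration only; not datatable's implementation.
    Values that fall outside the edges map to None, standing in for NA.
    """
    # The bins parameter requires strictly increasing edges
    if any(a >= b for a, b in zip(edges, edges[1:])):
        raise ValueError("edges must be strictly increasing")
    ids = []
    for x in values:
        # Number of edges strictly below x, minus one, is the interval index
        i = bisect_left(edges, x) - 1
        ids.append(i if 0 <= i < len(edges) - 1 else None)
    return ids


print(bin_by_edges([1, 3, 5], list(range(5))))               # [0, 2, None]
print(bin_by_edges([0.1, -0.1, 1.5], [-1.0, 0, 1.0, 2.0]))   # [1, 0, 2]
```

Note that n edges define n - 1 intervals, so range(5) yields four bins with ids 0 through 3.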

Bin two-column frame by specifying a number of bins for each column separately:

from datatable import dt, f, cut
DT = dt.Frame([[1, 3, 5], [5, 7, 9]])
DT[:, cut(f[:], nbins = [5, 10])]
   |    C0     C1
   | int32  int32
-- + -----  -----
 0 |     0      0
 1 |     2      4
 2 |     4      9

Bin two-column frame by specifying edges of the binning intervals:

from datatable import dt, f, cut
DT = dt.Frame([[1, 3, 5], [0.1, -0.1, 1.5]])
BINS = [dt.Frame(range(10)), dt.Frame([-1.0, 0, 1.0, 2.0])]
DT[:, cut(f[:], bins = BINS)]
   |    C0     C1
   | int32  int32
-- + -----  -----
 0 |     0      1
 1 |     2      0
 2 |     4      2

See also

qcut() – function for equal-population binning.