datatable.cut()
Added in version 0.11
For each column from cols, bin its values into equal-width intervals when nbins is specified, or into arbitrary-width intervals when interval edges are provided as bins.
Parameters
cols
FExpr
Input data for equal-width interval binning.
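nbins
int | List[int]
Number of bins. A single int applies to every column in cols; a list gives a separate bin count for each column.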
bins
List[Frame]
Interval edges for arbitrary-width binning, given as one single-column frame per column in cols.
right_closed
bool
Each binning interval is half-open. This flag indicates whether the right edge of each interval is closed (and the left edge open), or the other way around.
return
FExpr
An f-expression that converts the input columns into columns filled with the respective bin ids.
Examples
Bin one-column frame by specifying a number of bins:
from datatable import dt, f, cut
DT = dt.Frame([1, 3, 5])
DT[:, cut(f[:], nbins = 5)]
|   | C0    |
|---|-------|
|   | int32 |
| 0 | 0     |
| 1 | 2     |
| 2 | 4     |
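To make the equal-width rule concrete, here is a minimal pure-Python sketch that reproduces the bin ids shown above. It is not datatable's implementation: the helper name `equal_width_bin_ids` is hypothetical, and it assumes the bins span the range from the column minimum to the column maximum, with the value on the open outer edge kept in the nearest bin.

```python
from fractions import Fraction
from math import ceil, floor


def equal_width_bin_ids(values, nbins, right_closed=True):
    """Illustrative only: assign each value to one of `nbins` equal-width bins.

    Assumption: the bins span [min(values), max(values)].
    """
    lo, hi = Fraction(min(values)), Fraction(max(values))
    if lo == hi:
        # Degenerate range; returning 0 is this sketch's own choice,
        # not documented library behavior.
        return [0 for _ in values]
    ids = []
    for x in values:
        # Position of x measured in bin widths, in the range [0, nbins].
        # Fractions keep values that sit exactly on an edge exact.
        pos = (Fraction(x) - lo) * nbins / (hi - lo)
        if right_closed:
            # Intervals look like (a, b]; the minimum is kept in bin 0.
            ids.append(max(ceil(pos) - 1, 0))
        else:
            # Intervals look like [a, b); the maximum is kept in the last bin.
            ids.append(min(floor(pos), nbins - 1))
    return ids


print(equal_width_bin_ids([1, 3, 5], 5))    # [0, 2, 4]
print(equal_width_bin_ids([5, 7, 9], 10))   # [0, 4, 9]
```

In the second call the value 7 lies exactly on an interval edge, so it lands in bin 4 under right-closed intervals and would land in bin 5 in this sketch with `right_closed=False`.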
Bin one-column frame by specifying edges of the binning intervals:
from datatable import dt, f, cut
DT = dt.Frame([1, 3, 5])
BINS = [dt.Frame(range(5))]
DT[:, cut(f[:], bins = BINS)] # Note, "5" goes out of bounds and is binned as "NA"
|   | C0    |
|---|-------|
|   | int32 |
| 0 | 0     |
| 1 | 2     |
| 2 | NA    |
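The edge-based lookup can be sketched in plain Python with the standard `bisect` module. This is an illustration of the interval semantics only, not the library's code: the helper `edge_bin_id` is hypothetical, `None` stands in for NA, and treating a value equal to the lowest edge as out of bounds for right-closed intervals is an assumption that follows from the half-open definition.

```python
from bisect import bisect_left, bisect_right


def edge_bin_id(x, edges, right_closed=True):
    """Illustrative only: index of the interval containing x, or None."""
    if right_closed:
        # Interval i is (edges[i], edges[i + 1]].
        i = bisect_left(edges, x) - 1
    else:
        # Interval i is [edges[i], edges[i + 1]).
        i = bisect_right(edges, x) - 1
    # Values outside the outermost edges have no bin (NA in the frame output).
    return i if 0 <= i < len(edges) - 1 else None


edges = [0, 1, 2, 3, 4]  # same edges as dt.Frame(range(5))
print([edge_bin_id(x, edges) for x in [1, 3, 5]])                      # [0, 2, None]
print([edge_bin_id(x, edges, right_closed=False) for x in [1, 3, 5]])  # [1, 3, None]
```

The first line matches the frame above; the second shows how values that sit exactly on an edge move to the next bin when the left edge is the closed one.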
Bin two-column frame by specifying a number of bins for each column separately:
from datatable import dt, f, cut
DT = dt.Frame([[1, 3, 5], [5, 7, 9]])
DT[:, cut(f[:], nbins = [5, 10])]
|   | C0    | C1    |
|---|-------|-------|
|   | int32 | int32 |
| 0 | 0     | 0     |
| 1 | 2     | 4     |
| 2 | 4     | 9     |
Bin two-column frame by specifying edges of the binning intervals:
from datatable import dt, f, cut
DT = dt.Frame([[1, 3, 5], [0.1, -0.1, 1.5]])
BINS = [dt.Frame(range(10)), dt.Frame([-1.0, 0, 1.0, 2.0])]
DT[:, cut(f[:], bins = BINS)]
|   | C0    | C1    |
|---|-------|-------|
|   | int32 | int32 |
| 0 | 0     | 1     |
| 1 | 2     | 0     |
| 2 | 4     | 2     |
See also
qcut() – function for equal-population binning.
The content on this page is licensed under the Creative Commons Attribution 4.0 License (CC BY 4.0), and code samples are licensed under the MIT License.