datatable.qcut()
Added in version 0.11
Bin all the columns from cols into intervals with approximately equal populations. The intervals are therefore chosen according to the sample quantiles of the data.
If there are duplicate values in the data, they will all be placed into the same bin. In extreme cases this may cause the bins to be highly unbalanced.
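The effect of duplicates can be pictured with a small pure-Python sketch. It assumes that bin ids are derived from the ranks of the distinct values, which is one way to guarantee that equal values share a bin. The helper name `qcut_sketch` and its rank-to-bin formula are illustrative assumptions, not datatable's implementation.

```python
def qcut_sketch(values, nquantiles):
    """Rank-based quantile binning; a hypothetical helper, not datatable's code."""
    distinct = sorted(set(values))  # duplicates collapse to a single rank
    rank = {v: i for i, v in enumerate(distinct)}
    ngroups = len(distinct)
    if ngroups == 1:
        # Arbitrary choice for this sketch: a single distinct value goes to bin 0.
        return [0] * len(values)
    # Spread ranks 0..ngroups-1 over bins 0..nquantiles-1; the top rank is clamped.
    return [min(rank[v] * nquantiles // (ngroups - 1), nquantiles - 1)
            for v in values]


balanced = qcut_sketch([0, 1, 2, 3, 4], 3)
skewed = qcut_sketch([1, 1, 1, 1, 2, 3], 3)
print(balanced)  # [0, 0, 1, 2, 2]
print(skewed)    # [0, 0, 0, 0, 1, 2]
```

In the second call the four copies of `1` all land in bin 0, so that bin holds four of the six rows while the other two bins hold one row each.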
Parameters
cols
FExpr
Input data for quantile binning.
nquantiles
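int | List[int]
Number of quantiles to use for binning. A single int applies the same number of quantiles to every column in cols; a list gives a separate number of quantiles for each column.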
return
FExpr
An f-expression that converts the input columns into columns filled with the corresponding quantile ids.
Examples
Bin a two-column frame using the same number of quantiles for both columns:
from datatable import dt, f, qcut
DT = dt.Frame([range(5), [3, 14, 15, 92, 6]])
DT[:, qcut(f[:], nquantiles = 3)]
| | C0 | C1 |
|---|---|---|
| | int32 | int32 |
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
| 4 | 2 | 0 |
Bin a two-column frame using a column-specific number of quantiles:
from datatable import dt, f, qcut
DT = dt.Frame([range(5), [3, 14, 15, 92, 6]])
DT[:, qcut(f[:], nquantiles = [3, 5])]
| | C0 | C1 |
|---|---|---|
| | int32 | int32 |
| 0 | 0 | 0 |
| 1 | 0 | 2 |
| 2 | 1 | 3 |
| 3 | 2 | 4 |
| 4 | 2 | 1 |
See also
cut() – function for equal-width interval binning.
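To make the contrast with equal-width binning concrete, here is a minimal pure-Python sketch. The helper `equal_width_bins` is hypothetical and does not reproduce cut() itself. In particular, how it treats values that fall exactly on a bin boundary is an assumption.

```python
def equal_width_bins(values, nbins):
    """Equal-width binning sketch; a hypothetical helper, not datatable's cut()."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Arbitrary choice for this sketch: constant data goes to bin 0.
        return [0] * len(values)
    # Scale each value to [0, nbins] and clamp the maximum into the last bin.
    return [min(int((v - lo) * nbins / (hi - lo)), nbins - 1) for v in values]


widths = equal_width_bins([3, 14, 15, 92, 6], 3)
print(widths)  # [0, 0, 0, 2, 0]
```

The single large value 92 stretches the range, so four of the five values fall into the first equal-width bin. For the same column, the first qcut() example above yields the ids 0, 1, 2, 2, 0, which spread the rows far more evenly.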
The content on this page is licensed under the Creative Commons Attribution 4.0 License (CC BY 4.0), and code samples are licensed under the MIT License.