Perform a randomized k-fold split of data with nrows rows into nsplits train/test subsets. The dataset itself is not passed to this function: knowing the number of rows is sufficient to decide how the data should be split.
The train/test subsets produced by this function will have the following properties:
all test folds will be of approximately the same size;
all observations have an equal ex-ante chance of being assigned to each fold;
the row indices in all train and test folds will be sorted.
The function uses a single-pass parallelized algorithm to construct the folds.
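The properties listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: it shuffles the row indices and slices them sequentially rather than using a single-pass parallel algorithm. The name kfold_random_sketch and the list-of-(train, test)-pairs return shape are assumptions made for illustration only.

```python
import random

def kfold_random_sketch(nrows, nsplits, seed=None):
    """Hypothetical stand-in: return nsplits (train, test) pairs of sorted row indices."""
    if nsplits < 2:
        raise ValueError("nsplits must be at least 2")
    rng = random.Random(seed)
    # A uniformly random permutation gives every row the same chance
    # of landing in any given fold.
    perm = list(range(nrows))
    rng.shuffle(perm)
    folds = []
    for k in range(nsplits):
        # Integer slicing keeps the test fold sizes within one row of each other.
        lo = k * nrows // nsplits
        hi = (k + 1) * nrows // nsplits
        test = sorted(perm[lo:hi])
        train = sorted(perm[:lo] + perm[hi:])
        folds.append((train, test))
    return folds

folds = kfold_random_sketch(nrows=10, nsplits=3, seed=42)
for train, test in folds:
    print(len(train), len(test))
```

Because only row indices are produced, the resulting index lists can be applied to any dataset that has nrows rows.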
The number of rows in the frame that you want to split.
Number of folds, must be at least 2, but not larger than
Seed value for the random number generator used by this function. Calling the function several times with the same seed value will produce the same results each time.
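The effect of a fixed seed can be sketched with Python's standard random module. This only illustrates seeded reproducibility in general; shuffled_rows is a hypothetical helper, and the random number generator used by the function documented here may differ.

```python
import random

def shuffled_rows(nrows, seed=None):
    """Hypothetical helper: return row indices 0..nrows-1 in a seeded random order."""
    rng = random.Random(seed)  # a private generator, so global random state is untouched
    rows = list(range(nrows))
    rng.shuffle(rows)
    return rows

first = shuffled_rows(8, seed=7)
second = shuffled_rows(8, seed=7)
print(first == second)  # the same seed reproduces the same order
```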
kfold() – Perform k-fold split.