Ftrl¶

class datatable.models.Ftrl
¶ Follow the Regularized Leader (FTRL) model.
The FTRL model is a datatable implementation of the FTRL-Proximal online learning algorithm for binomial logistic regression. It uses the hashing trick for feature vectorization and the Hogwild approach for parallelization. Multinomial classification and regression on continuous targets are experimental.
See this reference for more details: https://www.eecs.tufts.edu/~dsculley/papers/adclickprediction.pdf
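The per-coordinate update described in the referenced paper can be sketched in plain Python. This is a minimal illustration that follows the paper's algorithm for logistic loss and assumes every active (hashed) feature has value 1. The function names `ftrl_weight` and `ftrl_update` are hypothetical, and the sketch is not datatable's internal code.

```python
import math

def ftrl_weight(z, n, alpha, beta, lambda1, lambda2):
    """Closed-form weight of one coordinate from its accumulated z and n."""
    if abs(z) <= lambda1:
        return 0.0  # L1 regularization keeps this weight at exactly zero
    sign = 1.0 if z > 0 else -1.0
    return -(z - sign * lambda1) / ((beta + math.sqrt(n)) / alpha + lambda2)

def ftrl_update(z, n, active, y, alpha=0.005, beta=1.0, lambda1=0.0, lambda2=0.0):
    """Run one online step on a single sample.

    z, n   -- lists of length nbins, updated in place
    active -- distinct bin indices of the sample's hashed features
    y      -- target, 0 or 1
    Returns the probability predicted before the update.
    """
    w = {i: ftrl_weight(z[i], n[i], alpha, beta, lambda1, lambda2) for i in active}
    p = 1.0 / (1.0 + math.exp(-sum(w.values())))
    g = p - y  # log-loss gradient for every active coordinate (feature value 1)
    for i in w:
        # sigma encodes the change of the per-coordinate learning rate
        sigma = (math.sqrt(n[i] + g * g) - math.sqrt(n[i])) / alpha
        z[i] += g - sigma * w[i]
        n[i] += g * g
    return p
```

Calling `ftrl_update` repeatedly on the same positive sample moves the predicted probability above 0.5, since only the `z` and `n` accumulators need to be stored and the weights are derived from them on demand.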
Parameters:  alpha (float) – alpha in the per-coordinate learning rate formula, defaults to 0.005.
 beta (float) – beta in the per-coordinate learning rate formula, defaults to 1.
 lambda1 (float) – L1 regularization parameter, defaults to 0.
 lambda2 (float) – L2 regularization parameter, defaults to 0.
 nbins (int) – Number of bins to be used for the hashing trick, defaults to 10**6.
 mantissa_nbits (int) – Number of bits from mantissa to be used for hashing floats, defaults to 10.
 nepochs (int) – Number of training epochs, defaults to 1.
 double_precision (bool) – Whether to use double precision arithmetic or not, defaults to False.
 negative_class (bool) – Whether to create and train on a ‘negative’ class in the case of multinomial classification.
 interactions (list or tuple) – A list or a tuple of interactions. In turn, each interaction should be a list or a tuple of feature names, where each feature name is a column name from the training frame.
 model_type (str) – Model type can be one of the following: ‘binomial’ for binomial classification, ‘multinomial’ for multinomial classification, and ‘regression’ for numeric regression. Defaults to ‘auto’, meaning that the model type will be automatically selected based on the target column stype.
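To make the roles of nbins, mantissa_nbits and interactions more concrete, here is a rough sketch of the hashing trick in plain Python. The hash function (BLAKE2 from `hashlib`), the way a column name is combined with a value, and the helper names `hash_float` and `row_to_bins` are all assumptions made for illustration; datatable's actual hashing scheme may differ.

```python
import hashlib
import struct

def _hash64(data: bytes) -> int:
    # Stable 64-bit hash used as a stand-in for the library's hash function.
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")

def hash_float(value: float, mantissa_nbits: int = 10) -> int:
    """Keep sign, exponent and the top mantissa_nbits of a 64-bit float."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    dropped = 52 - mantissa_nbits  # a double has 52 mantissa bits
    return (bits >> dropped) << dropped

def hash_value(value, mantissa_nbits: int = 10) -> int:
    if isinstance(value, float):
        return hash_float(value, mantissa_nbits)
    return _hash64(str(value).encode("utf-8"))

def row_to_bins(row: dict, nbins: int = 10**6, mantissa_nbits: int = 10,
                interactions=()) -> list:
    """Map one row (column name -> value) to a list of bin indices."""
    hashes = {
        name: (_hash64(name.encode("utf-8")) + hash_value(v, mantissa_nbits)) % 2**64
        for name, v in row.items()
    }
    bins = [h % nbins for h in hashes.values()]
    # Each interaction contributes one extra feature built from its members.
    for interaction in interactions:
        key = ":".join(str(hashes[name]) for name in interaction)
        bins.append(_hash64(key.encode("utf-8")) % nbins)
    return bins
```

With a small mantissa_nbits, nearby floats such as 1.0 and 1.0001 fall into the same bin, which is the point of truncating the mantissa before hashing.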

alpha
¶ alpha in the per-coordinate learning rate formula.

beta
¶ beta in the per-coordinate learning rate formula.

colname_hashes
¶ Column name hashes.

colnames
¶ Column names.

double_precision
¶ Whether to use double precision arithmetic or not.

feature_importances
¶ Two-column frame with feature names and the corresponding feature importances normalized to [0, 1].

fit
()¶ Train FTRL model on a dataset.
Parameters:  X_train (Frame) – Training frame of shape (nrows, ncols).
 y_train (Frame) – Target frame of shape (nrows, 1).
 X_validation (Frame) – Validation frame of shape (nrows, ncols).
 y_validation (Frame) – Validation target frame of shape (nrows, 1).
 nepochs_validation (float) – Parameter that specifies how often, in epoch units, validation error should be checked.
 validation_error (float) – If the relative validation error does not improve by at least validation_error within nepochs_validation epochs, training stops.
 validation_average_niterations (int) – Number of iterations used to calculate the average loss. Here, each iteration corresponds to nepochs_validation epochs.
Returns:  A tuple of two elements, (epoch, loss), where epoch is the epoch at which model fitting stopped and loss is the final loss. When a validation dataset is not provided, the returned epoch equals nepochs and loss is float('nan').
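One plausible reading of how the validation parameters interact is sketched below. The function `fit_with_validation` is hypothetical and takes a precomputed sequence of validation losses instead of training anything; the exact stopping criterion used by datatable is an assumption here.

```python
import math
from collections import deque

def fit_with_validation(validation_losses, nepochs, nepochs_validation=1.0,
                        validation_error=0.01, validation_average_niterations=1):
    """Simulate early stopping and return (epoch, loss).

    validation_losses -- hypothetical loss observed at each validation check,
                         or None when no validation set is given
    """
    if validation_losses is None:
        return nepochs, math.nan  # no validation: run all epochs, loss is NaN

    window = deque(maxlen=validation_average_niterations)
    previous = None
    epoch = 0.0
    current = math.nan
    for loss in validation_losses:
        epoch = min(epoch + nepochs_validation, nepochs)
        window.append(loss)
        current = sum(window) / len(window)  # loss averaged over recent checks
        if previous is not None and previous > 0:
            relative_improvement = (previous - current) / previous
            if relative_improvement < validation_error:
                return epoch, current  # improvement too small: stop early
        previous = current
        if epoch >= nepochs:
            break
    return epoch, current
```

For example, with losses 1.0, 0.5, 0.499 and validation_error=0.01, the third check improves by only about 0.2% and the sketch stops at epoch 3.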

interactions
¶ A list or a tuple of interactions. In turn, each interaction should be a list or a tuple of feature names, where each feature name is a column name from the training frame.

labels
¶ Frame of labels used for classification.

lambda1
¶ L1 regularization parameter.

lambda2
¶ L2 regularization parameter.

mantissa_nbits
¶ Number of bits from mantissa to be used for hashing floats.

model
¶ Model frame of shape (nbins, 2 * nlabels), where nlabels is the total number of labels the model was trained on, and nbins is the number of bins used for the hashing trick. Odd frame columns contain z model coefficients, and even columns n model coefficients.

model_type
¶ The type of model FTRL should build: ‘binomial’ for binomial classification, ‘multinomial’ for multinomial classification, ‘regression’ for numeric regression, or ‘auto’ for automatic model type detection based on the target column stype. Default value is ‘auto’.

model_type_trained
¶ The model type FTRL has built: ‘regression’, ‘binomial’, ‘multinomial’, or ‘none’ for an untrained model.

nbins
¶ Number of bins to be used for the hashing trick.

negative_class
¶ Whether to create and train on a ‘negative’ class in the case of multinomial classification.

nepochs
¶ Number of training epochs.

params
¶ FTRL model parameters.

predict
()¶ Make predictions for a dataset.
Parameters: X (Frame) – Frame of shape (nrows, ncols) to make predictions for. It should have the same number of columns as the training frame.
Returns:  A new frame of shape (nrows, nlabels) with the predicted probabilities for each row of frame X and each label the model was trained for.

reset
()¶ Reset FTRL model by clearing all the model weights, labels and feature importance information.
Parameters: None
Return type: None