Ftrl

class datatable.models.Ftrl

Follow the Regularized Leader (FTRL) model.

The FTRL model is datatable's implementation of the FTRL-Proximal online learning algorithm for binomial logistic regression. It uses the hashing trick for feature vectorization and the Hogwild approach for parallelization. Support for multinomial classification and continuous targets is experimental.

See this reference for more details: https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf
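
To make the roles of alpha, beta, lambda1 and lambda2 concrete, here is a minimal single-threaded sketch of the per-coordinate FTRL-Proximal update for logistic regression, following the algorithm in the referenced paper. The class name TinyFtrl and its methods are hypothetical and for illustration only; this is not datatable's implementation, and it assumes each row has already been reduced to a list of distinct bin indices.

```python
import math


class TinyFtrl:
    """Toy FTRL-Proximal logistic regression over hashed binary features."""

    def __init__(self, alpha=0.005, beta=1.0, lambda1=0.0, lambda2=0.0, nbins=16):
        self.alpha, self.beta = alpha, beta
        self.lambda1, self.lambda2 = lambda1, lambda2
        self.z = [0.0] * nbins  # accumulated adjusted gradients
        self.n = [0.0] * nbins  # accumulated squared gradients

    def weight(self, i):
        # Weights are derived lazily from z and n; the weight is exactly
        # zero while |z_i| <= lambda1, which is how L1 yields sparsity.
        z = self.z[i]
        if abs(z) <= self.lambda1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.lambda2
        return -(z - sign * self.lambda1) / denom

    def predict(self, bins):
        # Every active bin carries an implicit feature value of 1.
        s = sum(self.weight(i) for i in bins)
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, bins, y):
        # One online step on a single row with target y in {0, 1}.
        p = self.predict(bins)
        g = p - y  # log-loss gradient with respect to each active weight
        for i in bins:
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self.weight(i)
            self.n[i] += g * g
        return p


toy = TinyFtrl(alpha=0.1, nbins=4)
for _ in range(200):
    toy.update([0], 1)  # bin 0 always appears with a positive target
    toy.update([1], 0)  # bin 1 always appears with a negative target
```

After this loop, toy.predict([0]) is above 0.5 and toy.predict([1]) is below it. Only z and n are stored per bin; the weights themselves are never kept.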

Parameters:
  • alpha (float) – alpha in per-coordinate learning rate algorithm, defaults to 0.005.
  • beta (float) – beta in per-coordinate learning rate algorithm, defaults to 1.
  • lambda1 (float) – L1 regularization parameter, defaults to 0.
  • lambda2 (float) – L2 regularization parameter, defaults to 0.
  • nbins (int) – Number of bins to be used for the hashing trick, defaults to 10**6.
  • mantissa_nbits (int) – Number of bits from mantissa to be used for hashing, defaults to 10.
  • nepochs (int) – Number of training epochs, defaults to 1.
  • double_precision (bool) – Whether to use double precision arithmetic or not, defaults to False.
  • negative_class (bool) – Whether to create and train on a “negative” class in the case of multinomial classification.
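
The nbins parameter is easiest to read next to a small example of the hashing trick: every (column, value) pair is hashed and reduced modulo nbins, so the model size stays fixed regardless of how many distinct values appear. The sketch below is an illustration only. The helper bin_index is hypothetical, and the hash (BLAKE2b from the standard library) is a stand-in, not the hash function datatable uses.

```python
import hashlib


def bin_index(colname, value, nbins=10**6):
    """Map a (column, value) pair to a bin index in [0, nbins)."""
    key = f"{colname}={value}".encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "little") % nbins


# One row becomes a short list of bin indices, one per column.
row = {"site": "example.org", "device": "mobile", "hour": 14}
bins = [bin_index(name, value) for name, value in row.items()]
```

A smaller nbins saves memory but makes collisions between unrelated features more likely; a larger one does the opposite.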
alpha

alpha in per-coordinate FTRL-Proximal algorithm

beta

beta in per-coordinate FTRL-Proximal algorithm

colname_hashes

Column name hashes

colnames

Column names

double_precision

Whether to use double precision arithmetic for modeling

feature_importances

Two-column frame with feature names and the corresponding feature importances normalized to [0; 1].

fit()

Train FTRL model on a dataset.

Parameters:
  • X_train (Frame) – Training frame of shape (nrows, ncols).
  • y_train (Frame) – Target frame of shape (nrows, 1).
  • X_validation (Frame) – Validation frame of shape (nrows, ncols).
  • y_validation (Frame) – Validation target frame of shape (nrows, 1).
  • nepochs_validation (float) – Parameter that specifies how often, in epoch units, validation error should be checked.
  • validation_error (float) – Training stops if the relative validation error does not improve by at least validation_error within nepochs_validation epochs.
  • validation_average_niterations (int) – Number of iterations that is used to calculate average loss. Here, each iteration corresponds to nepochs_validation epochs.
Returns:
  • A tuple (epoch, loss), where epoch is the epoch at which model fitting stopped and loss is the final loss. When no validation dataset is provided, epoch is equal to nepochs and loss is float('nan').
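
The interplay of nepochs_validation, validation_error and validation_average_niterations can be sketched as a plain early-stopping loop. This is one plausible reading of the parameter descriptions above, not datatable's actual stopping logic. The function fit_with_early_stopping and its train_step and validation_loss callables are hypothetical.

```python
import math
from collections import deque


def fit_with_early_stopping(train_step, validation_loss, nepochs,
                            nepochs_validation=1.0, validation_error=0.01,
                            validation_average_niterations=1):
    """train_step(n) trains for n epochs; validation_loss() returns a float."""
    if validation_loss is None:
        # No validation data: train for all epochs and report a NaN loss.
        train_step(nepochs)
        return nepochs, math.nan

    recent = deque(maxlen=validation_average_niterations)
    previous_average = math.inf
    epoch, loss = 0.0, math.nan
    while epoch < nepochs:
        step = min(nepochs_validation, nepochs - epoch)
        train_step(step)
        epoch += step
        recent.append(validation_loss())
        loss = sum(recent) / len(recent)  # loss averaged over recent checks
        if math.isfinite(previous_average) and previous_average > 0:
            improvement = (previous_average - loss) / previous_average
            if improvement < validation_error:
                break  # relative improvement too small: stop early
        previous_average = loss
    return epoch, loss


# Usage with a canned sequence of validation losses.
losses = iter([1.0, 0.5, 0.499, 0.4])
stopped_epoch, final_loss = fit_with_early_stopping(
    lambda n: None, lambda: next(losses), nepochs=10)
```

Here training stops at the third check, because the loss only moves from 0.5 to 0.499, a relative improvement below the 0.01 threshold.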

interactions

List of feature lists to do interactions for
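
As an illustration of what a list of feature lists could look like, the sketch below builds one extra hashed feature per inner list by joining the values of the named columns. How datatable combines the features internally is not described here, so the joining scheme, the hash and the helper names hash_bin and interaction_bins are all assumptions.

```python
import hashlib


def hash_bin(text, nbins):
    """Stand-in hash: map a string to a bin index in [0, nbins)."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little") % nbins


def interaction_bins(row, interactions, nbins=10**6):
    """Return one extra bin per interaction, built from the joined values."""
    bins = []
    for features in interactions:
        key = "&".join(f"{name}={row[name]}" for name in features)
        bins.append(hash_bin(key, nbins))
    return bins


row = {"site": "example.org", "device": "mobile", "hour": 14}
extra = interaction_bins(row, [["site", "device"], ["device", "hour"]])
```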

labels

Frame of labels used for classification.

lambda1

L1 regularization parameter

lambda2

L2 regularization parameter

mantissa_nbits

Number of bits from mantissa to be used for hashing float values
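
One way to picture this parameter: before hashing, a float is truncated so that only its sign, exponent and leading mantissa bits remain, which makes nearby values land in the same bin. The sketch below assumes IEEE 754 doubles with a 52-bit mantissa. The helper truncate_mantissa is hypothetical and is not taken from datatable's source.

```python
import struct


def truncate_mantissa(value, mantissa_nbits=10):
    """Keep the sign, exponent and top mantissa_nbits mantissa bits of a double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    dropped = 52 - mantissa_nbits  # low-order mantissa bits to zero out
    mask = ~((1 << dropped) - 1) & 0xFFFFFFFFFFFFFFFF
    return bits & mask


# Values that differ only in the dropped bits share a key; others do not.
same_key = truncate_mantissa(1.0) == truncate_mantissa(1.0000001)
different_key = truncate_mantissa(1.0) != truncate_mantissa(1.5)
```

With fewer retained bits, more float values collapse into the same key, which acts as a coarse form of binning for continuous features.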

model

Model frame of shape (nbins, 2 * nlabels), where nlabels is the total number of labels the model was trained on, and nbins is the number of bins used for the hashing trick. Odd frame columns contain z model coefficients, and even columns n model coefficients.

model_type

FTRL model type: ‘auto’, ‘regression’, ‘binomial’ or ‘multinomial’.

model_type_trained

FTRL trained model type: ‘none’, ‘regression’, ‘binomial’ or ‘multinomial’.

nbins

Number of bins for the hashing trick

negative_class

Whether to train on negatives in the case of multinomial classification.

nepochs

Number of epochs to train a model

params

FTRL model parameters

predict()

Make predictions for a dataset.

Parameters:
  • X (Frame) – Frame of shape (nrows, ncols) to make predictions for. It must have the same number of columns as the training frame.
Returns:
  • A new frame of shape (nrows, nlabels) with the predicted probabilities for each row of frame X and each label the model was trained on.

reset()

Reset FTRL model by clearing all the model weights, labels and feature importance information.

Parameters: None
Returns: None