Prerequisites

library(mlr3)

Objects of class mlr3::Learner provide a unified interface to many popular machine learning algorithms in R. They consist of methods to train and predict on a mlr3::Task, and additionally provide meta information about the algorithms.

The package ships with only a rather minimal set of classification and regression learners, more are implemented in the mlr3learners package. Furthermore, mlr3learners has some documentation on creating custom learners.

Predefined Learners

Analogously to mlr3::mlr_tasks, the mlr3::Dictionary mlr3::mlr_learners can be queried for available learners:

As listed in the output, each learner comes with the following annotations:

  • feature_types: what kind of features can be processed.
  • packages: which packages are required to run train() and predict().
  • properties: additional properties and capabilities. E.g., a learner has the property “missings” if it is able to handle missing values natively, and “importance” if it is possible to extract feature importance values.
  • predict_types: what predict types are possible. E.g., a classification learner can predict labels (“response”) or probabilities (“prob”)

To extract a specific learner, use the corresponding “id”:

As the printer shows, all information from the previous table is also accessible via public fields (id, feature_types, packages, properties, predict_types) Additionally, predict_type returns the currently selected predict type of the learner.

The field param_set stores a description of hyperparameter settings:

The set of hyperparamter values is stored inside the parameter set in the field values. By assigning a named list to this field, we change the active hyperparameters of the learner:

The field model stores the result of the training step. As we have not yet learned a model, this is NULL:

Train and Predict

Is recommended to train the learner via the mlr3::Experiment class, we only train the learner here directly to showcase more of its API.

First, we retrieve the “iris” task from the mlr3::mlr_tasks dictionary, and then apply the train method on the complete task.

The learner returns itself, the fitted model is stored in the field “model”:

Next, we generate predictions on the complete iris data set:

The returned mlr3::PredictionClassif object stores the predicted labels (“response”) as well as the true labels (“truth”).

By simply counting the number of correct predictions and dividing by the number of observations, we can calculate the accuracy:

tab = as.data.table(predictions)
mean(tab$response == tab$truth)
#> [1] 0.96

Note that this measure is over-optimistic. As we did not used an independent test set, this is the re-substitution error.