Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.

## Installation

remotes::install_github("mlr-org/mlr3")

## Example

library(mlr3)
set.seed(1)

task_iris = TaskClassif$new(id = "iris", backend = iris, target = "Species")
task_iris
## <TaskClassif:iris> (150 x 5)
## * Target: Species
## * Properties: multiclass
## * Features (4):
##   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

# load learner and set hyperparameter
learner = lrn("classif.rpart", cp = 0.01)

### Basic train + predict

# train/test split
train_set = sample(task_iris$nrow, 0.8 * task_iris$nrow)
test_set = setdiff(seq_len(task_iris$nrow), train_set)

# train the model
learner$train(task_iris, row_ids = train_set)

# predict data
prediction = learner$predict(task_iris, row_ids = test_set)

# calculate performance
prediction$confusion
##             truth
## response     setosa versicolor virginica
##   setosa         11          0         0
##   versicolor      0         12         1
##   virginica       0          0         6
measure = msr("classif.acc")
prediction$score(measure)
## classif.acc
##   0.9666667
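
The accuracy reported above can be cross-checked against the confusion matrix by hand. The following is a minimal base-R sketch that re-enters the printed counts manually (the variable names `conf` and `acc` are illustrative and not part of mlr3); accuracy is the share of observations on the diagonal.

```r
# Confusion matrix copied from the printed output above
# (rows = predicted response, columns = truth).
classes = c("setosa", "versicolor", "virginica")
conf = matrix(
  c(11, 0, 0,   # truth = setosa
    0, 12, 0,   # truth = versicolor
    0, 1, 6),   # truth = virginica
  nrow = 3,
  dimnames = list(response = classes, truth = classes)
)

# Accuracy = correctly classified observations / all observations
acc = sum(diag(conf)) / sum(conf)
print(acc)
```

Here 29 of the 30 test observations lie on the diagonal, which matches the `classif.acc` value shown above.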

### Resample

# automatic resampling
resampling = rsmp("cv", folds = 3L)
rr = resample(task_iris, learner, resampling)
## INFO  [15:50:38.817] Applying learner 'classif.rpart' on task 'iris' (iter 1/3)
## INFO  [15:50:38.843] Applying learner 'classif.rpart' on task 'iris' (iter 2/3)
## INFO  [15:50:38.861] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
rr$score(measure)
##             task task_id               learner    learner_id
##           <list>  <char>                <list>        <char>
## 1: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
## 2: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
## 3: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
##        resampling resampling_id iteration prediction classif.acc
##            <list>        <char>     <int>     <list>       <num>
## 1: <ResamplingCV>            cv         1     <list>        0.92
## 2: <ResamplingCV>            cv         2     <list>        0.92
## 3: <ResamplingCV>            cv         3     <list>        0.94
rr$aggregate(measure)
## classif.acc
##   0.9266667

## Why a rewrite?

mlr was first released to CRAN in 2013, and its core design and architecture date back even further. The addition of many features over time has led to feature creep, which makes mlr hard to maintain and hard to extend. We also think that, while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. In addition, many helpful R libraries did not exist when mlr was created, and including them would result in non-trivial API changes.

## Design principles

• Only the basic building blocks for machine learning are implemented in this package.
• Focus on computation here. No visualization or other stuff. That can go in extra packages.
• Overcome the limitations of R’s S3 classes with the help of R6.
• Embrace R6, clean OO-design, object state-changes and reference semantics. This might be less “traditional R”, but seems to fit mlr nicely.
• Embrace data.table for fast and convenient data frame computations.
• Combine data.table and R6; for this we will make heavy use of list columns in data.tables.
• Be light on dependencies. mlr3 requires the following packages at runtime:
• Reflections: Objects are queryable for properties and capabilities, allowing you to program on them.
• Additional functionality that comes with extra dependencies:
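
To illustrate the principle of combining data.table and R6 through list columns, here is a minimal sketch. It is not taken from the mlr3 code base: the `Counter` class and the table `tab` are hypothetical and only show the general pattern of storing one R6 object per row and relying on reference semantics.

```r
library(R6)
library(data.table)

# Hypothetical toy class for illustration; not part of mlr3.
Counter = R6Class("Counter",
  public = list(
    count = 0L,
    add = function(n = 1L) {
      self$count = self$count + n
      invisible(self)
    }
  )
)

# One R6 object per row, stored in a list column.
tab = data.table(
  id = c("a", "b"),
  obj = list(Counter$new(), Counter$new())
)

# Reference semantics: the object stored in the table is modified in place,
# without reassigning the column.
tab$obj[[1]]$add(5L)

# Pull a plain atomic column out of the objects for regular data.table work.
tab[, count := vapply(obj, function(x) x$count, integer(1))]
print(tab[, .(id, count)])
```

The result tables shown in the resampling example above follow the same idea: columns such as `task` and `learner` are list columns whose cells hold objects rather than atomic values.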

# Talks, Workshops, etc.

mlr-outreach holds all outreach activities related to mlr and mlr3.

mlr3 talk at useR! 2019 conference in Toulouse, France: