ResultData

Internal object to store results in a list of data.tables, arranged in a star schema. Working directly on this data structure is discouraged, as it may change in the future without further warning.

The main motivation of this data structure is the necessity to avoid storing duplicated R6 objects. While this is usually no problem in a single R session, serialization via serialize() (which is used in save()/saveRDS() or during parallelization) leads to objects with unreasonable memory requirements.
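
The idea behind the star schema can be illustrated with a minimal sketch. It assumes simplified stand-ins: plain strings take the place of R6 objects, and the table and column names only loosely mirror the ones printed in the Examples section. This is not the actual internal layout.

```r
library(data.table)

# Dimension table: each distinct (potentially large) object is stored once,
# keyed by a hash. Strings stand in for R6 learner objects (hypothetical data).
learners = data.table(
  learner_hash = c("h_rpart", "h_ranger"),
  learner = c("<rpart learner>", "<ranger learner>"),
  key = "learner_hash"
)

# Fact table: one row per iteration, referencing the learner only by its hash.
fact = data.table(
  uhash = rep(c("u1", "u2"), each = 3),
  iteration = rep(1:3, times = 2),
  learner_hash = rep(c("h_rpart", "h_ranger"), each = 3)
)

# Flattening joins the dimension table back onto the fact table.
flat = merge(fact, learners, by = "learner_hash", sort = FALSE)
print(flat)
```

Six iterations reference only two stored learner objects, so serializing the list of tables writes each object once rather than once per iteration.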

Public fields

data: (list())
List of data.table::data.table(), arranged in a star schema. Do not operate directly on this list.

Active bindings

task_type: (character(1))
Returns the task type of stored objects, e.g. "classif" or "regr". Returns NULL if the ResultData is empty.

Methods

Method `new()`

Creates a new instance of this R6 class. An alternative construction method is provided by as_result_data().

Usage

ResultData$new(data = NULL, data_extra = NULL, store_backends = TRUE)

Arguments

data: (data.table::data.table() | NULL)
Do not initialize this object yourself; use as_result_data() instead.
data_extra: (list())
Additional data to store. This can be used to store additional information for each iteration.
store_backends: (logical(1))
If set to FALSE, the backends of the Tasks provided in data are removed.

Method `uhashes()`

Returns all unique hashes (uhash values) of all included ResampleResults.

Usage

ResultData$uhashes(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

character().

Method `uhash_table()`

Returns a data.table with columns uhash, learner_id, task_id and resampling_id for the given view. The uhash uniquely identifies an individual ResampleResult.

Usage

ResultData$uhash_table(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

data.table()

Method `iterations()`

Returns the number of recorded iterations / experiments.

Usage

ResultData$iterations(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

integer(1).
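
For read-only inspection, the effect of the view argument can be sketched as follows. The sketch assumes that a BenchmarkResult keeps its ResultData in a private field named `.data`; that is an internal detail, not a documented accessor, and may change between versions.

```r
library(mlr3)

# Two learners on one task yield two ResampleResults, i.e. two uhashes.
design = benchmark_grid(
  tasks = tsk("iris"),
  learners = lrns(c("classif.featureless", "classif.debug")),
  resamplings = rsmp("cv", folds = 3)
)
bmr = benchmark(design)

# ASSUMPTION: the ResultData lives in the private field `.data`.
# This reaches into internals and is meant for inspection only.
rdata = bmr$.__enclos_env__$private$.data

uhashes = rdata$uhashes()
n_all = rdata$iterations()                   # iterations across all results
n_one = rdata$iterations(view = uhashes[1])  # iterations of a single ResampleResult
```

With 3-fold cross-validation and two learners, n_all counts all six iterations, while n_one counts only the three that belong to the selected uhash.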

Method `tasks()`

Returns a table of included Tasks.

Usage

ResultData$tasks(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

data.table() with columns "task_hash" (character()) and "task" (Task).

Method `learners()`

Returns a table of included Learners.

Usage

ResultData$learners(view = NULL, states = TRUE, reassemble = TRUE)

Arguments

view: character(1)
Single uhash to restrict the results to.
states: (logical(1))
If TRUE, returns a learner for each iteration/experiment in the ResultData object. If FALSE, returns an exemplary learner (without state) for each ResampleResult.
reassemble: (logical(1))
Reassemble the learners, i.e. restore the state and the hyperparameters, which are stored separately, before returning the learners.

Returns

data.table() with columns "learner_hash" (character()) and "learner" (Learner).

Method `learner_states()`

Returns a list of states of included Learners without reassembling the learners.

Usage

ResultData$learner_states(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

list() of list().

Method `resamplings()`

Returns a table of included Resamplings.

Usage

ResultData$resamplings(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.

Returns

data.table() with columns "resampling_hash" (character()) and "resampling" (Resampling).

Method `predictions()`

Returns a list of Prediction objects.

Usage

ResultData$predictions(view = NULL, predict_sets = "test")

Arguments

view: character(1)
Single uhash to restrict the results to.
predict_sets: (character())
Prediction sets to operate on, used in aggregate() to extract the matching predict_sets from the ResampleResult. Multiple predict sets are calculated by the respective Learner during resample()/benchmark(). Must be a non-empty subset of {"train", "test", "internal_valid"}. If multiple sets are provided, these are first combined to a single prediction object. Default is "test".

Returns

list() of Prediction.

Method `prediction()`

Returns a single combined Prediction object.

Usage

ResultData$prediction(view = NULL, predict_sets = "test")

Arguments

view: character(1)
Single uhash to restrict the results to.
predict_sets: (character())
Prediction sets to operate on, used in aggregate() to extract the matching predict_sets from the ResampleResult. Multiple predict sets are calculated by the respective Learner during resample()/benchmark(). Must be a non-empty subset of {"train", "test", "internal_valid"}. If multiple sets are provided, these are first combined to a single prediction object. Default is "test".

Returns

Prediction.

Method `data_extra()`

Returns additional data stored.

Usage

ResultData$data_extra(view = NULL)

Arguments

view: character(1)
Single uhash to restrict the results to.
view: character(1)
Single uhash to restrict the results to.

Returns

data.table::data.table().

Method `combine()`

Combines multiple ResultData objects, modifying self in-place.

Usage

ResultData$combine(rdata)

Arguments

rdata: (ResultData).

Returns

self (invisibly).

Method `sweep()`

Updates the ResultData object, removing rows from all tables which are no longer referenced by the fact table. For example, this can be called after filtering or subsetting the fact table.

Usage

ResultData$sweep()

Returns

Modified self (invisibly).
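
What sweeping means can be sketched on a toy star schema. The tables below are hypothetical stand-ins with strings instead of Task objects, and the code illustrates only the principle, not the actual implementation of sweep().

```r
library(data.table)

# Toy fact table and one dimension table (hypothetical data).
fact = data.table(
  uhash = c("u1", "u1", "u2"),
  iteration = c(1L, 2L, 1L),
  task_hash = c("t1", "t1", "t2")
)
tasks = data.table(
  task_hash = c("t1", "t2"),
  task = c("<task 1>", "<task 2>")
)

# Subsetting the fact table leaves task "t2" unreferenced.
fact = fact[uhash == "u1"]

# Sweep: keep only the dimension rows still referenced by the fact table.
tasks = tasks[task_hash %in% fact$task_hash]
print(tasks)
```

After the subset, only the row for "t1" remains in the dimension table, so the orphaned object no longer takes up memory.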

Method `marshal()`

Marshals all stored learner models. This will do nothing to models that are already marshaled.

Usage

ResultData$marshal(...)

Arguments

...: (any)
Additional arguments passed to marshal_model().

Method `unmarshal()`

Unmarshals all stored learner models. This will do nothing to models which are not marshaled.

Usage

ResultData$unmarshal(...)

Arguments

...: (any)
Additional arguments passed to unmarshal_model().

Method `discard()`

Shrinks the object by discarding parts of the stored data.

Usage

ResultData$discard(backends = FALSE, models = FALSE)

Arguments

backends: (logical(1))
If TRUE, the DataBackend is removed from all stored Tasks.
models: (logical(1))
If TRUE, the stored model is removed from all Learners.

Returns

Modified self (invisibly).

Method `as_data_table()`

Combines internal tables into a single flat data.table().

Usage

ResultData$as_data_table(
  view = NULL,
  reassemble_learners = TRUE,
  convert_predictions = TRUE,
  predict_sets = "test"
)

Arguments

view: character(1)
Single uhash to restrict the results to.
reassemble_learners: (logical(1))
Reassemble the learners?
convert_predictions: (logical(1))
Convert PredictionData to Prediction?
predict_sets: (character())
Prediction sets to operate on, used in aggregate() to extract the matching predict_sets from the ResampleResult. Multiple predict sets are calculated by the respective Learner during resample()/benchmark(). Must be a non-empty subset of {"train", "test", "internal_valid"}. If multiple sets are provided, these are first combined to a single prediction object. Default is "test".

Method `logs()`

Get a table of recorded learner logs.

Usage

ResultData$logs(view = NULL, condition)

Arguments

view: character(1)
Single uhash to restrict the results to.
condition: (character(1))
The condition to extract. One of "message", "warning" or "error".

Returns

data.table::data.table().

Method `set_threshold()`

Sets the threshold for the response prediction of classification learners, provided they have produced probability predictions.

Usage

ResultData$set_threshold(view = NULL, threshold, ties_method = "random")

Arguments

view: character(1)
Single uhash to restrict the results to.
threshold: (numeric(1))
Threshold value.
ties_method: (character(1))
Method to handle ties in probabilities when selecting a class label. Must be one of "random", "first" or "last" (corresponding to the same options in max.col()).

"random": Randomly select one of the tied class labels (default).
"first": Select the first class label among tied values.
"last": Select the last class label among tied values.

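The tie-handling options correspond to those of base R's max.col(), so their effect can be sketched on a small probability matrix. The matrix is made up for illustration, and the sketch shows only how max.col() resolves ties, not how set_threshold() applies the threshold internally.

```r
# Hypothetical probabilities: rows are observations, columns are classes.
prob = matrix(
  c(0.5, 0.5, 0.0,   # tie between "a" and "b"
    0.2, 0.4, 0.4,   # tie between "b" and "c"
    0.1, 0.2, 0.7),  # no tie
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)

# "first" picks the leftmost of the tied classes, "last" the rightmost.
first = colnames(prob)[max.col(prob, ties.method = "first")]
last = colnames(prob)[max.col(prob, ties.method = "last")]

print(first)  # "a" "b" "c"
print(last)   # "b" "c" "c"

# "random" draws one of the tied classes, so results vary between runs.
random = colnames(prob)[max.col(prob, ties.method = "random")]
```

Rows without a tie yield the same label under every method; only the tied rows differ.
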
Method `clone()`

The objects of this class are cloneable with this method.

Usage

ResultData$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# table overview
print(ResultData$new()$data)
#> $fact
#> Key: <uhash, iteration>
#> Empty data.table (0 rows and 8 cols): uhash,iteration,learner_state,prediction,learner_hash,task_hash...
#> 
#> $uhashes
#> Empty data.table (0 rows and 1 cols): uhash
#> 
#> $tasks
#> Key: <task_hash>
#> Empty data.table (0 rows and 2 cols): task_hash,task
#> 
#> $learners
#> Key: <learner_phash>
#> Empty data.table (0 rows and 2 cols): learner_phash,learner
#> 
#> $resamplings
#> Key: <resampling_hash>
#> Empty data.table (0 rows and 2 cols): resampling_hash,resampling
#> 
#> $learner_components
#> Key: <learner_hash>
#> Empty data.table (0 rows and 2 cols): learner_hash,learner_param_vals
#> 
#> $data_extras
#> Key: <uhash, iteration>
#> Empty data.table (0 rows and 3 cols): uhash,iteration,data_extra
#>
