This is the abstract base class for data backends.

Data backends provide a layer of abstraction over various data storage systems. The operations a backend must implement are listed in the Methods section.

Note that all data access is handled transparently via the Task. It is not recommended to work directly with the DataBackend.
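As a rough illustration of access through a Task, the sketch below wraps a backend in a regression task and reads rows from the task rather than from the backend. The task id "demo" and the toy columns are made up for this example, and using TaskRegr$new() with a backend argument is an assumption about the installed mlr3 version.

```r
library(mlr3)
library(data.table)

# Toy data, invented for this sketch: "id" serves as the primary key.
data = data.table(id = 1:5, x = runif(5), y = runif(5))
b = DataBackendDataTable$new(data, primary_key = "id")

# Wrap the backend in a task; "demo" is a hypothetical task id.
task = TaskRegr$new(id = "demo", backend = b, target = "y")

# Data is requested from the task, which queries the backend internally.
first_rows = task$data(rows = 1:2)
print(first_rows)
```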

See DataBackendDataTable or DataBackendMatrix for example implementations of this interface.

Format

R6::R6Class object.

Construction

Note: This object is typically constructed via a derived class, e.g. DataBackendDataTable or DataBackendMatrix, or via the S3 method as_data_backend().

DataBackend$new(data, primary_key = NULL, data_formats = "data.table", converters = list())
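A minimal sketch of the as_data_backend() route mentioned above, assuming a method for data.frame input is available in the installed mlr3 version. The data frame is invented for illustration, and no primary key is passed, so the backend is left to choose its own row identifiers.

```r
library(mlr3)

# Toy data frame, invented for this sketch.
df = data.frame(x = c(0.1, 0.2, 0.3), y = c("a", "b", "a"))

# Convert to a backend via the S3 generic instead of calling $new() directly.
b = as_data_backend(df)

# The result should inherit from the abstract DataBackend class.
print(class(b))
print(b$nrow)
```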

Fields

  • nrow :: integer(1)
    Number of rows (observations).

  • ncol :: integer(1)
    Number of columns (variables), including the primary key column.

  • colnames :: character()
    Returns vector of all column names, including the primary key column.

  • rownames :: integer() | character()
    Returns vector of all distinct row identifiers, i.e. the primary key column.

  • hash :: character(1)
    Returns a unique hash for this backend. This hash is cached.

  • data_formats :: character()
    Vector of supported data formats. One of these formats can be selected in the $data() method.
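The fields listed above can be read directly from any concrete backend. The following is a small sketch using DataBackendDataTable with made-up data. It prints the values rather than asserting specific output, since the hash and the random columns differ between runs.

```r
library(mlr3)
library(data.table)

# Toy data, invented for this sketch: "id" serves as the primary key.
data = data.table(id = 1:5, x = runif(5), y = sample(letters[1:3], 5, replace = TRUE))
b = DataBackendDataTable$new(data, primary_key = "id")

print(b$nrow)      # number of rows (observations)
print(b$ncol)      # number of columns, including the primary key column
print(b$colnames)  # all column names, including the primary key column
print(b$rownames)  # distinct row identifiers, taken from the primary key column
print(b$hash)      # hash identifying this backend
```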

Methods

See also

Examples

data = data.table::data.table(id = 1:5, x = runif(5), y = sample(letters[1:3], 5, replace = TRUE))
b = DataBackendDataTable$new(data, primary_key = "id")
print(b)
#> <DataBackendDataTable> (5x3)
#>    id         x y
#> 1:  1 0.4089440 a
#> 2:  2 0.8209513 c
#> 3:  3 0.9188573 b
#> 4:  4 0.2825283 a
#> 5:  5 0.9611048 a
b$head(2)
#>    id         x y
#> 1:  1 0.4089440 a
#> 2:  2 0.8209513 c
b$data(rows = 1:2, cols = "x")
#>            x
#> 1: 0.4089440
#> 2: 0.8209513
b$distinct(rows = b$rownames, "y")
#> $y
#> [1] "a" "c" "b"
#>
b$missings(rows = b$rownames, cols = names(data))
#> id  x  y
#>  0  0  0