This is the abstract base class for data backends. See DataBackendDataTable or DataBackendMatrix for example implementations of this interface.

Data backends provide a layer of abstraction over various data storage systems. The operations that a backend must implement are listed in the Methods section.

Note that all data access is handled transparently via the Task. Working directly with the DataBackend is not recommended.

Format

R6::R6Class object.

Construction

DataBackend$new(data, primary_key = NULL)

Fields

  • nrow :: integer(1)
    Number of rows (observations).

  • ncol :: integer(1)
    Number of columns (variables), including the primary key column.

  • colnames :: character()
    Vector of all column names, including the primary key column.

  • rownames :: integer() | character()
    Vector of all distinct row identifiers, i.e. the values of the primary key column.
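
The fields above can be illustrated with a minimal sketch. ToyBackend is a hypothetical class written only for this illustration; it is not part of the documented package and does not inherit from DataBackend. It assumes the R6 package is installed and simply wraps a data.frame, exposing the four fields as read-only active bindings.

```r
library(R6)

# Hypothetical toy backend (an assumption for illustration, not the real
# implementation): wraps a plain data.frame and exposes nrow, ncol,
# colnames and rownames as active bindings.
ToyBackend = R6Class("ToyBackend",
  public = list(
    primary_key = NULL,

    initialize = function(data, primary_key) {
      # The primary key must name an existing column with unique values.
      stopifnot(is.data.frame(data), primary_key %in% names(data))
      stopifnot(anyDuplicated(data[[primary_key]]) == 0L)
      private$.data = data
      self$primary_key = primary_key
    }
  ),

  active = list(
    # Number of rows (observations).
    nrow = function() nrow(private$.data),
    # Number of columns, including the primary key column.
    ncol = function() ncol(private$.data),
    # All column names, including the primary key column.
    colnames = function() names(private$.data),
    # Row identifiers, i.e. the values of the primary key column.
    rownames = function() private$.data[[self$primary_key]]
  ),

  private = list(.data = NULL)
)

toy = ToyBackend$new(data.frame(id = 1:3, x = c(0.1, 0.2, 0.3)), primary_key = "id")
toy$nrow
toy$ncol
toy$colnames
toy$rownames
```

Keeping the data in a private field and deriving the fields from it on access means they cannot drift out of sync with the stored data.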

Methods

See also

Examples

data = data.table::data.table(id = 1:5, x = runif(5), y = sample(letters[1:3], 5, replace = TRUE))
b = DataBackendDataTable$new(data, primary_key = "id")
print(b)
#> <DataBackendDataTable> (5x3)
#>
#> Public: colnames, compact_seq, data(), distinct(), formats, hash,
#>   head(), missing(), ncol, nrow, primary_key, rownames
#>    id         x y
#> 1:  1 0.9566087 a
#> 2:  2 0.8444304 b
#> 3:  3 0.2227686 b
#> 4:  4 0.4201702 a
#> 5:  5 0.4199829 c

b$head(2)
#>    id         x y
#> 1:  1 0.9566087 a
#> 2:  2 0.8444304 b

b$data(rows = 1:2, cols = "x")
#>            x
#> 1: 0.9566087
#> 2: 0.8444304

b$distinct("y")
#> $y
#> [1] "a" "b" "c"
#>

b$missing(rows = b$rownames, cols = names(data))
#> id  x  y
#>  0  0  0