Internal error: DT passed to assign has not been allocated enough column slots #4100

Open
tyner opened this issue Dec 10, 2019 · 2 comments · May be fixed by #5269

tyner commented Dec 10, 2019

It appears that set() does not always allocate enough slots. Example:

library(data.table)
DT = data.table(a = runif(10))

my.set = function(x, i = NULL, j, value) {
    # check the argument x rather than the global DT
    if (truelength(x) < length(x)) {
       stop("bad input")
    }
    if (truelength(x) == length(x)) {
        cat("do we need to call alloc.col or setDT here?\n")
    }
    # note: switching from set() to DT[, c(j) := value] works here
    set(x, i, j, value)
    
    if (truelength(x) < length(x)) {
       stop("bad output")
    }
    invisible(x)
}

set.seed(6860)
while(ncol(DT) < 10000L){

    new.name = paste("V", ncol(DT) + 1L)
    new.value = sample(nrow(DT))
    
    my.set(DT, j = new.name, value = new.value)
}

As noted in the code comment above, switching from "set(x, i, j, value)" to "DT[, c(j) := value]" makes it work. The situation seems similar to what was reported in #1830.

sessionInfo() is:

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS: /home/btyner/R360/lib64/R/lib/libRblas.so
LAPACK: /home/btyner/R360/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.6.0

DavorJ commented Jul 10, 2023

This is still an issue in 1.14.8. Here is a minimal example that could serve as a unit test:

# options('datatable.alloccol' = 1024L)
df <- data.table::data.table(x = 1)

for (i in 1:2000) data.table::set(df, j = as.character(i), value = 1)

# Error in data.table::set(df, j = as.character(i), value = 1) :   
# Internal error: DT passed to assign has not been allocated enough column slots. l=1025, tl=1025, adding 1

Adjusting datatable.alloccol resolves the issue.
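The l and tl values in the error message correspond to length() and truelength() of the table, so the remaining headroom can be checked directly. A minimal sketch, assuming data.table is installed; slots_left is a hypothetical helper name for illustration, not part of data.table:

```r
library(data.table)

# Hypothetical helper (not part of data.table): number of spare column
# pointer slots left before set() can no longer add a new column.
slots_left <- function(x) truelength(x) - length(x)

df <- data.table(x = 1)
spare_before <- slots_left(df)

# Adding one column with set() uses up one spare slot;
# set() itself does not allocate more.
set(df, j = "y", value = 2)
spare_after <- slots_left(df)

cat("spare slots before:", spare_before, "after:", spare_after, "\n")
```

When slots_left() reaches zero, the next set() call that adds a column is the one that triggers the internal error.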

DavorJ commented Jul 10, 2023

setalloccol() is executed at the end of data.table(), with its default argument taken from getOption('datatable.alloccol').

Would adding alloccol as an argument to data.table() be a viable option, as a way to override the default? This would also draw attention to the relevant documentation as requested in #1831.

Currently, one has two options: either adjust the global option before creating the DT, or call setalloccol() again on the created DT.
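The two workarounds could look roughly like this. It is a sketch only: the value 4096L is an arbitrary assumption chosen to exceed the 2000 columns added in the example above, and dt1 and dt2 are illustrative names:

```r
library(data.table)

# Option 1: raise the global option before creating the table, so that
# data.table() reserves more spare column slots up front.
old <- options(datatable.alloccol = 4096L)
dt1 <- data.table(x = 1)
for (i in 1:2000) set(dt1, j = as.character(i), value = 1)
options(old)  # restore the previous setting

# Option 2: keep the default option and enlarge an existing table.
# The second argument is the number of spare slots to make available.
dt2 <- data.table(x = 1)
setalloccol(dt2, 4096L)
for (i in 1:2000) set(dt2, j = as.character(i), value = 1)
```

Option 1 affects every table created while the option is raised. Option 2 is local to one table, but has to be repeated whenever that table might run out of slots.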

jangorecki modified the milestones: 1.14.11, 1.15.1 (Oct 29, 2023)
MichaelChirico added the "top request" (one of our most-requested issues) label (Apr 14, 2024)
MichaelChirico modified the milestones: 1.16.0, 1.17.0 (Jul 10, 2024)