revdep simtrial test ERROR #5826

Closed
tdhock opened this issue Dec 13, 2023 · 9 comments · Fixed by #5881

@tdhock
Member

tdhock commented Dec 13, 2023

The reverse dependency simtrial recently submitted an update that shows the error below with data.table master (but not with data.table from CRAN).

* checking tests ...
  Running 'testthat.R'
 ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
  == Failed tests ================================================================
  -- Failure ('test-independent_test_pvalue_maxcombo.R:83:3'): the p-values correspond to pvalue_maxcombo --
  `p1` not equal to `p2`.
  1/1 mismatches
  [1] 1 - 0.0716 == 0.928
  -- Failure ('test-independent_test_wlr.R:25:3'): the z values match with the correspondings in fh_weight --
  c(z1[1], z1[7:9]) not equal to `z2`.
  3/4 mismatches (average diff: 5.54)
  [1] 5.67 - -0.1777 == 5.85
  [2] 4.77 - -0.7132 == 5.48
  [3] 5.26 - -0.0381 == 5.30
  
  [ FAIL 2 | WARN 0 | SKIP 0 | PASS 182 ]
  Error: Test failures
  Execution halted

The output above indicates that the failures are numerical, which seems odd. But I double-checked on another machine, and the test failure is real.

git bisect identifies PR #4470 as the cause. That PR seems to be related to .SDcols, which seems inconsistent with the numerical errors in the test failure. I don't understand this, but maybe @jangorecki and @ColeMiller1 could comment or investigate, please? (You authored/commented on that PR.)

The test source code is on GitHub.

@tdhock tdhock added the revdep Reverse dependencies label Dec 13, 2023
@jangorecki
Member

jangorecki commented Dec 14, 2023

Thanks for reporting.
I had a look at
https://github.com/Merck/simtrial/blob/93b91e1b34eadedb7f13a6de6507443f5c743df0/R/fh_weight.R#L144
https://github.com/Merck/simtrial/blob/93b91e1b34eadedb7f13a6de6507443f5c743df0/R/fh_weight.R#L235
https://github.com/Merck/simtrial/blob/93b91e1b34eadedb7f13a6de6507443f5c743df0/R/pvalue_maxcombo.R#L70
and I don't see any use of .SD there. There are some usages of [ that rely on with=FALSE behavior without explicitly setting with=FALSE, which feels like it could be related.

The next step would probably be to find the point at which a different value first appears, which could/should involve the maintainer of the package.
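
As an illustration of the with=FALSE point above, here is a minimal sketch of selecting columns from a variable with the selection mode spelled out. The table dt and the vector cols are hypothetical and are not taken from simtrial; the sketch only shows two explicit forms that data.table documents, and makes no claim about what simtrial or the bisected PR actually do.

```r
library(data.table)

# Hypothetical table and column names, for illustration only.
dt <- data.table(x = 1:3, y = 4:6, z = 7:9)
cols <- c("x", "z")

# Explicit form 1: with = FALSE tells [ to treat j as column names/positions.
explicit <- dt[, cols, with = FALSE]

# Explicit form 2: the .. prefix looks up `cols` in the calling scope.
prefixed <- dt[, ..cols]

print(names(explicit))
print(isTRUE(all.equal(explicit, prefixed)))
```

Either form removes any dependence on how [ guesses whether j is a column expression or a column selection.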

@ColeMiller1
Contributor

I had a look. It looks like it's actually the package survMisc that causes the problem. The call is to ten.data.frame.

https://github.com/cran/survMisc/blob/master/R/ten.R

I could not see any peeling away of parentheses. I also don't think the other two changes (the error when a logical vector does not match the length of names, and the edge case of :) are in play. I am not sure if I will have more time to dig, but I wouldn't be surprised if it was related to #5084.

@tdhock
Member Author

tdhock commented Jan 4, 2024

https://github.com/dardisco/survMisc has version 0.5.1 (2019), whereas CRAN has version 0.5.6 (2022), which suggests the author is no longer using that GitHub repo.

@tdhock
Member Author

tdhock commented Jan 4, 2024

I confirm the observation of @ColeMiller1. The code in simtrial does not give different results with data.table master, but the test in simtrial compares against a result computed by survMisc, which does give a different result with data.table master. To demonstrate this, I modified the test at https://github.com/Merck/simtrial/blob/main/tests/testthat/test-independent_test_wlr.R#L25 and put the following in survMisc-bug.R:

set.seed(1234)
y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
adjust.methods <- "asymp"
wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
ties.method <- "efron"
one.sided <- TRUE
HT.est <- FALSE
max <- TRUE
alpha <- 0.025
data.anal <- data.frame(cbind(y$tte, y$event, y$treatment))
fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
# Testing
survMisc::comp(fit, p = sapply(wt, function(x) {
  x[1]
}), q = sapply(wt, function(x) {
  x[2]
}))
tst.rslt <- attr(fit, "lrt")
z1 <- tst.rslt$Z
a2 <- y |> simtrial::counting_process(arm = "experimental")
aa <- simtrial::fh_weight(a2, rho_gamma = data.frame(rho = c(0, 0, 1, 1), gamma = c(0, 1, 0, 1)))
result.simtrial <- aa$z
result.survMisc <- c(z1[1], z1[7:9])
print(rbind(result.simtrial, result.survMisc))
packageVersion("data.table")
testthat::expect_equal(result.survMisc, result.simtrial, tolerance = 0.00001)

Then I ran the following:

  • First, run survMisc-bug.R using data.table master: the test fails.
  • Then, run survMisc-bug.R using data.table from CRAN: the test passes.
(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> adjust.methods <- "asymp"
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> ties.method <- "efron"
> one.sided <- TRUE
> HT.est <- FALSE
> max <- TRUE
> alpha <- 0.025
> data.anal <- data.frame(cbind(y$tte, y$event, y$treatment))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> # Testing
> survMisc::comp(fit, p = sapply(wt, function(x) {
+   x[1]
+ }), q = sapply(wt, function(x) {
+   x[2]
+ }))
                     Q         Var         Z pNorm
1             15.51395     7.47802  5.673215     4
n            970.00000 37994.00000  4.976388     9
sqrtN          0.22308   496.31065  0.010014     3
S1            12.66143     5.12921  5.590591     8
S2            11.71357     4.96894  5.254815     6
FH_p=0_q=0    -0.48605     7.47802 -0.177740     2
FH_p=0_q=1     2.66342     0.31232  4.765805     7
FH_p=1_q=0    12.08527     5.27276  5.263051     5
FH_p=1_q=1    -0.26598     0.15960 -0.665792     1
              maxAbsZ        Var       Q pSupBr
1          1.5514e+01 7.4780e+00 5.67321      5
n          9.7000e+02 3.7994e+04 4.97639      4
sqrtN      2.0003e+01 4.9631e+02 0.89789      1
S1         1.2661e+01 5.1292e+00 5.59059      9
S2         1.1714e+01 4.9689e+00 5.25481      7
FH_p=0_q=0 2.0428e+00 7.4780e+00 0.74702      2
FH_p=0_q=1 2.6634e+00 3.1232e-01 4.76580      8
FH_p=1_q=0 1.2085e+01 5.2728e+00 5.26305      6
FH_p=1_q=1 2.6598e-01 1.5960e-01 0.66579      3
> tst.rslt <- attr(fit, "lrt")
> z1 <- tst.rslt$Z
> a2 <- y |> simtrial::counting_process(arm = "experimental")
> aa <- simtrial::fh_weight(a2, rho_gamma = data.frame(rho = c(0, 0, 1, 1), gamma = c(0, 1, 0, 1)))
> result.simtrial <- aa$z
> result.survMisc <- c(z1[1], z1[7:9])
> print(rbind(result.simtrial, result.survMisc))
                      [,1]       [,2]        [,3]       [,4]
result.simtrial -0.1777396 -0.7132094 -0.03808899 -0.6657923
result.survMisc  5.6732146  4.7658047  5.26305082 -0.6657923
> packageVersion("data.table")
[1] ‘1.14.99’
> testthat::expect_equal(result.survMisc, result.simtrial, tolerance = 0.00001)
Error: `result.survMisc` not equal to `result.simtrial`.
3/4 mismatches (average diff: 5.54)
[1] 5.67 - -0.1777 == 5.85
[2] 4.77 - -0.7132 == 5.48
[3] 5.26 - -0.0381 == 5.30
Execution halted
(base) tdhock@tdhock-MacBook:~/R$ R CMD INSTALL data.table_1.14.6.tar.gz 
Loading required package: grDevices
* installing to library ‘/home/tdhock/lib/R/library’
* installing *source* package ‘data.table’ ...
** package ‘data.table’ successfully unpacked and MD5 sums checked
** using staged installation
gcc 12.3.0
zlib 1.2.11 is available ok
R CMD SHLIB supports OpenMP without any extra hint
** libs
using C compiler: ‘gcc (GCC) 12.3.0’
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c assign.c -o assign.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c between.c -o between.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c bmerge.c -o bmerge.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c chmatch.c -o chmatch.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c cj.c -o cj.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c coalesce.c -o coalesce.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c dogroups.c -o dogroups.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fastmean.c -o fastmean.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fcast.c -o fcast.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fifelse.c -o fifelse.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fmelt.c -o fmelt.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c forder.c -o forder.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c frank.c -o frank.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fread.c -o fread.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c freadR.c -o freadR.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c froll.c -o froll.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c frollR.c -o frollR.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c frolladaptive.c -o frolladaptive.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fsort.c -o fsort.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fwrite.c -o fwrite.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c fwriteR.c -o fwriteR.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c gsumm.c -o gsumm.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c ijoin.c -o ijoin.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c init.c -o init.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c inrange.c -o inrange.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c nafill.c -o nafill.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c nqrecreateindices.c -o nqrecreateindices.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c openmp-utils.c -o openmp-utils.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c quickselect.c -o quickselect.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c rbindlist.c -o rbindlist.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c reorder.c -o reorder.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c shift.c -o shift.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c snprintf.c -o snprintf.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c subset.c -o subset.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c transpose.c -o transpose.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c types.c -o types.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c uniqlist.c -o uniqlist.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c utils.c -o utils.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c vecseq.c -o vecseq.o
gcc -I"/home/tdhock/lib/R/include" -DNDEBUG   -I/home/tdhock/include -march=core2   -fopenmp  -fpic  -march=core2  -c wrappers.c -o wrappers.o
gcc -shared -L/home/tdhock/lib/R/lib -L/home/tdhock/lib -Wl,-rpath=/home/tdhock/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o ijoin.o init.o inrange.o nafill.o nqrecreateindices.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -fopenmp -lz -L/home/tdhock/lib/R/lib -lR
PKG_CFLAGS = -fopenmp
PKG_LIBS = -fopenmp -lz
if [ "data.table.so" != "data_table.so" ]; then mv data.table.so data_table.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id data_table.so data_table.so; fi
installing to /home/tdhock/lib/R/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** installing vignettes
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)
(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> adjust.methods <- "asymp"
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> ties.method <- "efron"
> one.sided <- TRUE
> HT.est <- FALSE
> max <- TRUE
> alpha <- 0.025
> data.anal <- data.frame(cbind(y$tte, y$event, y$treatment))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> # Testing
> survMisc::comp(fit, p = sapply(wt, function(x) {
+   x[1]
+ }), q = sapply(wt, function(x) {
+   x[2]
+ }))
                     Q         Var         Z pNorm
1          -4.8605e-01  7.4780e+00 -0.177740     3
n           3.0000e+01  3.7994e+04  0.153909     4
sqrtN       2.2308e-01  4.9631e+02  0.010014     8
S1         -7.4277e-02  5.1292e+00 -0.032796     6
S2         -5.6210e-02  4.9689e+00 -0.025216     7
FH_p=0_q=0 -4.8605e-01  7.4780e+00 -0.177740     3
FH_p=0_q=1 -3.9858e-01  3.1233e-01 -0.713209     1
FH_p=1_q=0 -8.7462e-02  5.2728e+00 -0.038089     5
FH_p=1_q=1 -2.6598e-01  1.5960e-01 -0.665792     2
              maxAbsZ        Var       Q pSupBr
1          2.0428e+00 7.4780e+00 0.74702      6
n          1.9600e+02 3.7994e+04 1.00554      1
sqrtN      2.0003e+01 4.9631e+02 0.89789      2
S1         1.9429e+00 5.1292e+00 0.85789      4
S2         1.9229e+00 4.9689e+00 0.86261      3
FH_p=0_q=0 2.0428e+00 7.4780e+00 0.74702      6
FH_p=0_q=1 3.9858e-01 3.1232e-01 0.71321      7
FH_p=1_q=0 1.9624e+00 5.2728e+00 0.85462      5
FH_p=1_q=1 2.6598e-01 1.5960e-01 0.66579      8
> tst.rslt <- attr(fit, "lrt")
> z1 <- tst.rslt$Z
> a2 <- y |> simtrial::counting_process(arm = "experimental")
> aa <- simtrial::fh_weight(a2, rho_gamma = data.frame(rho = c(0, 0, 1, 1), gamma = c(0, 1, 0, 1)))
> result.simtrial <- aa$z
> result.survMisc <- c(z1[1], z1[7:9])
> print(rbind(result.simtrial, result.survMisc))
                      [,1]       [,2]        [,3]       [,4]
result.simtrial -0.1777396 -0.7132094 -0.03808899 -0.6657923
result.survMisc -0.1777396 -0.7132094 -0.03808899 -0.6657923
> packageVersion("data.table")
[1] ‘1.14.6’
> testthat::expect_equal(result.survMisc, result.simtrial, tolerance = 0.00001)
> 

Compare the result above (data.table from CRAN) with the result below (data.table master):

> print(rbind(result.simtrial, result.survMisc))
                      [,1]       [,2]        [,3]       [,4]
result.simtrial -0.1777396 -0.7132094 -0.03808899 -0.6657923
result.survMisc  5.6732146  4.7658047  5.26305082 -0.6657923

@tdhock
Member Author

tdhock commented Jan 4, 2024

A somewhat smaller survMisc-bug.R is:

set.seed(1234)
y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
comp.res <- survMisc::comp(fit, p = sapply(wt, "[", 1), q = sapply(wt, "[", 2))
attr(fit, "lrt")
packageVersion("data.table")

which gives the following output

(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> comp.res <- survMisc::comp(fit, p = sapply(wt, "[", 1), q = sapply(wt, "[", 2))
                     Q         Var         Z pNorm
1          -4.8605e-01  7.4780e+00 -0.177740     3
n           3.0000e+01  3.7994e+04  0.153909     4
sqrtN       2.2308e-01  4.9631e+02  0.010014     8
S1         -7.4277e-02  5.1292e+00 -0.032796     6
S2         -5.6210e-02  4.9689e+00 -0.025216     7
FH_p=0_q=0 -4.8605e-01  7.4780e+00 -0.177740     3
FH_p=0_q=1 -3.9858e-01  3.1233e-01 -0.713209     1
FH_p=1_q=0 -8.7462e-02  5.2728e+00 -0.038089     5
FH_p=1_q=1 -2.6598e-01  1.5960e-01 -0.665792     2
              maxAbsZ        Var       Q pSupBr
1          2.0428e+00 7.4780e+00 0.74702      6
n          1.9600e+02 3.7994e+04 1.00554      1
sqrtN      2.0003e+01 4.9631e+02 0.89789      2
S1         1.9429e+00 5.1292e+00 0.85789      4
S2         1.9229e+00 4.9689e+00 0.86261      3
FH_p=0_q=0 2.0428e+00 7.4780e+00 0.74702      6
FH_p=0_q=1 3.9858e-01 3.1232e-01 0.71321      7
FH_p=1_q=0 1.9624e+00 5.2728e+00 0.85462      5
FH_p=1_q=1 2.6598e-01 1.5960e-01 0.66579      8
> attr(fit, "lrt")
                     Q         Var         Z pNorm
1          -4.8605e-01  7.4780e+00 -0.177740     3
n           3.0000e+01  3.7994e+04  0.153909     4
sqrtN       2.2308e-01  4.9631e+02  0.010014     8
S1         -7.4277e-02  5.1292e+00 -0.032796     6
S2         -5.6210e-02  4.9689e+00 -0.025216     7
FH_p=0_q=0 -4.8605e-01  7.4780e+00 -0.177740     3
FH_p=0_q=1 -3.9858e-01  3.1233e-01 -0.713209     1
FH_p=1_q=0 -8.7462e-02  5.2728e+00 -0.038089     5
FH_p=1_q=1 -2.6598e-01  1.5960e-01 -0.665792     2
> packageVersion("data.table")
[1] ‘1.14.6’
> 
> 
(base) tdhock@tdhock-MacBook:~/R$ R CMD INSTALL ~/R/data.table
Loading required package: grDevices
* installing to library ‘/home/tdhock/lib/R/library’
* installing *source* package ‘data.table’ ...
** using staged installation
gcc 12.3.0
zlib 1.2.11 is available ok
R CMD SHLIB supports OpenMP without any extra hint
** libs
using C compiler: ‘gcc (GCC) 12.3.0’
gcc -shared -L/home/tdhock/lib/R/lib -L/home/tdhock/lib -Wl,-rpath=/home/tdhock/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o idatetime.o ijoin.o init.o inrange.o nafill.o negate.o nqrecreateindices.o openmp-utils.o programming.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -fopenmp -lz -L/home/tdhock/lib/R/lib -lR
PKG_CFLAGS = -fopenmp
PKG_LIBS = -fopenmp -lz
if [ "data.table.so" != "data_table.so" ]; then mv data.table.so data_table.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id data_table.so data_table.so; fi
installing to /home/tdhock/lib/R/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** installing vignettes
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)
(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> comp.res <- survMisc::comp(fit, p = sapply(wt, "[", 1), q = sapply(wt, "[", 2))
                     Q         Var         Z pNorm
1             15.51395     7.47802  5.673215     4
n            970.00000 37994.00000  4.976388     9
sqrtN          0.22308   496.31065  0.010014     3
S1            12.66143     5.12921  5.590591     8
S2            11.71357     4.96894  5.254815     6
FH_p=0_q=0    -0.48605     7.47802 -0.177740     2
FH_p=0_q=1     2.66342     0.31232  4.765805     7
FH_p=1_q=0    12.08527     5.27276  5.263051     5
FH_p=1_q=1    -0.26598     0.15960 -0.665792     1
              maxAbsZ        Var       Q pSupBr
1          1.5514e+01 7.4780e+00 5.67321      5
n          9.7000e+02 3.7994e+04 4.97639      4
sqrtN      2.0003e+01 4.9631e+02 0.89789      1
S1         1.2661e+01 5.1292e+00 5.59059      9
S2         1.1714e+01 4.9689e+00 5.25481      7
FH_p=0_q=0 2.0428e+00 7.4780e+00 0.74702      2
FH_p=0_q=1 2.6634e+00 3.1232e-01 4.76580      8
FH_p=1_q=0 1.2085e+01 5.2728e+00 5.26305      6
FH_p=1_q=1 2.6598e-01 1.5960e-01 0.66579      3
> attr(fit, "lrt")
                     Q         Var         Z pNorm
1             15.51395     7.47802  5.673215     4
n            970.00000 37994.00000  4.976388     9
sqrtN          0.22308   496.31065  0.010014     3
S1            12.66143     5.12921  5.590591     8
S2            11.71357     4.96894  5.254815     6
FH_p=0_q=0    -0.48605     7.47802 -0.177740     2
FH_p=0_q=1     2.66342     0.31232  4.765805     7
FH_p=1_q=0    12.08527     5.27276  5.263051     5
FH_p=1_q=1    -0.26598     0.15960 -0.665792     1
> packageVersion("data.table")
[1] ‘1.14.99’

Compare the output above (master) with the output below (CRAN):

> attr(fit, "lrt")
                     Q         Var         Z pNorm
1          -4.8605e-01  7.4780e+00 -0.177740     3
n           3.0000e+01  3.7994e+04  0.153909     4
sqrtN       2.2308e-01  4.9631e+02  0.010014     8
S1         -7.4277e-02  5.1292e+00 -0.032796     6
S2         -5.6210e-02  4.9689e+00 -0.025216     7
FH_p=0_q=0 -4.8605e-01  7.4780e+00 -0.177740     3
FH_p=0_q=1 -3.9858e-01  3.1233e-01 -0.713209     1
FH_p=1_q=0 -8.7462e-02  5.2728e+00 -0.038089     5
FH_p=1_q=1 -2.6598e-01  1.5960e-01 -0.665792     2
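
One detail worth noting in the two tables: the Var column is the same under both versions, while Q differs. The sketch below recomputes Z for row "1" from the printed Q and Var, using the relation Z = Q / sqrt(Var) that appears in the survMisc:::comp.ten listing posted later in this thread. The variable names are mine, and the inputs are the rounded values as printed, so the results should agree with the Z column only up to that rounding.

```r
# Values copied from row "1" of the two printed tables.
q_master <- 15.51395   # Q under data.table master
q_cran   <- -0.48605   # Q under data.table from CRAN
var_both <- 7.47802    # Var, identical in both tables

# Z = Q / sqrt(Var), as in the comp.ten listing.
z_master <- q_master / sqrt(var_both)
z_cran   <- q_cran / sqrt(var_both)

print(c(master = z_master, cran = z_cran))
```

Since Var is unchanged, this suggests the discrepancy enters through whatever feeds Q rather than through the variance computation.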

@tdhock
Member Author

tdhock commented Jan 4, 2024

survMisc::ten() returns a data table of class "ten", which appears to be the same under data.table master and the CRAN version. So I believe the different result must be computed somewhere in the function below, which is called with x set to one such data table:

> survMisc:::comp.ten
function (x, ..., p = 1, q = 1, scores = seq.int(attr(x, "ncg")), 
    reCalc = FALSE) 
{
    if (!reCalc & !is.null(attr(x, "lrt"))) {
        print(attr(x, "lrt"))
        print(if (!is.null(attr(x, "sup"))) 
            attr(x, "sup")
        else attr(x, "tft"))
        return(invisible())
    }
    stopifnot(attr(x, "ncg") >= 2)
    stopifnot(length(p) == length(q))
    fh1 <- length(p)
    if (!attr(x, "sorted") == "t") 
        data.table::setkey(x, t)
    t1 <- x[e > 0, t, by = t][, t]
    wt1 <- data.table::data.table(array(data = 1, dim = c(length(t1), 
        5L + fh1)))
    FHn <- paste("FH_p=", p, "_q=", q, sep = "")
    n1 <- c("1", "n", "sqrtN", "S1", "S2", FHn)
    data.table::setnames(wt1, n1)
    data.table::set(wt1, j = "n", value = x[e > 0, max(n), by = t][, 
        V1])
    data.table::set(wt1, j = "sqrtN", value = wt1[, sqrt(.SD), 
        .SDcols = "n"])
    data.table::set(wt1, j = "S1", value = cumprod(x[e > 0, 1 - 
        sum(e)/(max(n) + 1), by = t][, V1]))
    data.table::set(wt1, j = "S2", value = wt1[, S1] * x[e > 
        0, max(n)/(max(n) + 1), by = t][, V1])
    S3 <- sf(x = x[e > 0, sum(e), by = t][, V1], n = x[e > 0, 
        max(n), by = t][, V1], what = "S")
    S3 <- c(1, S3[seq.int(length(S3) - 1L)])
    wt1[, `:=`((FHn), mapply(function(p, q) S3^p * ((1 - S3)^q), 
        p, q, SIMPLIFY = FALSE))]
    n2 <- c("W", "Q", "Var", "Z", "pNorm", "chiSq", "df", "pChisq")
    res1 <- data.table::data.table(matrix(0, nrow = ncol(wt1), 
        ncol = length(n2)))
    data.table::setnames(res1, n2)
    data.table::set(res1, j = 1L, value = n1)
    predict(x)
    eMP1 <- attr(x, "pred")
    eMP1 <- eMP1[rowSums(eMP1) > 0, ]
    COV(x)
    cov1 <- attr(x, "COV")
    if (is.null(dim(cov1))) {
        cov1 <- cov1[names(cov1) %in% t1]
    }
    else {
        cov1 <- cov1[, , dimnames(cov1)[[3]] %in% t1]
    }
    ncg1 <- attr(x, "ncg")
    if (ncg1 == 2) {
        eMP1 <- unlist(eMP1[, .SD, .SDcols = (length(eMP1) - 
            1L)])
        data.table::set(res1, j = "Q", value = colSums(wt1 * 
            eMP1))
        data.table::set(res1, j = "Var", value = colSums(wt1^2 * 
            cov1))
        n3 <- c("W", "maxAbsZ", "Var", "Q", "pSupBr")
        res2 <- data.table::data.table(matrix(0, nrow = 5 + fh1, 
            ncol = length(n3)))
        data.table::setnames(res2, n3)
        data.table::set(res2, j = 1L, value = n1)
        data.table::set(res2, j = "maxAbsZ", value = sapply(abs(cumsum(eMP1 * 
            wt1)), max))
        data.table::set(res2, j = "Var", value = res1[, Var])
        res2[, `:=`("Q", maxAbsZ/sqrt(Var))]
        res2[, `:=`("pSupBr", sapply(Q, probSupBr))]
        data.table::setattr(res2, "class", c("sup", class(res2)))
    }
    if (ncg1 > 2) {
        df1 <- seq.int(ncg1 - 1L)
        eMP1 <- eMP1[, .SD, .SDcols = grep("eMP_", names(eMP1))]
        res3 <- data.table::data.table(array(0, dim = c(ncol(wt1), 
            4L)))
        data.table::setnames(res3, c("W", "chiSq", "df", "pChisq"))
        data.table::set(res3, j = 1L, value = n1)
        eMP1w <- apply(wt1, MARGIN = 2, FUN = function(wt) colSums(sweep(eMP1, 
            MARGIN = 1, STATS = wt, FUN = "*")))
        cov1w <- apply(wt1, MARGIN = 2, FUN = function(wt) rowSums(sweep(cov1, 
            MARGIN = 3, STATS = wt^2, FUN = "*"), dims = 2))
        dim(cov1w) <- c(ncg1, ncg1, ncol(cov1w))
        cov1ws <- cov1w[df1, df1, ]
        cov1ws <- apply(cov1ws, MARGIN = 3, FUN = solve)
        dim(cov1ws) <- c(max(df1), max(df1), length(n1))
        eMP1ss <- eMP1w[df1, ]
        data.table::set(res3, j = "chiSq", value = sapply(seq.int(length(n1)), 
            function(i) eMP1ss[, i] %*% cov1ws[, , i] %*% eMP1ss[, 
                i]))
        res3[, `:=`("df", max(df1))]
        res3[, `:=`("pChisq", 1 - stats::pchisq(chiSq, df))]
        data.table::setattr(res3, "class", c("lrt", class(res3)))
        sAC1 <- as.matrix(expand.grid(scores, scores))
        scoProd1 <- apply(sAC1, MARGIN = 1, FUN = prod)
        data.table::set(res1, j = "Q", value = colSums(eMP1w * 
            scores))
        data.table::set(res1, j = "Var", value = abs(apply(cov1w * 
            scoProd1, MARGIN = 3, sum)))
    }
    res1[, `:=`("Z", Q/sqrt(Var))]
    res1[, `:=`("pNorm", 2 * (1 - stats::pnorm(abs(Z))))]
    res1[, `:=`("chiSq", Q^2/Var)]
    res1[, `:=`("df", 1)]
    res1[, `:=`("pChisq", 1 - stats::pchisq(chiSq, df))]
    data.table::setattr(res1, "class", c("lrt", class(res1)))
    data.table::set(wt1, j = "t", value = t1)
    data.table::setattr(x, "lrw", wt1)
    if (ncg1 == 2) {
        data.table::setattr(x, "lrt", res1)
        data.table::setattr(x, "sup", res2)
    }
    else {
        data.table::setattr(x, "lrt", res3)
        res1 <- list(tft = res1, scores = scores)
        data.table::setattr(x, "tft", res1)
    }
    print(attr(x, "lrt"))
    print(if (!is.null(attr(x, "sup"))) 
        attr(x, "sup")
    else attr(x, "tft"))
    return(invisible())
}
<bytecode: 0xa92caa8>
<environment: namespace:survMisc>

@tdhock
Member Author

tdhock commented Jan 4, 2024

Here is an even smaller script, survMisc-bug.R, into which I copied some of the code from survMisc::comp() to isolate the specific code that produces the difference:

set.seed(1234)
y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
predict(fit)
eMP1 <- attr(fit, "pred")
eMP1 <- eMP1[rowSums(eMP1) > 0, ]
unlist(eMP1[, .SD, .SDcols = (length(eMP1) - 1L)])
packageVersion("data.table")

The line that produces the difference is the second-to-last one, which uses .SDcols.
Running the script under the two data.table versions gives the two different results shown below.

(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> predict(fit)
           P_1       P_2      eMP_1      eMP_2
  1: 0.4952381 0.5047619  0.4952381 -0.4952381
  2: 0.5000000 0.5000000 -0.5000000  0.5000000
  3: 0.0000000 0.0000000  0.0000000  0.0000000
  4: 0.0000000 0.0000000  0.0000000  0.0000000
  5: 0.5049505 0.4950495  0.5049505 -0.5049505
 ---                                          
101: 0.0000000 0.0000000  0.0000000  0.0000000
102: 0.0000000 0.0000000  0.0000000  0.0000000
103: 0.0000000 0.0000000  0.0000000  0.0000000
104: 0.0000000 0.0000000  0.0000000  0.0000000
105: 0.0000000 0.0000000  0.0000000  0.0000000
> eMP1 <- attr(fit, "pred")
> eMP1 <- eMP1[rowSums(eMP1) > 0, ]
> unlist(eMP1[, .SD, .SDcols = (length(eMP1) - 1L)])
    eMP_11     eMP_12     eMP_13     eMP_14     eMP_15     eMP_16     eMP_17 
 0.4952381 -0.5000000  0.5049505  0.5102041  0.5104167 -0.4946237  0.5054945 
    eMP_18     eMP_19    eMP_110    eMP_111    eMP_112    eMP_113    eMP_114 
 0.5111111 -0.4831461 -0.4827586 -0.4941176 -0.4939759 -0.4864865  0.5068493 
   eMP_115    eMP_116    eMP_117    eMP_118    eMP_119    eMP_120    eMP_121 
-0.4929577  0.4927536  0.5161290  0.5166667  0.5294118 -0.4583333 -0.4680851 
   eMP_122    eMP_123    eMP_124    eMP_125    eMP_126    eMP_127    eMP_128 
-0.4782609 -0.4772727  0.5000000 -0.4878049  0.5151515 -0.4666667  0.5172414 
   eMP_129    eMP_130 
-0.4642857 -0.3888889 
> packageVersion("data.table")
[1] ‘1.14.6’
> 
> 
(base) tdhock@tdhock-MacBook:~/R$ R CMD INSTALL ~/R/data.table
Loading required package: grDevices
* installing to library ‘/home/tdhock/lib/R/library’
* installing *source* package ‘data.table’ ...
** using staged installation
gcc 12.3.0
zlib 1.2.11 is available ok
R CMD SHLIB supports OpenMP without any extra hint
** libs
using C compiler: ‘gcc (GCC) 12.3.0’
gcc -shared -L/home/tdhock/lib/R/lib -L/home/tdhock/lib -Wl,-rpath=/home/tdhock/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o idatetime.o ijoin.o init.o inrange.o nafill.o negate.o nqrecreateindices.o openmp-utils.o programming.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -fopenmp -lz -L/home/tdhock/lib/R/lib -lR
PKG_CFLAGS = -fopenmp
PKG_LIBS = -fopenmp -lz
if [ "data.table.so" != "data_table.so" ]; then mv data.table.so data_table.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id data_table.so data_table.so; fi
installing to /home/tdhock/lib/R/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** installing vignettes
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)
(base) tdhock@tdhock-MacBook:~/R$ R --vanilla < survMisc-bug.R

R Under development (unstable) (2023-12-22 r85721) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> set.seed(1234)
> y <- simtrial::sim_pw_surv(n = 300) |> simtrial::cut_data_by_event(30)
> wt <- list(a1 = c(0, 0), a2 = c(0, 1), a3 = c(1, 0), a4 = c(1, 1))
> fit <- survMisc::ten(survival::Surv(y$tte, y$event) ~ y$treatment, data = y)
> predict(fit)
           P_1       P_2      eMP_1      eMP_2
         <num>     <num>      <num>      <num>
  1: 0.4952381 0.5047619  0.4952381 -0.4952381
  2: 0.5000000 0.5000000 -0.5000000  0.5000000
  3: 0.0000000 0.0000000  0.0000000  0.0000000
  4: 0.0000000 0.0000000  0.0000000  0.0000000
  5: 0.5049505 0.4950495  0.5049505 -0.5049505
 ---                                          
101: 0.0000000 0.0000000  0.0000000  0.0000000
102: 0.0000000 0.0000000  0.0000000  0.0000000
103: 0.0000000 0.0000000  0.0000000  0.0000000
104: 0.0000000 0.0000000  0.0000000  0.0000000
105: 0.0000000 0.0000000  0.0000000  0.0000000
> eMP1 <- attr(fit, "pred")
> eMP1 <- eMP1[rowSums(eMP1) > 0, ]
> unlist(eMP1[, .SD, .SDcols = (length(eMP1) - 1L)])
      P_11       P_12       P_13       P_14       P_15       P_16       P_17 
 0.4952381  0.5000000  0.5049505  0.5102041  0.5104167  0.5053763  0.5054945 
      P_18       P_19      P_110      P_111      P_112      P_113      P_114 
 0.5111111  0.5168539  0.5172414  0.5058824  0.5060241  0.5135135  0.5068493 
     P_115      P_116      P_117      P_118      P_119      P_120      P_121 
 0.5070423  0.4927536  0.5161290  0.5166667  0.5294118  0.5416667  0.5319149 
     P_122      P_123      P_124      P_125      P_126      P_127      P_128 
 0.5217391  0.5227273  0.5000000  0.5121951  0.5151515  0.5333333  0.5172414 
     P_129      P_130       P_21       P_22       P_23       P_24       P_25 
 0.5357143  0.6111111  0.5047619  0.5000000  0.4950495  0.4897959  0.4895833 
      P_26       P_27       P_28       P_29      P_210      P_211      P_212 
 0.4946237  0.4945055  0.4888889  0.4831461  0.4827586  0.4941176  0.4939759 
     P_213      P_214      P_215      P_216      P_217      P_218      P_219 
 0.4864865  0.4931507  0.4929577  0.5072464  0.4838710  0.4833333  0.4705882 
     P_220      P_221      P_222      P_223      P_224      P_225      P_226 
 0.4583333  0.4680851  0.4782609  0.4772727  0.5000000  0.4878049  0.4848485 
     P_227      P_228      P_229      P_230     eMP_11     eMP_12     eMP_13 
 0.4666667  0.4827586  0.4642857  0.3888889  0.4952381 -0.5000000  0.5049505 
    eMP_14     eMP_15     eMP_16     eMP_17     eMP_18     eMP_19    eMP_110 
 0.5102041  0.5104167 -0.4946237  0.5054945  0.5111111 -0.4831461 -0.4827586 
   eMP_111    eMP_112    eMP_113    eMP_114    eMP_115    eMP_116    eMP_117 
-0.4941176 -0.4939759 -0.4864865  0.5068493 -0.4929577  0.4927536  0.5161290 
   eMP_118    eMP_119    eMP_120    eMP_121    eMP_122    eMP_123    eMP_124 
 0.5166667  0.5294118 -0.4583333 -0.4680851 -0.4782609 -0.4772727  0.5000000 
   eMP_125    eMP_126    eMP_127    eMP_128    eMP_129    eMP_130 
-0.4878049  0.5151515 -0.4666667  0.5172414 -0.4642857 -0.3888889 
> packageVersion("data.table")
[1] ‘1.14.99’
> 
> 
(base) tdhock@tdhock-MacBook:~/R$ 

This looks like a real bug in master.
The line of code that gives an incorrect result on master is unlist(eMP1[, .SD, .SDcols = (length(eMP1) - 1L)]).
eMP1 is a data table with 4 columns, so the revdep author wants .SDcols=3, but for some reason master evaluates it as .SDcols=1:3.
A minimal reproducible example, which we should add as a test case, is below:

> x=data.table(a=1,b=2,c=3)
> x[, .SD, .SDcols=length(x)-1]
       a     b
   <num> <num>
1:     1     2
> x[, .SD, .SDcols=2]
       b
   <num>
1:     2

The two results above should be the same, but are not.
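
One way to turn the example above into a regression check is sketched below. It assumes only that data.table is installed; the names `idx`, `expected`, `by_index`, and `inline` are made up for illustration, and this is not necessarily the test that ends up in the package. Computing the index before the call keeps the arithmetic out of .SDcols, which should also serve as a stopgap on an affected build.

```r
library(data.table)

x <- data.table(a = 1, b = 2, c = 3)

# Reference result: select the second column with a literal index.
expected <- x[, .SD, .SDcols = 2L]

# Workaround sketch: do the arithmetic first, then pass a plain variable.
idx <- length(x) - 1L
by_index <- x[, .SD, .SDcols = idx]
stopifnot(identical(names(by_index), names(expected)))

# Regression check: the inline expression should select the same column.
# On the affected master build this is expected to print FALSE.
inline <- x[, .SD, .SDcols = length(x) - 1L]
print(identical(names(inline), names(expected)))
```

Comparing column names is enough here because the wrong result differs in which columns are returned, not in their values.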

@tdhock tdhock added this to the 1.15.0 milestone Jan 4, 2024
@MichaelChirico
Member

Great work on the MRE! I think I have a fix ready. Pretty obvious bug, surprised it's taken this long to realize!
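
The thread does not spell out the cause. One reading consistent with the output above, offered here as an assumption rather than as a description of the actual data.table code or fix, is that the binary minus in `length(x) - 1` gets handled like the unary minus used to exclude columns, so only its first operand is used and that column is dropped. The base R sketch below shows how the two forms differ as unevaluated calls; `is_unary_minus` and `misread` are hypothetical names used only for illustration.

```r
# Hypothetical helper: TRUE only for a call to `-` with a single operand.
is_unary_minus <- function(e) {
  is.call(e) && identical(e[[1L]], as.name("-")) && length(e) == 2L
}

unary  <- quote(-b)             # exclusion form: function name plus 1 operand
binary <- quote(length(x) - 1)  # arithmetic form: function name plus 2 operands

print(length(unary))   # 2
print(length(binary))  # 3
print(is_unary_minus(unary))
print(is_unary_minus(binary))

# Assumed failure mode: treat the binary call as "exclude its first operand".
x <- list(a = 1, b = 2, c = 3)
misread <- setdiff(seq_along(x), eval(binary[[2L]]))
print(misread)  # columns 1 and 2, matching the a, b output in the example
```

Under that assumption, a 4-column table would lose column 4 and keep 1:3, which also matches the simtrial case.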

@tdhock
Member Author

tdhock commented Jan 5, 2024

Confirming that this revdep is now fixed.
