Handling competing risks in rfsrc/proba #230

Open
funnell opened this issue Nov 22, 2021 · 7 comments
@funnell

funnell commented Nov 22, 2021

Expected Behaviour

Benchmarking to complete for a competing risks model

Actual Behaviour

Error in dimnames(x) <- dn: length of 'dimnames' [2] not equal to array extent
This error doesn't occur if I make sure the status variable contains only 0 or 1, and it also doesn't occur if I just run learner$train(follic_task).

Reprex

library(mlr3verse)
#> Loading required package: mlr3
library(mlr3extralearners)
#> 
#> Attaching package: 'mlr3extralearners'
#> The following objects are masked from 'package:mlr3':
#> 
#>     lrn, lrns
library(randomForestSRC)
#> 
#>  randomForestSRC 2.14.0 
#>  
#>  Type rfsrc.news() to see new features, changes, and bug fixes. 
#> 
#> 
#> Attaching package: 'randomForestSRC'
#> The following object is masked from 'package:mlr3verse':
#> 
#>     tune
data(follic, package = "randomForestSRC")
follic_task <- as_task_surv(
  follic, event = "status", time = "time", type = "right"
)
learner = lrn("surv.rfsrc")
benchmark(
  benchmark_grid(
    tasks=list(follic_task), learners=list(learner), resamplings=rsmp("cv", folds=3)
  )
)
#> INFO  [13:58:06.277] [mlr3] Running benchmark with 3 resampling iterations 
#> INFO  [13:58:06.354] [mlr3] Applying learner 'surv.rfsrc' on task 'follic' (iter 1/3)
#> Error in dimnames(x) <- dn: length of 'dimnames' [2] not equal to array extent

Created on 2021-11-22 by the reprex package (v2.0.1)
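For reference, a minimal sketch of the single-event recoding mentioned above, assuming the usual coding of the follic data (0 = censored, 1 = relapse, 2 = death): treating death as censoring so that only relapse counts as the event. This is a workaround rather than a fix, since it answers a different, cause-specific question.

# Sketch: censor the competing event (death) so only relapse is an event.
library(mlr3verse)
library(mlr3extralearners)

data(follic, package = "randomForestSRC")
follic_single <- follic
follic_single$status <- as.integer(follic_single$status == 1)

follic_task <- as_task_surv(
  follic_single, event = "status", time = "time", type = "right"
)
learner <- lrn("surv.rfsrc")
benchmark(
  benchmark_grid(
    tasks = list(follic_task),
    learners = list(learner),
    resamplings = rsmp("cv", folds = 3)
  )
)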

Session info
sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-apple-darwin13.4.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#> 
#> Matrix products: default
#> BLAS/LAPACK: /Users/funnellt/miniconda3/envs/mbml/lib/libopenblasp-r0.3.18.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] randomForestSRC_2.14.0   mlr3extralearners_0.5.15 mlr3verse_0.2.2         
#> [4] mlr3_0.13.0             
#> 
#> loaded via a namespace (and not attached):
#>  [1] fs_1.5.0             RColorBrewer_1.1-2   bbotk_0.4.0         
#>  [4] data.tree_1.0.0      mlr3proba_0.4.2      mlr3pipelines_0.4.0 
#>  [7] mlr3learners_0.5.0   tools_4.1.1          backports_1.3.0     
#> [10] utf8_1.2.2           R6_2.5.1             DBI_1.1.1           
#> [13] colorspace_2.0-2     mlr3data_0.5.0       withr_2.4.2         
#> [16] mlr3viz_0.5.7        mlr3misc_0.9.5       tidyselect_1.1.1    
#> [19] compiler_4.1.1       cli_3.1.0            ooplah_0.1.0        
#> [22] lgr_0.4.3            scales_1.1.1         checkmate_2.0.0     
#> [25] palmerpenguins_0.1.0 mlr3tuning_0.9.0     stringr_1.4.0       
#> [28] digest_0.6.28        rmarkdown_2.11       param6_0.2.3        
#> [31] paradox_0.7.1        set6_0.2.3           pkgconfig_2.0.3     
#> [34] htmltools_0.5.2      parallelly_1.28.1    fastmap_1.1.0       
#> [37] highr_0.9            htmlwidgets_1.5.4    rlang_0.4.12        
#> [40] visNetwork_2.1.0     generics_0.1.1       jsonlite_1.7.2      
#> [43] dplyr_1.0.7          magrittr_2.0.1       Matrix_1.3-4        
#> [46] Rcpp_1.0.7           mlr3fselect_0.6.0    munsell_0.5.0       
#> [49] fansi_0.5.0          lifecycle_1.0.1      stringi_1.7.5       
#> [52] yaml_2.2.1           grid_4.1.1           parallel_4.1.1      
#> [55] dictionar6_0.1.3     listenv_0.8.0        crayon_1.4.2        
#> [58] lattice_0.20-45      splines_4.1.1        mlr3cluster_0.1.2   
#> [61] knitr_1.35           pillar_1.6.4         mlr3filters_0.4.2   
#> [64] uuid_1.0-3           future.apply_1.8.1   codetools_0.2-18    
#> [67] reprex_2.0.1         glue_1.5.0           evaluate_0.14       
#> [70] data.table_1.14.2    vctrs_0.3.8          distr6_1.6.2        
#> [73] gtable_0.3.0         purrr_0.3.4          clue_0.3-60         
#> [76] future_1.23.0        assertthat_0.2.1     ggplot2_3.3.5       
#> [79] xfun_0.27            pracma_2.3.3         survival_3.2-13     
#> [82] tibble_3.1.6         cluster_2.1.2        DiagrammeR_1.0.6.1  
#> [85] globals_0.14.0       ellipsis_0.3.2       clusterCrit_1.2.8
@RaphaelS1
Collaborator

This isn't a bug: mlr3proba doesn't currently support competing risks. When it does, we will add this as a property for learners that handle it. For now, the above behaviour is expected.
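A rough illustration of what that could look like on the user side, assuming some future property name ("cr" below is purely a placeholder, nothing like it exists yet):

# Illustrative only: "cr" is a made-up property name, not current mlr3proba API.
library(mlr3extralearners)

learner <- lrn("surv.rfsrc")
learner$properties            # properties the learner currently advertises
"cr" %in% learner$properties  # would only be TRUE once competing risks are supported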

@RaphaelS1
Collaborator

If you can demonstrate the same problem with a non-competing-risks task, however, then it may be a bug!

@funnell
Author

funnell commented Nov 23, 2021

@RaphaelS1 would this issue be better placed in another repo, or closed?
Also, are there immediate plans to support competing-risks tasks? And if so, would that effort be something an mlr3 novice could easily contribute to?

@RaphaelS1
Collaborator

@sebffischer can you transfer to mlr3proba?

@funnell whilst I usually love it when someone volunteers to contribute anything, this unfortunately first requires an internal design decision about what the implementation should look like. After that, the coding itself is relatively straightforward.

Any preliminary thoughts on this @adibender ?

@adibender
Collaborator

The Surv object does allow factor variables for the event in order to indicate competing-risks/multi-state outcomes, so specifying the task should be possible. How this is passed to the individual algorithms will be very heterogeneous, however, and not all algorithms will have customised methods the way rfsrc does. For those that don't, we would have to split the task into K tasks internally (one for each competing outcome), fit the algorithm to each of them, and then aggregate the results afterwards and during evaluation. I'm not sure mlr3 was designed for this, but maybe it could be done through pipelines? Or how is multi-task classification handled, for example?
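Roughly, that internal split could look like the sketch below; the helper name is made up and nothing here is existing mlr3proba API:

# Hypothetical helper: build one single-event ("cause-specific") task per
# competing event, censoring all other events.
library(mlr3proba)

make_cause_specific_tasks <- function(data, time, event, causes) {
  tasks <- lapply(causes, function(k) {
    d <- data
    d[[event]] <- as.integer(d[[event]] == k)  # cause k = event, everything else = censored
    as_task_surv(d, time = time, event = event, type = "right")
  })
  names(tasks) <- paste0("cause_", causes)
  tasks
}

data(follic, package = "randomForestSRC")
cs_tasks <- make_cause_specific_tasks(follic, time = "time", event = "status",
                                      causes = c(1, 2))
# Each task can then be fitted with any single-event survival learner; combining
# the cause-specific predictions into cumulative incidence functions (and scoring
# them) is the part that still needs a design decision.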

@sebffischer transferred this issue from mlr-org/mlr3extralearners Nov 24, 2021
@RaphaelS1
Collaborator

We'll find time for a design meeting to discuss this properly; it's not an easy answer... Pipelines seem wrong because it's too specialised.

@RaphaelS1 changed the title from "[BUGLRN] Bugs in learner surv.rfsrc when benchmarking" to "Handling competing risks in rfsrc/proba" Dec 3, 2021
@bblodfon
Collaborator

Discussed with Andreas; we should move this forward at some point.
