Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

canaper: Categorical Analysis of Neo- And Paleo-Endemism in R #475

Closed
9 of 11 tasks
joelnitta opened this issue Oct 26, 2021 · 51 comments
Closed
9 of 11 tasks

canaper: Categorical Analysis of Neo- And Paleo-Endemism in R #475

joelnitta opened this issue Oct 26, 2021 · 51 comments
Assignees
Labels

Comments

@joelnitta
Copy link

joelnitta commented Oct 26, 2021

Date accepted: 2022-09-14
Submitting Author: @joelnitta
Other Package Authors: (delete if none) Shawn Laffan (@shawnlaffan)
Repository: https:/joelnitta/canaper
Version submitted: 0.0.1
Submission type: Stats
Badge grade: silver
Editor: @tdhock
Reviewers: @KlausVigo, @luismurao

Due date for @KlausVigo: 2021-11-24

Due date for @luismurao: 2021-12-21
Archive: TBD
Version accepted: TBD


  • Paste the full DESCRIPTION file inside a code block below:
Package: canaper
Title: Categorical Analysis of Neo- And Paleo-Endemism
Version: 0.0.1
Authors@R: 
    c(
    person(given = "Joel H.",
           family = "Nitta",
           role = c("aut", "cre"),
           email = "[email protected]",
           comment = c(ORCID = "0000-0003-4719-7472")),
    person(given = "Shawn W.",
           family = "Laffan",
           role = c("ctb", "dtc")),
    person(given = "Brent D.",
           family = "Mishler",
           role = c("ctb", "dtc")),
    person(given = "Wataru",
           family = "Iwasaki",
           role = c("ctb"),
           comment = c(ORCID = "0000-0002-9169-9245"))
           )
Description: Provides functions to conduct categorical analysis of neo- and paleo-endemism (CANAPE).
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c ("collate", "namespace", "rd", "roxyglobals::global_roclet", "srr::srr_stats_roclet"))
RoxygenNote: 7.1.2
URL: https:/joelnitta/canaper
BugReports: https:/joelnitta/canaper/issues
Imports: 
    ape,
    assertr,
    assertthat,
    dplyr,
    future.apply,
    phyloregion,
    progressr,
    purrr,
    stats,
    tibble,
    vegan
Suggests: 
    rmarkdown,
    knitr,
    future,
    tictoc,
    patchwork,
    tidyverse,
    testthat (>= 3.0.0),
    roxyglobals (>= 0.2.1),
    stringr,
    magrittr,
    covr,
    picante
Config/testthat/edition: 3
Depends: 
    R (>= 3.5.0)
VignetteBuilder: knitr
Remotes: 
    anthonynorth/roxyglobals

Pre-submission Inquiry

  • A pre-submission inquiry has been approved in issue #469

General Information

  • Who is the target audience and what are scientific applications of this package?

    • Target audience is evolutionary biologists. Scientific application is the analysis of phylogenetic endemism.
  • Paste your responses to our General Standard G1.1 here, describing whether your software is:

    • The first implementation within R of an algorithm which has previously been implemented in other languages or contexts

    • Please include hyperlinked references to all other relevant software: https:/shawnlaffan/biodiverse

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?

    • (Not applicable)

Note also there is a package website available: https://joelnitta.github.io/canaper/

Badging

  • What grade of badge are you aiming for? (bronze, silver, gold)

    • Silver, but open to suggestions from the reviewers and editors.
  • If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold)

    • Demonstrates excellence in compliance with multiple standards from at least two broad sub-categories
    • Compliance with a good number of standards beyond those identified as minimally necessary. I am in compliance with most of the standards for "General" and "Dimensionality Reduction, Clustering, and Unsupervised Learning".

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

    • Also, I intend to submit a paper about this package to Methods in Ecology and Evolution after ROpenSci review.

Code of conduct

@ropensci-review-bot
Copy link
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Copy link
Collaborator

🚀

The following problem was found in your submission template:

  • 'author1' variable must be GitHub hanle only ('@myhandle')
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

@ropensci-review-bot
Copy link
Collaborator

Checks for canaper (v0.0.1)

git hash: 43a9f0a2

  • ✔️ Package name is available
  • ✔️ has a 'CITATION' file.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 100%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

  • Dimensionality Reduction, Clustering and Unsupervised Learning

✔️ All applicable standards [v0.1.0] have been documented in this package

Click here to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 11 files) and
  • 1 authors
  • 3 vignettes
  • 8 internal data files
  • 11 imported packages
  • 4 exported functions (median 41 lines of code)
  • 30 non-exported functions in R (median 7 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 11 59.3
files_vignettes 2 83.0
files_tests 8 85.7
loc_R 650 52.9
loc_vignettes 162 64.9
loc_tests 939 83.5
num_vignettes 3 93.1
data_size_total 66301 81.2
data_size_median 509 59.9
n_fns_r 34 35.6
n_fns_r_exported 4 15.6
n_fns_r_not_exported 30 42.6
n_fns_per_file_r 3 42.7
num_params_per_fn 4 67.6
loc_per_fn_r 10 40.0
loc_per_fn_r_exp 41 75.3
loc_per_fn_r_not_exp 7 29.9
rel_whitespace_R 13 43.9
rel_whitespace_vignettes 0 0.0 TRUE
rel_whitespace_tests 7 84.7
doclines_per_fn_exp 55 68.2
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 9 22.0

2a. Network visualisation

Interactive network visualisation of calls between objects in package can be viewed by clicking here


3. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

github

GitHub Workflow Results

name conclusion sha date
pkgdown success 43a9f0 2021-10-26
R-CMD-check success 43a9f0 2021-10-26
Render codemeta failure 43a9f0 2021-10-26
test-coverage success 43a9f0 2021-10-26

3b. goodpractice results

R CMD check with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 100

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
cpr_rand_test 29
calc_biodiv_random 16

Static code analyses with lintr

lintr found the following 353 potential issues:

message number of times
Lines should not be more than 80 characters. 353


Package Versions

package version
pkgstats 0.0.2.16
pkgcheck 0.0.2.86
srr 0.0.1.120


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@tdhock
Copy link

tdhock commented Oct 27, 2021

@ropensci-review-bot seeking reviewers

@ropensci-review-bot
Copy link
Collaborator

I'm sorry @tdhock, I'm afraid I can't do that. That's something only editors are allowed to do.

@noamross
Copy link
Contributor

@ropensci-review-bot assign @tdhock as editor

@ropensci-review-bot
Copy link
Collaborator

Assigned! @tdhock is now the editor

@joelnitta
Copy link
Author

I forgot to mention, I plan to submit this as a paper to Methods in Ecology and Evolution after ROpenSci review (have also added this to the original description)

@tdhock
Copy link

tdhock commented Oct 29, 2021

@ropensci-review-bot assign @KlausVigo as reviewer

@mpadge
Copy link
Member

mpadge commented Nov 2, 2021

@tdhock Only just realised the command there has the wrong syntax - should be add <@name> to reviewers, like this example. Could you please try again? Nevertheless good that you got this one wrong, because it made us realise we need the bot to tell you it didn't understand. Thanks!

@tdhock
Copy link

tdhock commented Nov 2, 2021

@ropensci-review-bot add @KlausVigo to reviewers
by the way it would be more user friendly to have the add editor and add reviewer commands be consistent / analogous, I guess that is what you are doing in ropensci-org/buffy#42 ?

@mpadge
Copy link
Member

mpadge commented Nov 2, 2021

sorry @tdhock, another bot glitch there - bot commands have to be kept as just that, with no additional comments. And yes, the point of the above-linked issue is to aid consistency between analogous commands, so that will hopefully improve soon. In the meantime, could you please try again, again, with just the command? With due apologies and gratitude! 🙇‍♂️

@tdhock
Copy link

tdhock commented Nov 2, 2021

@ropensci-review-bot add @KlausVigo to reviewers

@ropensci-review-bot
Copy link
Collaborator

Can't assign reviewer because there is no editor assigned for this submission yet

@maelle
Copy link
Member

maelle commented Nov 3, 2021

@tdhock sorry for all the hiccups! The initial issue body had been edited after your username was added so your username was no longer in the string <!--editor--> @tdhock<!--end-editor--> I edited it again so you can try again. Thanks for your patience!

@tdhock
Copy link

tdhock commented Nov 3, 2021

@ropensci-review-bot add @KlausVigo to reviewers

@ropensci-review-bot
Copy link
Collaborator

@KlausVigo added to the reviewers list. Review due date is 2021-11-24. Thanks @KlausVigo for accepting to review! Please refer to our reviewer guide.

@ropensci-review-bot
Copy link
Collaborator

@KlausVigo: If you haven't done so, please fill this form for us to update our reviewers records.

@tdhock
Copy link

tdhock commented Nov 30, 2021

@ropensci-review-bot add @luismurao to reviewers

@ropensci-review-bot
Copy link
Collaborator

@luismurao added to the reviewers list. Review due date is 2021-12-21. Thanks @luismurao for accepting to review! Please refer to our reviewer guide.

@ropensci-review-bot
Copy link
Collaborator

@luismurao: If you haven't done so, please fill this form for us to update our reviewers records.

@luismurao
Copy link

luismurao commented Dec 29, 2021

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing:

  • ~ 9

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.


Review Comments

  • The canaper package provides functions to infer the evolutionary processes underlying neoendemism and paleoendemism. The software is an R implementation of the Categorical Analysis of Neo- and Paleo-endemism, which is only available in the Biodiverse software.

  • I enjoyed reviewing the package, but I found some things that might improve the package.

    • One concern is about parallelization: in the current stage of the package, parallelization calls are done outside the functions (via future::plan(future::multisession)). I think it would be better to have parameters for parallelization in functions that allow running analysis in parallel (so calls are done inside them).

Parallelization calls as in the current version of the package

library(canaper) # This package :)
library(future) # For parallel computing
library(tictoc) # For timing things
library(patchwork) # For composing mult-part plots
library(tidyverse) # For data-wrangling and plottin

data(acacia)

plan(multisession,workers=ncores)
set.seed(071421)
acacia_rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy, 
  null_model = "curveball",
  n_reps = 100, n_iterations = 100000,
  tbl_out = TRUE)
plan(sequential)

I think this could be more appropriate and easy to use

acacia_rand_res <- cpr_rand_test(
  acacia$comm, acacia$phy, 
  null_model = "curveball",
  n_reps = 100, n_iterations = 100000,
  tbl_out = TRUE
  parallel = TRUE, # Logical var to run processes in parallel
  nworkes = 3, # Number of threads
  random_seed=071421) # Random seed
  • Instead of suggesting the future package I would import it

README

  • No code of conduct and contribution guidelines, please add it

Installation

There is a potential conflict in package dependencies as canaper depends on R (>= 3.5.0), but it imports phangorn version 2.8.1, which depends on R (>= 4.1.0). Please change parameter Depends: R (>= 3.5.0) to Depends: R (>= 4.1.0) in the description file or provide a way to install an earlier version of phangorn (version 2.5.5 should work).

  • I installed the package locally on macOS (Big Sur), Linux (Ubuntu 20.04.3 & 21.04), and Windows 10

Automated test

  • devtools::check() takes more than six minutes. Building of vignette outputs takes a lot of time (5m 5.2s). Consider using lees number of replicates in examples (i.e. n_reps of canaper::cpr_rand_test).
    • I got one error after upgrading from GEOS 3.9 to 3.10 in Linux. The issue is related to phyloregion dependency.
══ Warnings ════════════════════════════════════════════════════════════════════
── Warning (test-utils.R:143:3): Functions copied from phyloregion work ────────
rgeos: versions of GEOS runtime 3.10.1-CAPI-1.16.0
and GEOS at installation 3.9.1-CAPI-1.14.2differ
Backtrace:
	1. testthat::expect_equal(dense2sparse(biod_example$comm), phyloregion::dense2sparse(biod_example$comm)) test-utils.R:143:2
4. base::loadNamespace(x)
7. base::loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
8. base:::runHook(".onLoad", env, package.lib, package)
13. rgeos:::fun(libname, pkgname)
  • devtools::test() finished with no fails and one warning
────────────────────────────────────────────────────────────────────
Warning (test-utils.R:143:3): Functions copied from phyloregion work
rgeos: versions of GEOS runtime 3.10.1-CAPI-1.16.0
and GEOS at installation 3.9.1-CAPI-1.14.2differ
Backtrace:
  1. testthat::expect_equal(dense2sparse(biod_example$comm), phyloregion::dense2sparse(biod_example$comm)) test-utils.R:143:2
  4. base::loadNamespace(x)
  7. base::loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
  8. base:::runHook(".onLoad", env, package.lib, package)
 13. rgeos:::fun(libname, pkgname)
────────────────────────────────────────────────────────────────────

══ Results ═════════════════════════════════════════════════════════
Duration: 71.8 s

[ FAIL 0 | WARN 1 | SKIP 0 | PASS 255 ]
  • devtools::spell_check() Change "signficance" to "significance" in line 132
signficance       canape.Rmd:132

Performance

When benchmarking, I found a strange behavior. After using more than four threads, the program takes more time to finish a randomization test. Here is the code I used (could you please check it? I might be missing something)

library(purrr)
library(tidyverse)
library(tictoc)
library(future)
library(ggplot2)
rm(list = ls())
data(acacia)
acom <- acacia$comm
acom[acom>0] <- 1
n_reps_vec <- c(100,500,1000,1500,2000)
nwokers <- 1:7
acom <- acacia$comm
acom[acom>0] <- 1

rand_times <- seq_along(n_reps_vec) %>% purrr::map_dfr(function(nrep){
  rand_t <- nwokers %>% purrr::map_dfr(function(nwork){
      if(nwork>1){
        plan(multisession,workers=nwork)
      }
      set.seed(071421)
      tictoc::tic() 
      # Run randomization test
      acacia_rand_res <- canaper::cpr_rand_test(
        acom, acacia$phy,
        null_model = "curveball",
        n_reps = n_reps_vec[nrep],
        n_iterations = 100000,
        tbl_out = TRUE)
      aa <- tictoc::toc()
      r1 <-data.frame(n_reps = n_reps_vec[nrep] ,
                      nworkers=nwork,
                      time_secs = aa$toc - aa$tic)
      plan(sequential)
      return(r1)
    })
  return(rand_t)
})

ggplot(rand_times, aes(x = ncores,y=time_secs,color=as.factor(n_reps))) +
  geom_line() + theme_classic()

Questions and comments

  • I was wondering if canaper has already all methods of the CANAPE version of Biodiverse. Could write something about it in the README?
  • Although the vignettes are very useful and well documented, it would be nice to have methods for plotting spatial data results.

@joelnitta please do not hesitate to contact me if you have any questions. Thanks for developing canaper.

@joelnitta
Copy link
Author

@luismurao thanks so much for your review! I will try to address your comments soon.

@joelnitta
Copy link
Author

@KlausVigo thanks for making that change: I switched back to importing phyloregion, and it passes the tests for consistent results using the same seed in parallel.

I just have one request: would you consider bumping the phyloregion version so that I can require the version with this fix in Imports? That will prevent this bug popping up for others.

Hi @joelnitta,

I just made a commit to phyloregion cb819fb. This might solves the problem, at least sessionInfo() suggests that snow is not loaded via a namespace any more.

@KlausVigo
Copy link

@joelnitta nice that this works and it seems to simplify things.
I wasn't aware that ImportFrom(betapart, beta.pair) and betaprt::beta.pair made such a large difference with the NAMESPACE. I naively assumed it is about the same and the documentation about these issues does not tell you otherwise. @darunabas is the maintainer of phyloregion and has to submit it to CRAN. We probably need a week or so to clean up all changes and get the error messages sorted out as we haven't updated the package for a while.

@KlausVigo thanks for making that change: I switched back to importing phyloregion, and it passes the tests for consistent results using the same seed in parallel.

I just have one request: would you consider bumping the phyloregion version so that I can require the version with this fix in Imports? That will prevent this bug popping up for others.

@joelnitta
Copy link
Author

joelnitta commented Mar 9, 2022

@KlausVigo no problem: it is a very subtle error, and one that I would never have resolved without the help of @HenrikBengtsson. As he has mentioned this is likely a much wider problem across many packages on CRAN.

If you and @darunabas are able to release a new version of phyloregion in a reasonable amount of time on CRAN that would be great.

@shawnlaffan
Copy link

@KlausVigo - if you are updating phyloregion then it would be good to also address darunabas/phyloregion#3

@tdhock
Copy link

tdhock commented May 9, 2022

I am wondering if we can move forward on this submission? @KlausVigo @luismurao have the comments/suggestions in your reviews been addressed?

@joelnitta
Copy link
Author

@tdhock Sorry, I am the cause for the delay! Since the reviews came back I've gotten busy with other projects, and haven't been able to get back to this yet. I anticipate being occupied by other projects for another week or two. Please remind me again if you don't hear something soon after that.

@emilyriederer
Copy link

Hi @joelnitta and @tdhock ! I'm currently the rotating editor in chief for rOpenSci submissions. I wanted to check in and see if you expect to have time to revisit the review feedback soon? Otherwise, we may want to put this review on hold status

@joelnitta
Copy link
Author

joelnitta commented Jul 10, 2022

Thanks for checking in @emilyriederer. Sorry for falling behind on this. I am still occupied with other things at the moment, but I should be able to respond to the reviews by the end of August if not sooner.

@joelnitta
Copy link
Author

joelnitta commented Aug 18, 2022

Response to reviews

I am sorry for taking so long to respond to the reviewers. Thanks again to both reviewers for their helpful comments. Below I respond to each comment as appropriate.

Response to @luismurao

  • Concern about parallelization. I think it would be better to have parameters for parallelization in functions that allow running analysis in parallel (so calls are done inside them).

  • Response: The second reviewer, @KlausVigo disagreed with this point and was in favor of leaving the current approach to parallelization, so I am making no changes.

  • Instead of suggesting the future package I would import it

  • Response: I disagree with this. Parallelization is not required, it is an option to speed up analyses if desired. So I think it makes sense as Suggests instead of Imports. No changes made.

README

  • No code of conduct and contribution guidelines, please add it
  • Response: There was apparently a misunderstanding; contribution guidelines are already provided at /.github/CONTRIBUTING.md. To make it easier to find though, I have added a link to this document from the README. I also added a link to the ROpenSci code of conduct using the suggested wording in the developer guide (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).

Installation

  • Please change parameter Depends: R (>= 3.5.0) to Depends: R (>= 4.1.0) in the description file or provide a way to install an earlier version of phangorn (version 2.5.5 should work).
  • Response: I changed to Depends: R (>= 4.1.0) (0f334965fb5a3fe5034eb1bc682b5051548eddb1).

Automated test

  • devtools::check() takes more than six minutes. Building of vignette outputs takes a lot of time (5m 5.2s). Consider using lees number of replicates in examples (i.e. n_reps of canaper::cpr_rand_test).

  • Response: I had to include a non-trivial number replicates to produce reasonable results (e.g., in comparison with Mishler et al. 2014 for the CANAPE example vignette) and to demonstrate performance gains from parallelization. But I was able to decrease the number of replicates in the vignettes (f3af4671d679020aeaa97b1a038b3b5a96633f42). The vignettes now take 3m 40s on my computer to compile.

  • I got one error after upgrading from GEOS 3.9 to 3.10 in Linux. The issue is related to phyloregion dependency.

  • Response: Yes: phylogregion imports rgeos, which is going to be retired soon and needs fixing. Correct installation of external dependencies to phyloregion is outside the scope of canaper. No change made (for now).

  • devtools::test() finished with no fails and one warning

  • Response: On my computer, devtools::test() now finishes with no fails or warnings (at 93543d399fd34dd6546e56368986a8afa133f190).

  • devtools::spell_check() Change "signficance" to "significance" in line 132

  • Response: Fixed (b04e2e0a8397568bcc615e4aaeb5a0d9b02ad85e)

Performance

  • When benchmarking, I found a strange behavior. After using more than four threads, the program takes more time to finish a randomization test.
  • Response: Speed is not expected to increase linearly with the number of cores. It is well documented in other software that increasing cores above a certain point causes code to run more slowly due to the resources required to coordinate threads. It is up to the user to identify the optimal number of threads to use. I provide a vignette article discussing precisely this topic: https://joelnitta.github.io/canaper/articles/parallel.html.

Questions and comments

  • I was wondering if canaper has already all methods of the CANAPE version of Biodiverse. Could write something about it in the README?

  • Response: I have added a section in the README comparing canaper other software (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).

  • Although the vignettes are very useful and well documented, it would be nice to have methods for plotting spatial data results.

  • Response: Thanks for the suggestion. I have already included color palettes for plotting results that should be accessible regardless of color vision type. IMHO there are too many customizations that users would want to make to try and implement a one-size-fits-all plotting method. Rather, the results are structured in a way that they should be easily plotted with e.g. ggplot2, and I demonstrate this in the vignettes.

Response to @KlausVigo

  • There are some functions in the picante (ses.pd) and in the PhyloMeasures package for Faith's phylogenetic diversity with a similar scope.

  • Response: There are many R packages that calculate community diversity metrics, and many of them overlap in common metrics like Faith's PD. In the case of canaper, I need to calculate Faith's PD as part of the CANAPE calculation. I use the phyloregion::PD() to do this as it is the fastest AFAIK. No changes needed.

  • The canaper package builds on functions from the vegan package to resample (dense) community matrices and borrows biodiversity measures (Faith's phylogenetic diversity and phylogenetic endemism) from the phyloregion package. The picante package also has some functions for swapping (community) matrices similar to vegan. A comparison between the packages would be useful.

  • Response: I have added a section to the README comparing canaper with other software (d8aaf8de5193166d52ab890d8267090ea2f8ef6a).

Specific comments

  • The functions from the phyloregion package used in the package should be acknowledged properly... An alternative is importing the functions from the package directly.

  • Response: Please see response above. Functions from the phyloregion package are now imported, not copied (7e8b24950912bfa2a5e10cf4dd9cfa8910d74e76). Please note that phyloregion is still at v1.0.7 as of writing (same version as before @KlausVigo implemented the change that fixed the loading of snow), so I can't specify the version of phyloregion that would avoid canaper issue #2.

  • In contrast to the other reviewer I really like the use of the future package.

  • Response: I agree, and have left parallelization as it is.

Profiling

  • Providing a swapping algorithm which can handle sparse matrices would be a huge improvement for the community.

  • Response: I agree implementing sparse matrix encoding for generating random matrices would be a huge benefit and will consider it as a feature to be added possibly in the future (any help would be appreciated!).

  • The functions calc_biodiv_random and cpr_rand_comm are only called internally and are called thousands of times inside of the loop. It should be save to remove all (most) of the calls to assertthat::assert_that from these functions as parameters have been checked already and are save to use!

  • Response: Actually, cpr_rand_comm() is exported, but point well-taken. I have deleted the redundant checks (bccd0278dd3de5853680d9979ca26c1aa38a7fb3).

For submitting to CRAN

General comments

  • Both reviewers left Performance: Any performance claims of the software been confirmed unchecked. I am not sure why this is the case, as I include tests of increase in performance using parallel computing and demonstrate this feature in a vignette.

  • I have added @KlausVigo as a reviewer, but not @luismurao, according to their responses to Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (6e9b3d55202e2294a6436280ca6b01db13d92974).

@joelnitta
Copy link
Author

Update to response to reviewers

The longest running test is "Parallelization decreases calculation time". To make this test robust, I have to include a relatively high number of replicates, otherwise by random chance sometimes running in parallel won't be faster than running sequentially.

I have decided to follow 6.1.5.4 Extended tests of rOpenSci Statistical Software Peer Review and treat this as an extended test that can be optionally skipped (it is now skipped by default).

I have documented this in a comment as follows:

  # This test is skipped by default because of its long run time.
  # To include the test locally, first run
  # withr::local_envvar(CANAPER_EXTENDED_TESTS = "true") # nolint
  # To include the test in CI (github actions), include the phrase
  # 'run-extended' in the commit

devtools::test() on my computer now finishes in 36 s skipping the extended test, and 181 s when including it.

@tdhock, @emilyriederer sorry again for the long delay, but this should be ready to move forward. Please let me know if there is anything you need from my end.

@emilyriederer
Copy link

Also tagging @KlausVigo @luismurao . I know this review was a bit delayed and your availability may have changed. At your convenience, if you could please review @joelnitta 's responses and share whether these address your feedback (template available here) , that would be fantastic.

@joelnitta
Copy link
Author

@emilyriederer If the reviewers have no further comments, can we proceed with this at your discretion as an editor?

@emilyriederer
Copy link

Hi @joelnitta - let me follow up with @tdhock since he's the managing editor on this one

@tdhock
Copy link

tdhock commented Sep 14, 2022

hi! Since the reviewers have been tagged two weeks ago and gave no feedback, I think it is OK to move ahead with this submission (the reviewer response is very complete/convincing).

@tdhock
Copy link

tdhock commented Sep 14, 2022

@ropensci-review-bot approve canaper

@ropensci-review-bot
Copy link
Collaborator

Approved! Thanks @joelnitta for submitting and @KlausVigo, @luismurao for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You will need to enable two-factor authentication for your GitHub account.
    This invitation will expire after one week. If it happens write a comment @ropensci-review-bot invite me to ropensci/<package-name> which will re-send an invitation.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https:/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also feature CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

@joelnitta
Copy link
Author

Thanks @tdhock!

@joelnitta
Copy link
Author

@ropensci-review-bot finalize transfer of canaper

@ropensci-review-bot
Copy link
Collaborator

Transfer completed.
The canaper team is now owner of the repository and the author has been invited to the team

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests