

Data sets for all variants of projects #2

Open
2 of 19 tasks
nutterb opened this issue Dec 8, 2015 · 8 comments
Comments

@nutterb
Contributor

nutterb commented Dec 8, 2015

I'm getting ready to jump back into this world of programming (I've been distracted by a couple other major projects for a while), and was happy to see this repository. I'm hoping to put together a few data sets that cover the complexity of scenarios that REDCap can cover so that I can adequately test redcapAPI against them. If you don't mind being a second (or third, or fourth) set of eyes, I'd like to make a checklist of the types of data sets we need to cover all the options.

If I know my REDCap sufficiently, the main options are

  • cross-sectional vs. longitudinal
  • single center vs. multicenter
  • form vs. survey vs. hybrid
  • single arm vs. multi-arm

A 2x2x3x2 crossing should yield 24 data sets (yes, probably overambitious, but that's just who I am :) ). Did I miss anything?
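For what it's worth, the crossing above can be enumerated mechanically. This is a throwaway sketch (the factor labels are mine, not official REDCap terms); the filter at the end reflects the later edit that arms only apply to longitudinal projects:

```python
from itertools import product

# The four design factors from the checklist above.
timing = ["cross-sectional", "longitudinal"]
centers = ["single center", "multicenter"]
instrument = ["form", "survey", "hybrid"]
arms = ["single arm", "multi-arm"]

designs = list(product(timing, centers, instrument, arms))
print(len(designs))  # 2 * 2 * 3 * 2 = 24

# Arms only exist in longitudinal projects, so the realistic list
# drops the cross-sectional arm variants:
feasible = [d for d in designs
            if d[0] == "longitudinal" or d[3] == "single arm"]
print(len(feasible))  # 18 configurations remain (6 cross-sectional + 12 longitudinal)
```

That lands on the 6 + 12 = 18 numbered items in the checklist below.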

Cross Sectional Databases

  • 1. Cross-sectional, Single center, Form
  • 2. Cross-sectional, Multicenter, Form
  • 3. Cross-sectional, Single center, Survey
  • 4. Cross-sectional, Multicenter, Survey
  • 5. Cross-sectional, Single center, Hybrid
  • 6. Cross-sectional, Multicenter, Hybrid

Longitudinal Databases

  • 7. Longitudinal, Single center, Form, Single Arm
  • 8. Longitudinal, Multicenter, Form, Single Arm
  • 9. Longitudinal, Single center, Survey, Single Arm
  • 10. Longitudinal, Multicenter, Survey, Single Arm
  • 11. Longitudinal, Single center, Hybrid, Single Arm
  • 12. Longitudinal, Multicenter, Hybrid, Single Arm
  • 13. Longitudinal, Single center, Form, Multiarm
  • 14. Longitudinal, Multicenter, Form, Multiarm
  • 15. Longitudinal, Single center, Survey, Multiarm
  • 16. Longitudinal, Multicenter, Survey, Multiarm
  • 17. Longitudinal, Single center, Hybrid, Multiarm
  • 18. Longitudinal, Multicenter, Hybrid, Multiarm

Other use cases

  • 19. Cross-sectional, Single center, Form, Single Arm with no PHI

EDITS:

  • [2016-02-05] Removed the cross-sectional arm options...REDCap doesn't support arms in cross-sectional databases.
@wibeasley
Member

There might be additional characteristics to cover, even if they're not fully crossed (which would explode the number of combinations). Off the top of my head:

  • projects with images
  • projects with only PHI-free data
  • projects with only PHI data
  • users/tokens do/don't have permission to export PHI

@nutterb
Contributor Author

nutterb commented Dec 9, 2015

Responding to your bullets with bullets:

  • I'm not yet persuaded that having both a project with images and a project without them adds value. Images are a field type, and the REDCap tools ought to be able to interface correctly regardless of whether the field is present. I would think that having the field in the test data would be more beneficial for determining whether the tools can handle it correctly. (Admittedly, this may be centered on my own R experience.)
  • Regarding PHI, I suspect it would be more advantageous to have all test data sets use at least one identifier/PHI field, and perhaps one additional data set with none. I don't anticipate any reason that the ability to deal with PHI would change based on project complexity.
  • I'm not sure we can define projects and data sets that deal with the user permissions. Each developer will at some point have to fiddle with permissions in a project to test this. I think a discussion on this is better suited to testing guidelines than it is to data sets.

For now, I'll add a simple project with no PHI as a 25th data set, but am still willing to be persuaded to add more.

@nutterb
Contributor Author

nutterb commented Dec 9, 2015

Oh, I should also add that I think it is important that each project include every field validation type. That may make it impossible to have a coherent, meaningful data set, but it at least makes sure that every field type is read correctly.

Then again, now that I think about it, one project with each field type is probably adequate to demonstrate ability to handle the field types, and the rest of the projects would demonstrate the ability to handle the complex designs.
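As a concrete illustration of that "one project covers every validation type" idea, the rows of a REDCap data dictionary could be generated rather than typed by hand. A sketch, with two caveats: the validation-type list below is non-exhaustive (check your instance's validation table for the authoritative set), and only a subset of the standard data-dictionary columns is shown, so this fragment would need the remaining columns before import:

```python
import csv
import io

# Common REDCap text-field validation types (non-exhaustive).
VALIDATION_TYPES = ["date_ymd", "datetime_ymd", "email", "integer",
                    "number", "phone", "time", "zipcode"]

def validation_fields(form_name="field_types"):
    """Build one data-dictionary row per validation type, so a single
    test project exercises every text validation at least once."""
    rows = []
    for vtype in VALIDATION_TYPES:
        rows.append({
            "Variable / Field Name": f"{vtype}_test",
            "Form Name": form_name,
            "Field Type": "text",
            "Field Label": f"Test field ({vtype})",
            "Text Validation Type OR Show Slider Number": vtype,
        })
    return rows

# Serialize in the CSV shape REDCap's data-dictionary import expects.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(validation_fields()[0]))
writer.writeheader()
writer.writerows(validation_fields())
print(len(validation_fields()))  # one row per validation type
```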

@wibeasley
Member

For the sake of simplicity, I'm ok with not having both with- and without-images projects. But I do think it makes sense to cover the with & without PHI scenarios. Again, this doesn't need to be a fully crossed design (i.e., it wouldn't bump the cases from 24 to 48). As you suggested, @nutterb, just one or two projects should be sufficient.

If you look through the API code (e.g., the export-record code), there are a lot of places where switches get toggled in response to things like PHI, DAGs, and user permissions. For example, the $removeDateFields toggle.
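To make that kind of toggle concrete, here is a hypothetical mirror of a $removeDateFields-style switch, written as a client-side filter. This is illustrative only, not REDCap core's implementation; the function and flag names are mine, though the metadata key `text_validation_type_or_show_slider_number` matches what the metadata export actually returns:

```python
# Validation types that mark a field as a date (and thus potential PHI).
DATE_VALIDATIONS = {"date_ymd", "date_mdy", "date_dmy",
                    "datetime_ymd", "datetime_mdy", "datetime_dmy"}

def scrub_dates(records, metadata, remove_date_fields=True):
    """Drop date-validated fields from exported records when the
    caller lacks permission to see them (hypothetical toggle)."""
    if not remove_date_fields:
        return records
    date_fields = {m["field_name"] for m in metadata
                   if m.get("text_validation_type_or_show_slider_number")
                   in DATE_VALIDATIONS}
    return [{k: v for k, v in rec.items() if k not in date_fields}
            for rec in records]

records = [{"record_id": "1", "dob": "1980-01-01", "weight": "70"}]
metadata = [{"field_name": "dob",
             "text_validation_type_or_show_slider_number": "date_ymd"},
            {"field_name": "weight",
             "text_validation_type_or_show_slider_number": ""}]
print(scrub_dates(records, metadata))  # [{'record_id': '1', 'weight': '70'}]
```

The point for test data sets: at least one project needs date-validated fields present so this branch gets exercised at all.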

@nutterb
Contributor Author

nutterb commented Feb 26, 2016

I've been working on a multi-center test case, and I'm beginning to think that it isn't really feasible to do multi-center test cases unless it's going to be on a shared server somewhere. The reasons I'm becoming disillusioned with the multi-center case are:

  • DAGs are assigned a group number that increments within the REDCap instance. So there's no way to standardize the redcap_data_access_group output across multiple instances.
  • A user is not able to assign him or herself to a data access group. This is problematic because the REDCap user interface never displays the group number of the data access group. Thus, in order to create any records in the groups, you need at least two people in the project. (Unless there's some way to find the group numbers to allow for an API import...that might make it work within an instance, but it would still be problematic across instances.)

So if my assessment is correct, we can forgo the multi-center test cases and perhaps document some guidelines for multi-center testing using one of the remaining 10 test cases.
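For anyone writing those guidelines, the export side of DAGs is at least straightforward: the export-records API takes an `exportDataAccessGroups` parameter that appends the `redcap_data_access_group` column mentioned above. A minimal sketch of the request body (the token and server URL are placeholders; no request is actually sent here):

```python
from urllib.parse import urlencode

# Request body for the export-records API call, asking REDCap to
# append the redcap_data_access_group column to each record.
payload = {
    "token": "YOUR_API_TOKEN",      # placeholder
    "content": "record",
    "format": "json",
    "type": "flat",
    "exportDataAccessGroups": "true",
}
body = urlencode(payload)
# POST `body` to https://your.redcap.server/api/ with any HTTP client.
```

Note that the exported value is the group's unique name, not its internal group number, which is part of why the numbering can't be standardized across instances.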

@nutterb
Contributor Author

nutterb commented Jul 21, 2018

I just ran my first set of formal tests on redcapAPI, and after just a handful of them, it's becoming pretty apparent to me that the number of data sets we have assigned is definitely overkill. I think we could get by with numbers 01, 02, 03, 06, 07, 09, and 18. Beyond that, we would want one for PHI only, one for the Randomization Module, and one with repeating forms. That would make ten total data sets.

Thoughts?

@wibeasley
Member

Is the concern about overkill because the repo of tests is awkward to manage? The API libraries (e.g., redcapAPI, REDCapR, PHPCap) aren't obligated to support them all, just the ones they want to.

@nutterb
Contributor Author

nutterb commented Aug 5, 2018

I wasn't so concerned about the test suite being awkward as I was about the unnecessary effort involved in making the projects. When I ran exportRecords on the simplest case, it produced 86% coverage. I'm just realizing that there's no need to build such a complex suite to ensure that it all works as intended.

Essentially, I'm lazy.
