-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broaden set of theses returned by not_consented_to_proquest scope #1109
Conversation
Why these changes are being introduced: This is a similar issue to the one solved in #1108, but in the not_consented_to_proquest scope. Here, the scope excludes all theses with an author ProQuest consent value of 'true'. This works in most cases, except when one author has opted in and another hasn't, that thesis should still be partially harvested. With the current logic, it will not show up in the export preview because the scope sees a 'true' consent value and thus excludes the thesis. Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/ETD-603 How this addresses that need: This inverts the logic of the not_consented_to_proquest scope. Instead of excluding theses with at least one ProQuest consent value of true, it includes theses with at least one ProQuest consent value of false or nil, using the 'or' method to return the union. Side effects of this change: Since 'includes' is not compatible with the 'or' method, the scope now uses 'left_joins' to load the author records. This also means that the the result will be an ActiveRecord assocation rather than an array. ActiveJob does not accept associations as params, so we are now casting the params as arrays. I've done with this with both the full and partial export sets to avoid confusion, even though it's only needed at this stage for the partial export set.
ff9ed87
to
85ffa17
Compare
This bug should have been caught by our test suite, but there's a typo in the assertion that would've caught it, such that it instead looks for the opposite condition: https:/MITLibraries/thing/pull/1109/files#diff-d75b7faadae90a81e1e198252235e1650115d699a0291df2cb913164f25a797aL1325-L1328 This is now fixed, and I think all other edge cases are tested, but I'd like the reviewer to check my work on that as I should've caught this earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the logic is fine. We may benefit from reusing theses state less in assertions, but that is not blocking just something to ponder.
thesis = theses(:two) | ||
assert_not_includes Thesis.not_consented_to_proquest, thesis | ||
# multi-author thesis with one opt-in and one null is included | ||
assert_equal 2, thesis.authors.count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like we are inheriting the thesis from the pervious set of assertions and not loading one intentionally. The test is likely valid as-is, but if someone inserted a different set of assertions between the previous and this one that loaded or changed state of a thesis/authors we'd be confused here. I'd suggest we load a fixture for a thesis for each case we are testing even if we end up reloading the same fixture multiple times. The prevent this potential leakage, it may (or may not!) be useful to use more explicit variable names than thesis
. It's really easy to let data slip between assertions when we reuse the variable name (which I think we do quite frequently and is not limited to this work but I'm only noticing it now as a potential for introducing bugs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JPrevost I totally hear you. In this case, the impetus was expediency. If I were to add a couple fixtures (or even change existing ones), I would have to update several tests in the report model that expect specific numbers of entries in a table. I fixed a few of those tests the last time around out of frustration, but there are still a handful of them, and my goal was to land this as quickly as possible.
Your point is well taken, though, and I opened this ticket to fix it up: https://mitlibraries.atlassian.net/browse/ETD-621
I also opened this ticket to refactor the bad tests in the report model: https://mitlibraries.atlassian.net/browse/ETD-620
Why these changes are being introduced:
This is a similar issue to the one solved in #1108, but in the not_consented_to_proquest scope. Here, the scope excludes all theses with an author ProQuest consent value of 'true'. This works in most cases, except when one author has opted in and another hasn't, that thesis should still be partially harvested. With the current logic, it will not show up in the export preview because the scope sees a 'true' consent value and thus excludes the thesis.
Relevant ticket(s):
How this addresses that need:
This inverts the logic of the not_consented_to_proquest scope. Instead of excluding theses with at least one ProQuest consent value of true, it includes theses with at least one ProQuest consent value of false or nil, using the 'or' method to return the union.
Side effects of this change:
Since 'includes' is not compatible with the 'or' method, the scope now uses 'left_joins' to load the author records. This also means that the the result will be an ActiveRecord assocation rather than an array. ActiveJob does not accept associations as params, so we are now casting the params as arrays. I've done with this with both the full and partial export sets to avoid confusion, even though it's only needed at this stage for the partial export set.
Developer
our guide and
all issues introduced by these changes have been resolved or opened as new
issues (link to those issues in the Pull Request details above)
Code Reviewer
(not just this pull request message)
Requires database migrations?
NO
Includes new or updated dependencies?
NO