Add proper mappings management #2361
Declare the four SSSOM mapping sets that CL will be using to the ODK:
* the CL-local mapping set, which will be created from xrefs in the ontology;
* the FBbt mapping set, which will be fetched "as is" from FBbt;
* the ZFA mapping set, which will be extracted from ZFA xrefs;
* the CL "meta" mapping set, which will be made by combining all three sets above -- that's the set that will be published.
Add all the rules to generate the SSSOM files:
* generate cl-local.sssom.tsv by extracting the xrefs from the -edit file;
* generate fbbt.sssom.tsv by fetching the original FBbt mapping set and filtering it to keep only the CL mappings;
* generate zfa.sssom.tsv by fetching the ZFA ontology and extracting the CL xrefs;
* generate cl.sssom.tsv by combining all three sets into one.
The FBbt and ZFA rules are conditioned on IMP being set to true, since they fetch data from remote sources.
Replace the AWK script that was used to generate the mappings.owl component with an SSSOM/T-OWL ruleset similar to the one used in Uberon. Also, generate the component from the mappings from *all* the remote sets (FBbt + ZFA) instead of from the FBbt set only.
All mapping sets will from now on be in the src/mappings directory, which is the directory recommended by the ODK.
The cl-local mapping set only depends on the -edit file, so it can always be re-generated from a fresh copy of the repository without needing to download external resources -- therefore there's no need to commit it to the repository. Likewise for the "meta" set.
When extracting cross-references from CL itself (to generate the cl-local mapping set) and from ZFA (to generate the zfa mapping set), assign to each set an explicit mapping set ID.
Add the "meta" mapping set (src/mappings/cl.sssom.tsv) to the list of files to upload when performing a release.
Mapping sets that are obtained from a remote location are committed to the repository, so that CL can be built directly from a fresh copy of the repository without needing to access any remote resources.
$(MAPPINGDIR)/zfa.sssom.tsv:
	test -f $@

validate_mappings:
As an aside, it seems we should add this to the test: make goal.
Agreed; we can do that when we fix the other SSSOM-related issues in the ODK.
- id: cl-local
  maintainance: manual
- id: cl
  maintainance: manual
Seems the two mapping sets are not checked in - intended?
The FBbt and ZFA sets are committed to the repository (in src/mappings rather than src/ontology/mappings, to align CL with the recommendations of the ODK) since they are dependent on remote resources. This is the same logic as the one we use in Uberon.
The CL-local and CL sets are not committed since they can always be re-generated as needed from files that are present in any copy of the repository.
Awesome, THANKS!
The rule that extracts the cross-references from the local mirror of ZFA to build the ZFA-to-CL mapping set is using the SSSOM plugin, so we must make sure the plugin is ready to use (in `$(TMPDIR)/plugins`) before we reach that rule. The normal way to do that is to depend on the 'all_robot_plugins' target, which takes care of installing all plugins. This was missed when preparing #2361 because running the entire pipeline causes the plugins to be installed anyway before we reach the zfa.sssom.tsv rule. But this may not be the case when trying to build individual products.
This PR implements the CL side of the proposal outlined in #2358.
That is, it defines 4 SSSOM mapping sets (declared in the ODK, but as “manually maintained”, so all the code to produce them is in the custom Makefile):
fbbt.sssom.tsv
: This is basically a local mirror of the FBbt mapping set, except that it only contains mappings relevant to CL (FBbt-to-CL mappings; all FBbt-to-something-else mappings are removed).

zfa.sssom.tsv
: A mapping set obtained by extracting the CL-pointing cross-references in ZFA (ZFA is the source of truth for the CL-ZFA mappings).

cl-local.sssom.tsv
: A mapping set obtained by extracting cross-references from the CL edit file, for all foreign ontologies for which CL is the source of truth of mappings (all of them except FBbt and ZFA).

cl.sssom.tsv
: A “meta” set made by combining all three sets above. This is the set that will ultimately be published.

The FBbt and ZFA sets are committed to the repository (in src/mappings rather than src/ontology/mappings, to align CL with the recommendations of the ODK) since they are dependent on remote resources. This is the same logic as the one we use in Uberon.

The CL-local and CL sets are not committed since they can always be re-generated as needed from files that are present in any copy of the repository.
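
To make the relationship between the four sets concrete, here is a minimal Python sketch of the two data operations involved: filtering the FBbt set down to its CL mappings, and combining three sets into the “meta” set. It is an illustration only, not the code in this PR (the actual rules live in the custom Makefile and use the SSSOM tooling). The function names, the reduced three-column layout, and all term identifiers below are made up for the example.

```python
import csv

def tsv(*rows):
    """Build a tab-separated string from tuples (helper for the sample data)."""
    return "\n".join("\t".join(r) for r in rows)

def read_sssom(text):
    """Parse SSSOM/TSV text, skipping the '#'-prefixed metadata lines."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))

def keep_cl_mappings(rows):
    """Keep only the mappings whose object is a CL term."""
    return [r for r in rows if r["object_id"].startswith("CL:")]

def combine(*sets):
    """Concatenate mapping sets, dropping repeated (subject, predicate, object) triples."""
    seen, merged = set(), []
    for rows in sets:
        for r in rows:
            key = (r["subject_id"], r["predicate_id"], r["object_id"])
            if key not in seen:
                seen.add(key)
                merged.append(r)
    return merged

HEADER = ("subject_id", "predicate_id", "object_id")

# Hypothetical sample sets; every identifier here is a placeholder.
FBBT_TSV = "# sample metadata line\n" + tsv(
    HEADER,
    ("FBbt:00000001", "skos:exactMatch", "CL:0000001"),
    ("FBbt:00000002", "skos:exactMatch", "UBERON:0000001"),
)
ZFA_TSV = tsv(HEADER, ("ZFA:0000001", "skos:exactMatch", "CL:0000002"))
CL_LOCAL_TSV = tsv(HEADER, ("CL:0000003", "skos:exactMatch", "FMA:00001"))

fbbt = keep_cl_mappings(read_sssom(FBBT_TSV))   # FBbt-to-CL mappings only
cl_meta = combine(fbbt, read_sssom(ZFA_TSV), read_sssom(CL_LOCAL_TSV))
```

In this sketch the FBbt-to-UBERON row is dropped by the filter, so the combined set ends up with one mapping from each of the three source sets.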
The FBbt and ZFA sets are also used to populate the mappings.owl component, which contains the FBbt and ZFA mappings as old-style cross-references for backwards compatibility (previously only FBbt mappings were present in that component).
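
As a rough illustration of what “mappings as old-style cross-references” means, the following Python sketch turns foreign-to-CL mappings into (CL term, foreign term) pairs. The real component is produced by the SSSOM/T-OWL ruleset, not by code like this. The sketch assumes that each cross-reference is carried by the CL term and points to the foreign term; the function name and all identifiers are hypothetical.

```python
def mappings_to_xrefs(rows):
    """Return sorted, de-duplicated (CL term, foreign term) cross-reference pairs.

    Assumption: each row maps a foreign subject (FBbt or ZFA) to a CL object,
    and the resulting xref is attached to the CL term.
    """
    pairs = {
        (r["object_id"], r["subject_id"])
        for r in rows
        if r["object_id"].startswith("CL:")
    }
    return sorted(pairs)

# Hypothetical rows, as they might be read from the FBbt and ZFA sets.
remote_mappings = [
    {"subject_id": "FBbt:00000001", "predicate_id": "skos:exactMatch", "object_id": "CL:0000001"},
    {"subject_id": "ZFA:0000001", "predicate_id": "skos:exactMatch", "object_id": "CL:0000002"},
    {"subject_id": "ZFA:0000001", "predicate_id": "skos:exactMatch", "object_id": "CL:0000002"},
]

xrefs = mappings_to_xrefs(remote_mappings)
for cl_term, foreign_term in xrefs:
    print(f"{cl_term} xref: {foreign_term}")
```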