Add proper mappings management #2361

Merged
merged 9 commits into from
Apr 27, 2024

gouttegd
Collaborator

This PR implements the CL side of the proposal outlined in #2358.

That is, it defines 4 SSSOM mapping sets (declared in the ODK, but as “manually maintained”, so all the code to produce them is in the custom Makefile):

  • fbbt.sssom.tsv: This is basically a local mirror of the FBbt mapping set, except that it only contains mappings relevant to CL (FBbt-to-CL mappings; all FBbt-to-something-else mappings are removed).
  • zfa.sssom.tsv: A mapping set obtained by extracting the CL-pointing cross-references in ZFA (ZFA is the source of truth for the CL-ZFA mappings).
  • cl-local.sssom.tsv: A mapping set obtained by extracting cross-references from the CL edit file, for all foreign ontologies for which CL is the source of truth of mappings (all of them except FBbt and ZFA).
  • cl.sssom.tsv: A “meta” set made by combining all three sets above. This is the set that will ultimately be published.
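As an illustration of the filtering behind fbbt.sssom.tsv, here is a minimal Python sketch. It is not the rule used in this PR (that lives in the custom Makefile). It assumes a simplified SSSOM/TSV layout with `subject_id`, `predicate_id` and `object_id` columns, and that CL terms appear as CURIEs starting with `CL:`. The function name, the sample identifiers and the example.org URL are all hypothetical.

```python
import csv

def keep_cl_mappings(sssom_tsv):
    """Return only the mapping rows whose object is a CL term.

    Lines starting with '#' (the SSSOM metadata block) and empty
    lines are skipped before the table is parsed.
    """
    data_lines = [
        line for line in sssom_tsv.splitlines()
        if line and not line.startswith("#")
    ]
    reader = csv.DictReader(data_lines, delimiter="\t")
    return [row for row in reader if row["object_id"].startswith("CL:")]

# Hypothetical sample: one FBbt-to-CL mapping and one FBbt-to-UBERON mapping.
SAMPLE = "\n".join([
    "#mapping_set_id: https://example.org/fbbt.sssom.tsv",
    "subject_id\tpredicate_id\tobject_id",
    "FBbt:00000001\tskos:exactMatch\tCL:0000001",
    "FBbt:00000002\tskos:exactMatch\tUBERON:0000002",
])

cl_only = keep_cl_mappings(SAMPLE)
```

With the sample above, only the row pointing at a `CL:` term is kept; the FBbt-to-UBERON row is dropped.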

The FBbt and ZFA sets are committed to the repository (in src/mappings rather than src/ontology/mappings, to align CL with the recommendations of the ODK) since they are dependent on remote resources. This is the same logic as the one we use in Uberon.

The CL-local and CL sets are not committed since they can always be re-generated as needed from files that are present in any copy of the repository.

The FBbt and ZFA sets are also used to populate the mappings.owl component, which contains the FBbt and ZFA mappings as old-style cross-references for backwards compatibility (previously only FBbt mappings were present in that component).
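To make the idea of "mappings as old-style cross-references" concrete, here is a rough Python sketch. It is not the SSSOM/T-OWL ruleset used in this PR. It assumes that each mapping links one CL term to one foreign term and that the cross-reference is attached to the CL term; the function name and sample identifiers are hypothetical.

```python
from collections import defaultdict

def mappings_to_xrefs(rows):
    """Group foreign term IDs by the CL term they are mapped to.

    Each row is a dict with 'subject_id' and 'object_id' keys.
    The CL side may be either the subject or the object; rows that
    do not involve a CL term at all are ignored.
    """
    xrefs = defaultdict(set)
    for row in rows:
        subject, obj = row["subject_id"], row["object_id"]
        if obj.startswith("CL:"):
            xrefs[obj].add(subject)
        elif subject.startswith("CL:"):
            xrefs[subject].add(obj)
    return {cl_id: sorted(foreign) for cl_id, foreign in xrefs.items()}

# Hypothetical rows, as they might come from the FBbt and ZFA sets.
remote_rows = [
    {"subject_id": "FBbt:00000001", "object_id": "CL:0000001"},
    {"subject_id": "ZFA:0000001", "object_id": "CL:0000001"},
    {"subject_id": "ZFA:0000002", "object_id": "CL:0000002"},
]

xrefs_by_cl_term = mappings_to_xrefs(remote_rows)
```

Each key of the result is a CL term and each value is the list of cross-references that a component like mappings.owl would carry for that term.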

Declare the four SSSOM mapping sets that CL will be using to the ODK:

* the CL-local mapping set, which will be created from xrefs in the
  ontology;
* the FBbt mapping set, which will be fetched "as is" from FBbt;
* the ZFA mapping set, which will be extracted from ZFA xrefs;
* the CL "meta" mapping set, which will be made by combining all three
  sets above -- that's the set that will be published.

Add all the rules to generate the SSSOM files:

* generate cl-local.sssom.tsv by extracting the xrefs from the -edit
  file;
* generate fbbt.sssom.tsv by fetching the original FBbt mapping set and
  filtering it to keep only the CL mappings;
* generate zfa.sssom.tsv by fetching the ZFA ontology and extracting the
  CL xrefs;
* generate cl.sssom.tsv by combining all three sets into one.
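The last step in the list above (combining the three sets into cl.sssom.tsv) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Makefile rule from this PR: rows are plain dicts, and dropping exact duplicate subject/predicate/object triples is an assumption about what "combining" should do. All names and identifiers are hypothetical.

```python
def combine_mapping_sets(*mapping_sets):
    """Concatenate several lists of mapping rows into one list.

    A row is a dict with 'subject_id', 'predicate_id' and 'object_id'.
    If the same triple occurs more than once, only the first
    occurrence is kept (assumed behaviour, see the lead-in).
    """
    seen = set()
    combined = []
    for rows in mapping_sets:
        for row in rows:
            key = (row["subject_id"], row["predicate_id"], row["object_id"])
            if key not in seen:
                seen.add(key)
                combined.append(row)
    return combined

def make_row(subject, obj):
    """Build a hypothetical exact-match mapping row."""
    return {"subject_id": subject, "predicate_id": "skos:exactMatch", "object_id": obj}

# Hypothetical contents of the three source sets.
cl_local = [make_row("CL:0000001", "EX:0000001")]
fbbt = [make_row("FBbt:00000001", "CL:0000001")]
zfa = [make_row("ZFA:0000001", "CL:0000001"), make_row("ZFA:0000001", "CL:0000001")]

meta_set = combine_mapping_sets(cl_local, fbbt, zfa)
```

Keeping the three source sets separate and deriving the published set from them means the combined set never needs to be edited by hand.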

The FBbt and ZFA rules are conditioned on IMP being set to true, since
they fetch data from remote sources.

Replace the AWK script that was used to generate the mappings.owl
component with an SSSOM/T-OWL ruleset similar to the one used in Uberon.

Also, generate the component from the mappings from *all* the remote
sets (FBbt + ZFA) instead of the FBbt set only.

All mapping sets will from now on be in the src/mappings directory,
which is the directory recommended by the ODK.

The cl-local mapping set only depends on the -edit file, so it can
always be re-generated from a fresh copy of the repository without
needing to download external resources -- therefore there's no need to
commit it to the repository.

Likewise for the "meta" set.

When extracting cross-references from CL itself (to generate the
cl-local mapping set) and from ZFA (to generate the zfa mapping set),
assign to each set an explicit mapping set ID.

Add the "meta" mapping set (src/mappings/cl.sssom.tsv) to the list of
files to upload when performing a release.

Mapping sets that are obtained from a remote location are committed to
the repository, so that CL can be built directly from a fresh copy of
the repository without needing to access any remote resources.
@gouttegd gouttegd self-assigned this Apr 26, 2024
@gouttegd gouttegd linked an issue Apr 26, 2024 that may be closed by this pull request
$(MAPPINGDIR)/zfa.sssom.tsv:
	test -f $@

validate_mappings:
Contributor

As an aside, it seems we should add this to the `test:` make goal.

Collaborator Author

Agreed; we can do that when we fix the other SSSOM-related issues in the ODK.

- id: cl-local
  maintainance: manual
- id: cl
  maintainance: manual
Contributor

It seems the two mapping sets are not checked in. Is that intended?

Collaborator Author

The FBbt and ZFA sets are committed to the repository (in src/mappings rather than src/ontology/mappings, to align CL with the recommendations of the ODK) since they are dependent on remote resources. This is the same logic as the one we use in Uberon.

The CL-local and CL sets are not committed since they can always be re-generated as needed from files that are present in any copy of the repository.

@matentzn
Contributor

Awesome, THANKS!

@gouttegd gouttegd merged commit dc021dc into master Apr 27, 2024
1 check passed
@gouttegd gouttegd deleted the manage-mappings branch April 27, 2024 19:29
gouttegd added a commit that referenced this pull request Apr 29, 2024
The rule that extracts the cross-references from the local mirror of ZFA
to build the ZFA-to-CL mapping set is using the SSSOM plugin, so we must
make sure the plugin is ready to use (in `$(TMPDIR)/plugins`) before we
reach that rule. The normal way to do that is to depend on the
'all_robot_plugins' target, which takes care of installing all plugins.

This was missed when preparing #2361 because running the entire pipeline
causes the plugins to be installed anyway before we reach the
zfa.sssom.tsv rule. But this may not be the case when trying to build
individual products.
Successfully merging this pull request may close these issues.

Let CL manage its own mappings