layout: rule
id: GORULE:0000027
title: Each identifier in GAF is valid
type: report
fail_mode: soft
status: implemented
contact:
implementations:
  • The DB field (GAF and GPAD column 1), and all DB abbreviations in the 'with' field (GAF column 8; GPAD column 7) and in the annotation extensions (GAF column 16; GPAD column 11), must be in db-xrefs.yaml (see below).
  • The id_syntax information in the db-xrefs.yaml file can be used to validate local identifiers.
  • The 'with' field can contain either GO terms (when the evidence code is IC), which are checked in GORULE:0000001, or DB:ID values, which are checked in the same way as columns 1 and 2.
  • The assigned_by field (GAF column 15; GPAD column 10) is checked against groups.yaml.
  • The 'extension' field (GAF column 16; GPAD column 11) can contain either GO terms or DB:ID values, which are checked in the same way as columns 1 and 2.
  • TBC (this may be GORULE:0000001): all GO IDs must exist in the current ontology. GO IDs can be present in columns 5, 8, and 16 of GAF (4, 7, and 11 in GPAD).
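The prefix and id_syntax checks in the list above could be sketched as follows. This is a minimal illustration, not the actual implementation: the `DB_XREFS` dictionary is a hypothetical, simplified stand-in for db-xrefs.yaml, the regular expressions are illustrative only, and `check_identifier` is an assumed helper name.

```python
import re

# Hypothetical, simplified stand-in for db-xrefs.yaml:
# prefix -> id_syntax regex (or None when no syntax is registered).
# The patterns below are illustrative, not the registered ones.
DB_XREFS = {
    "UniProtKB": r"[A-Z0-9]{6,10}",
    "PMID": r"[0-9]+",
    "GO": r"[0-9]{7}",
}


def check_identifier(prefix, local_id, xrefs=DB_XREFS):
    """Return a list of problems for one prefix / local identifier pair."""
    if prefix not in xrefs:
        return [f"unknown prefix: {prefix}"]
    pattern = xrefs[prefix]
    # fullmatch requires the whole local identifier to match the pattern.
    if pattern is not None and re.fullmatch(pattern, local_id) is None:
        return [f"local id {local_id!r} does not match id_syntax for {prefix}"]
    return []
```

An empty list means the pair passed both checks. Because the rule is a soft, report-type rule, the sketch collects messages rather than raising an error.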

Additional notes on identifiers

In GAF and GPAD, the identifier is represented using two fields: column 1 is the prefix (DB), and column 2 is the local identifier. The global ID is formed by concatenating these with a colon (:). In all other fields, such as the "With/from" field, the reference, and the extensions, a global ID is specified, which MUST always be prefixed; i.e., it must contain a namespace and an identifier, separated by a colon.
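As a small illustration of the paragraph above, the two helpers below build a global ID from columns 1 and 2 and split a prefixed ID back into its parts. The function names are hypothetical, and splitting at the first colon is an assumption made here so that local identifiers which themselves contain a colon stay intact.

```python
def global_id(db, local_id):
    """Concatenate the DB prefix (column 1) and local identifier (column 2)."""
    return f"{db}:{local_id}"


def split_global_id(identifier):
    """Split a prefixed ID at the first colon into (prefix, local_id).

    Raises ValueError when the prefix, the colon, or the local part is missing.
    """
    prefix, sep, local_id = identifier.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {identifier!r}")
    return prefix, local_id
```

For example, `split_global_id("MGI:MGI:98834")` yields the prefix `MGI` and the local identifier `MGI:98834`.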

In all cases, the prefix MUST be in db-xrefs.yaml. The prefix SHOULD be identical (case-sensitive match) to the database field.

When consuming GAF files, programs SHOULD repair by replacing prefix synonyms with the canonical form, in addition to reporting on the mismatch. For example, as part of the association file release, legacy uses of 'UniProt' in submitted files should be replaced with 'UniProtKB'.
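The repair-and-report behaviour could look roughly like this. The synonym table holds only the 'UniProt' example given above; a real table would be derived from db-xrefs.yaml, and the `repair_prefix` name and message wording are assumptions.

```python
# Synonym -> canonical prefix. Only the example from the text is listed;
# a real table would be built from db-xrefs.yaml.
PREFIX_SYNONYMS = {"UniProt": "UniProtKB"}


def repair_prefix(prefix, synonyms=PREFIX_SYNONYMS):
    """Return (canonical_prefix, report_message_or_None)."""
    canonical = synonyms.get(prefix, prefix)
    if canonical != prefix:
        # Repair and also report the mismatch, as the rule asks.
        return canonical, f"replaced prefix synonym {prefix!r} with {canonical!r}"
    return prefix, None
```

Returning the message alongside the repaired value lets the caller rewrite the line and log the mismatch in the same pass.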

Reference formatting must be correct

References in the GAF (column 6) should be of the format db_name:db_key. Multiple values can be pipe-separated, e.g. SGD_REF:S000047763|PMID:2676709. PMIDs, DOIs, Agricola, GO_REF, and internal MOD references are allowed.
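A minimal sketch of the reference check follows, assuming the set of allowed prefixes is supplied by the caller (for instance, taken from db-xrefs.yaml). `check_references` is a hypothetical name, and the prefix set in the usage example is illustrative.

```python
def check_references(column, known_prefixes):
    """Check a pipe-separated reference column (GAF column 6).

    Returns a list of problem messages; an empty list means every
    reference has the db_name:db_key shape and a known prefix.
    """
    problems = []
    for ref in column.split("|"):
        # Split at the first colon: db_name before it, db_key after it.
        db_name, sep, db_key = ref.partition(":")
        if not sep or not db_name or not db_key:
            problems.append(f"malformed reference: {ref!r}")
        elif db_name not in known_prefixes:
            problems.append(f"unknown reference prefix: {db_name!r}")
    return problems


# Usage with the example from the text and an illustrative prefix set.
allowed = {"PMID", "DOI", "GO_REF", "SGD_REF"}
result = check_references("SGD_REF:S000047763|PMID:2676709", allowed)
```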