Documentation for Entity Linking #4065

Merged: 53 commits, Sep 12, 2019

svlandeg (Member) commented Aug 1, 2019

Documentation for PRs #3459, #3864 and #4003, all of which relate to the Entity Linking functionality discussed in Issue #3339.

Added documentation

API

  • NEW: kb.md (for KnowledgeBase in kb.pyx)
  • NEW: candidate.md (for Candidate in kb.pyx)
  • NEW: entitylinker.md (for EntityLinker in pipes.pyx)
  • UPDATED: span.md, added kb_id attribute
  • UPDATED: token.md, added ent_kb_id attribute
  • UPDATED: goldparse.md, added links attribute
  • UPDATED: entityrecognizer.md, textcategorizer.md and tagger.md, with more explicit references to tensors and optimizer.averages (to keep the docs consistent across pipe components)
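
To give a rough idea of how the three new API pages fit together, here is a toy sketch in plain Python. It does not use spaCy, and `ToyKnowledgeBase`, `ToyCandidate`, `link` and the `Q_AUTHOR` / `Q_ACTOR` identifiers are hypothetical names made up for illustration. It assumes a knowledge base that maps aliases to candidate entities with prior probabilities, and a linker that simply picks the candidate with the highest prior. The actual EntityLinker is a trained component and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToyCandidate:
    """Hypothetical stand-in: one (alias, entity) pairing with a prior probability."""
    alias: str
    kb_id: str
    prior_prob: float


class ToyKnowledgeBase:
    """Hypothetical in-memory KB: entities keyed by ID, aliases mapped to candidates."""

    def __init__(self) -> None:
        self._entities: Dict[str, int] = {}  # kb_id -> frequency
        self._aliases: Dict[str, List[ToyCandidate]] = {}

    def add_entity(self, kb_id: str, freq: int) -> None:
        self._entities[kb_id] = freq

    def add_alias(self, alias: str, kb_ids: List[str], probabilities: List[float]) -> None:
        if len(kb_ids) != len(probabilities):
            raise ValueError("kb_ids and probabilities must have the same length")
        if sum(probabilities) > 1.0 + 1e-9:
            raise ValueError("prior probabilities for one alias must not sum to more than 1")
        unknown = [kb_id for kb_id in kb_ids if kb_id not in self._entities]
        if unknown:
            raise KeyError(f"unknown entities: {unknown}")
        self._aliases[alias] = [
            ToyCandidate(alias, kb_id, prob) for kb_id, prob in zip(kb_ids, probabilities)
        ]

    def get_candidates(self, alias: str) -> List[ToyCandidate]:
        # Return a copy so callers cannot mutate the KB's internal list.
        return list(self._aliases.get(alias, []))


def link(kb: ToyKnowledgeBase, mention: str) -> str:
    """Return the KB ID of the highest-prior candidate, or "" if there is none.

    Using "" for "no link" is an assumption of this sketch.
    """
    candidates = kb.get_candidates(mention)
    if not candidates:
        return ""
    return max(candidates, key=lambda c: c.prior_prob).kb_id


kb = ToyKnowledgeBase()
kb.add_entity("Q_AUTHOR", freq=342)
kb.add_entity("Q_ACTOR", freq=17)
kb.add_alias("Douglas", ["Q_AUTHOR", "Q_ACTOR"], [0.8, 0.1])

print(link(kb, "Douglas"))       # Q_AUTHOR
print(repr(link(kb, "Nobody")))  # ''
```

In this sketch the value returned by `link` plays the role of the KB ID field that the updated span.md and token.md pages describe. For the real signatures, see kb.md, candidate.md and entitylinker.md.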

Usage

  • UPDATED: 101/_named_entities.md and 101/_pipelines.md, added explanations around KB ID fields
  • UPDATED: processing-pipelines.md, added entity_linker pipe explanation
  • UPDATED: spacy-101.md, added Entity Linking & Knowledge Base definitions
  • UPDATED: training.md with information on training the entity linker + references to the two generic example scripts
  • UPDATED: linguistic-features.md, added explanations around Entity Linking + example code
  • UPDATED: facts-figures.md, set Entity Linking support to True

TODO before merging

  • ctrl-F "X.X" and replace with actual version number. DONE
  • Make sure the pretrained models include the entity_linker pipe so the code snippets actually work. Some of the snippets currently don't run on the pretrained models, so I tested them on custom models instead. The example scripts should work regardless, because they start from a blank model.
  • Update the language pipeline overview picture to include EL functionality?

Types of change

Changes to the documentation

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

# Conflicts:
#	examples/pipeline/wikidata_entity_linking.py
#	spacy/pipeline/pipes.pyx
@svlandeg added the docs (Documentation and website), feat / nel (Feature: Named Entity linking) and ⚠️ wip (Work in progress) labels Aug 1, 2019
svlandeg (Member, Author) commented Aug 6, 2019

Update: filtered this PR down and moved all functionality changes to PR #4091. Once that PR is merged, I'll merge it into this one to clean this up.
[EDIT: done]

@ines added the v2.2 label Aug 12, 2019
# Conflicts:
#	bin/wiki_entity_linking/train_descriptions.py
#	examples/pipeline/dummy_entity_linking.py
#	examples/pipeline/wikidata_entity_linking.py
#	examples/training/pretrain_kb.py
#	spacy/errors.py
Review comment on website/docs/api/candidate.md (outdated, resolved)
svlandeg (Member, Author) commented

@ines: thanks for the review! I think I addressed everything. I was also wondering whether I should add anything to the Cython documentation (structs & classes), or do we consider the KB too specific to be included there?

ines (Member) commented Aug 22, 2019

I was also wondering whether I should add anything to the Cython documentation (structs & classes), or do we consider the KB too specific to be included there?

I think for now, it's okay to not document the Cython part and to consider it internals. There's a chance that some of the specifics are still going to change. The main focus of the Cython docs is the data structures that we want users to interact with directly from their Cython implementations.

@svlandeg removed the ⚠️ wip (Work in progress) label Sep 2, 2019
@ines changed the base branch from master to develop September 12, 2019 09:37
@ines merged commit 0b4b4f1 into explosion:develop Sep 12, 2019
@svlandeg deleted the feature/el-docs branch September 12, 2019 09:40
Labels: docs (Documentation and website), feat / nel (Feature: Named Entity linking)