Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor of installation #17

Merged
merged 12 commits into from
Feb 3, 2023
Merged

Conversation

dennlinger
Copy link
Contributor

Another thing that bothered me enough to open a PR is the lack of clear installation instructions (or rather, the lack of automating it).

  1. I've moved the installation instructions in the README up a bit, so it becomes more visible for first-time visitors; feel free to discuss whether this is a necessary/good change, but IMO it helps with clarity.
  2. Running __main__.py with an incorrect argument now gives an appropriate error message, instead of just "quietly finishing."
  3. Instead of having to manually download the KB first, I introduce a new catch in the DatabaseConnection: Errors that are likely caused by the lack of a KB (and thus failing to connect to it) will trigger the download of the database, and then re-try the connection. This way, even without manually running the installation process, the download will be triggered on the first time users run the library. My only concern is that sqlite3.OperationalErrors can be caused by a variety of other issues, which might make this a bad practice to download a KB each time...
  4. Finally, there are some minor nit-picks about general Python-related issues, e.g., mutable default arguments (e.g., lists).

Thanks again for the great library, hope this helps give it a bit more usability!

dennlinger and others added 10 commits February 3, 2023 16:58
1. Adds a download progress bar to get response during the call, which helps especially with slower connections to recognize timeouts etc.
2. Separates out the logic for download_knowledge_base(), which effectively allows downloads to be called from other parts of the library.
3. More expressive error messaging in case attributes passed to calls like python -m spacy_entity_linker are incorrect.
1. Automatically attempts to resolve errors that are likely caused by a missing KB; will download and then reconnect.
2. Also refactors some of the critical patterns, e.g., mutable default arguments, None-checking and use of internal attribute names (property).
Add test case to check whether empty root processing is correct.
setup.py is the default when installing from pip, instead requirements.txt is more for development requirements
@MartinoMensio
Copy link
Collaborator

Hi @dennlinger,
This is a great PR! I am reviewing it!

@MartinoMensio
Copy link
Collaborator

I think that we can simply check if the database file exists (DB_DEFAULT_PATH) and download otherwise. In this way we don't trigger another download if something else happens with sqlite3.OperationalError.
I'm testing it.

Martino

@MartinoMensio
Copy link
Collaborator

Thank you again @dennlinger for the very good contribution! Merging now!

Martino

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants