Add floret Wikipedia+OSCAR vectors project #99

Merged
merged 16 commits into explosion:v3 on Apr 4, 2022

Conversation

adrianeboyd
Contributor

Add a user-friendly cross-platform floret vectors project for training on Wikipedia (using wikiextractor) + OSCAR (using datasets).

The project defaults use Macedonian to provide a small but realistic demo.

The tests override the defaults to use the much-smaller Yoruba data sources.
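To make the data flow in the description concrete, here is a minimal sketch of turning OSCAR records into one-sentence-per-line training text. It is not the project's actual script: the helper names `iter_clean_lines` and `stream_oscar` are hypothetical, the `"text"` field and the `load_dataset("oscar", ...)` call reflect the `datasets` hub layout as I understand it for early 2022, and the config name (for example `unshuffled_deduplicated_mk`) is an assumption left to the caller.

```python
from typing import Iterable, Iterator, Mapping


def iter_clean_lines(records: Iterable[Mapping[str, str]]) -> Iterator[str]:
    """Yield one whitespace-normalized line per non-empty line of each record's "text"."""
    for record in records:
        for line in record["text"].splitlines():
            line = " ".join(line.split())  # collapse runs of whitespace
            if line:
                yield line


def stream_oscar(config_name: str) -> Iterator[str]:
    """Stream an OSCAR config through `datasets` (assumed call; needs network access)."""
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("oscar", config_name, split="train", streaming=True)
    return iter_clean_lines(dataset)


# Offline demonstration with two made-up records (not real OSCAR data).
sample = [{"text": "First  sentence.\n\nSecond sentence. "}, {"text": "   "}]
print(list(iter_clean_lines(sample)))
```

Streaming keeps memory use flat for the larger languages, and the same `iter_clean_lines` helper could be fed text extracted from Wikipedia so both sources end up in one format.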

@adrianeboyd
Contributor Author

This will require thinc v8.0.15 with explosion/thinc#610.

@explosion explosion deleted a comment from MariamRiaz Mar 15, 2022
@adrianeboyd adrianeboyd reopened this Mar 15, 2022
@adrianeboyd adrianeboyd reopened this Mar 16, 2022
@adrianeboyd adrianeboyd marked this pull request as draft March 16, 2022 14:45
@adrianeboyd adrianeboyd reopened this Mar 16, 2022
@adrianeboyd adrianeboyd marked this pull request as ready for review March 23, 2022 10:28
@svlandeg svlandeg added the enhancement New feature or request label Mar 29, 2022
@@ -31,12 +31,15 @@ jobs:
       architecture: 'x64'

   - script: |
-      pip install "spacy>=3.1.0,<3.2.0"
+      pip install "spacy>=3.2.0,<3.3.0"
Member

Do you think there will be something breaking in 3.3? Considering that release is near, it might be nice to make sure it's supported too?

Contributor Author

Sure, but maybe in a separate PR after v3.3 is released?
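As a rough illustration of why the reviewer's question matters, the sketch below checks which versions fall inside the `>=3.2.0,<3.3.0` range from the CI script. It is a simplified stand-in for pip's real specifier handling (plain `X.Y.Z` strings only, no pre-releases), the helper names are hypothetical, and the candidate version numbers are chosen only for illustration.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release handling)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version < upper, comparing numeric tuples."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Range from the updated CI script: spacy>=3.2.0,<3.3.0
for candidate in ("3.1.4", "3.2.0", "3.2.4", "3.3.0"):
    print(candidate, in_range(candidate, "3.2.0", "3.3.0"))
```

Because the upper bound is exclusive, a 3.3.0 release would not be installed by this CI job until the range is widened, which is what the follow-up PR would need to do.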

@adrianeboyd adrianeboyd reopened this Mar 30, 2022
@adrianeboyd adrianeboyd merged commit baf0aa6 into explosion:v3 Apr 4, 2022