Slow Language Model Download Speed #798

Closed
darcyosullivan opened this issue Feb 2, 2017 · 12 comments
Labels
install Installation issues

Comments

@darcyosullivan

I'm currently downloading the language models using

python3 -m spacy.en.download all

The download speed is less than 0.00MB/s, and there is no issue with my internet connection. Any suggestions?

Your Environment

  • Operating System: OSX
  • Python Version Used: 3.4
  • spaCy Version Used:
  • Environment Information:
@honnibal
Member

honnibal commented Feb 2, 2017

Reconnect?

@darcyosullivan
Author

I've tried reconnecting to my internet several times, as well as restarting the download.

@patxu

patxu commented Feb 2, 2017

I had this error too; my speed is fluctuating between 0.01 and 0.03 MB/s.

@jason-feng

I'm having this issue using Python 2.7.

@honnibal
Member

honnibal commented Feb 2, 2017

Hmm. It's working fine for me, currently. Is it still not working for you guys?

The model data is hosted on S3. We're looking forward to changing this, as it's expensive and not great overall.

@honnibal honnibal added the install Installation issues label Feb 2, 2017
@jason-feng

It got better for me and is working now, but for an hour or two, it was terribly slow

@ranka47

ranka47 commented Feb 7, 2017

I had this problem too while I was working over my campus LAN.
As soon as I switched to the open network provided by my mobile phone's service provider, the speed was great.

@xushenkun

Same problem. Even when using a VPN to download from S3, it's still very slow.

@eromoe

eromoe commented Feb 15, 2017

Slow in China too, though I know it is due to the GFW...

Downloading parsing model
Downloading...
Downloaded 1.09MB 0.21% 0.02MB/s eta 504m 29s

@honnibal Maybe you could provide a direct download link in the docs, so we can download the models through a browser with a proxy. That would be the simplest way, since supporting downloads over a proxy in Python needs some modification, which may be buggy (HTTPS, certificates or something else, as happens with nltk).
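For reference, a minimal sketch of what downloading over a proxy can look like with only the Python 3 standard library. This is not how spaCy's downloader works; the helper names `build_proxy_opener` and `download_via_proxy`, and the proxy address in the usage comment, are hypothetical.

```python
import shutil
import urllib.request


def build_proxy_opener(proxy_url):
    """Return a urllib opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def download_via_proxy(url, dest_path, proxy_url, chunk_size=1 << 20):
    """Stream url to dest_path through the proxy, one chunk at a time."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path


# Example call (placeholder URL and proxy address, adjust to your setup):
# download_via_proxy("https://example.com/model.tar.gz", "model.tar.gz",
#                    "http://127.0.0.1:3128")
```

Certificate handling for HTTPS proxies is left at the library defaults here, which is exactly the part that tends to need per-network adjustment.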

An additional suggestion: instead of requiring a call to spacy.util.set_data_path(),
it would be better to let users tell spaCy where the data is located via an environment variable.
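A rough sketch of how such an environment variable could be wired up on the user side today. The variable name `SPACY_DATA_PATH` and the helper `resolve_data_path` are hypothetical; nothing in this thread says spaCy reads such a variable itself.

```python
import os
from pathlib import Path

# Hypothetical variable name, chosen only for this illustration.
DATA_ENV_VAR = "SPACY_DATA_PATH"


def resolve_data_path(default_path, environ=None):
    """Return the path from the environment variable if set, else default_path."""
    environ = os.environ if environ is None else environ
    override = environ.get(DATA_ENV_VAR)
    if override:
        return Path(override).expanduser()
    return Path(default_path)


# Possible use at program start, relying on the two spaCy functions
# mentioned in this thread:
# import spacy.util
# spacy.util.set_data_path(resolve_data_path(spacy.util.get_data_path()))
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.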

@ines
Member

ines commented Feb 16, 2017

After trying out different solutions, we decided to simply attach the models as archive files to the latest release. The files are still quite large (between 500 and ~700MB), but they can now be downloaded via the browser. The English and German models are already uploaded and I'm currently waiting for the GloVe vectors to finish (which is the largest file).

➡️ https://github.com/explosion/spaCy/releases/tag/v1.6.0

Here's how to install the models manually:

  1. Find the default data path. Use spacy.util.get_data_path() to find the directory where spaCy will look for its models, or change the default data path with spacy.util.set_data_path().
  2. Unpack the archive and place the contained folder in that directory.
  3. Load the model via spacy.load('en') or spacy.load('de').
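The three steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper `install_model_archive` is hypothetical, the archive file name in the usage comment is a placeholder, and the archive is assumed to be in a format that `shutil.unpack_archive` recognises (for example .tar.gz or .zip).

```python
import shutil
from pathlib import Path


def install_model_archive(archive_path, data_dir):
    """Unpack a downloaded model archive into data_dir.

    Returns the names of the entries that the archive added to data_dir.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    before = {p.name for p in data_dir.iterdir()}
    shutil.unpack_archive(str(archive_path), str(data_dir))
    return sorted(p.name for p in data_dir.iterdir() if p.name not in before)


# Usage with spaCy installed (archive name is a placeholder):
# import spacy
# data_dir = spacy.util.get_data_path()                 # step 1
# install_model_archive("en-model.tar.gz", data_dir)    # step 2
# nlp = spacy.load('en')                                # step 3
```

The spaCy calls are kept in comments so the helper itself depends only on the standard library.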

cc: @darcyosullivan, @patxu, @jason-feng, @ranka47, @xushenkun, @eromoe

@de-code

de-code commented Feb 5, 2018

I'm just wondering whether there is a reason why you wouldn't want python -m spacy.en.download all to download from GitHub instead rather than requiring the manual download?

@lock

lock bot commented May 8, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators May 8, 2018