Pretrain T2V - Width of CNN layers. #3979
Comments
@agombert This isn't very user-friendly currently, sorry. We should be inferring these settings from the pretrained file, or at least exposing a better error message. What you need to do at the moment is set the environment variable |
Hey @honnibal, Thank you for the quick answer. Actually, that's what I did with the alias: I put the |
Hmm. As a work-around, does it work if you also set the environment variable when you load? It should work without it, but it seems there might be a missing setting written out in the config files. |
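The suggested work-around might look like the sketch below. The variable name is truncated in the thread, but a later comment prefixes the training command with `token_vector_width=128`, so that name is assumed here; the `spacy.load` path is a placeholder.

```python
import os

# Assumption: spaCy 2.x reads some hyperparameters from environment
# variables; "token_vector_width" is the name used later in this thread.
os.environ["token_vector_width"] = "128"

# import spacy
# nlp = spacy.load("/path/to/t2v_model")  # placeholder path; load picks up the override
print(os.environ["token_vector_width"])
```

The point is simply that the variable must be set in the process environment before the model is loaded, not passed as a keyword argument.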
I loaded as:
It loads, but I get 0-size vectors. EDIT: When I use the same line after the default pretrain/train with "token_vector_width": 96, I also get 0-size vectors. |
Did you try setting it as an environment variable, instead of passing it in the |
I have just tried, but I get the same error at each step. |
Hi @honnibal, I would like to know if you have found anything about this error. Moreover, I trained a normal BERT-like model as presented in the documentation, and when I load it without any pipeline components (tagger, parser and ner), the vectors are lost and I get 0-size vectors. In fact, I have to load the tagger each time so that the model provides vectors of length 96. Is this normal? |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Hello,
I tried to pretrain a model with the CNN architecture, but I would like to change the width of the CNN layer to get bigger vectors at the end (128 instead of 96).
And so I get a broadcast error:
ValueError: could not broadcast input array from shape (128) into shape (96)
which seems to come from changing the CNN width during pretraining.

How to reproduce the behaviour
I followed those steps:
1st step - W2V init
I trained a W2V model on the same text corpus; I wanted to use this W2V model as input to learn from.
The /w2v_vectors.txt.gz file came from gensim modeling.

2nd step - model from W2V
I used the train doc to train my new model without any problem
3rd step - pretrain
I pretrained the model, as explained in the doc, with the following command:
4th step - train
After the pretraining finished, I tried to train from the new tok2vec weights:
And I get this error:
Other information about the bug:
When I use it without the -cw 128 flag, everything works well. Moreover, I can perform the training if I prefix the training command with token_vector_width=128. When I do so, training looks fine, but I get this error when trying to load the new t2v model:
Besides, when I disable the tagger while loading the t2v model, the t2v model produces vectors of size 0.
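The shape mismatch behind the ValueError can be reproduced with plain numpy, independently of spaCy: copying rows pretrained at width 128 into an array built at the default width 96 fails the same way. The row count here is arbitrary and only illustrative.

```python
import numpy as np

n_rows = 5
pretrained = np.zeros((n_rows, 128), dtype="f")  # tok2vec weights pretrained with -cw 128
target = np.zeros((n_rows, 96), dtype="f")       # model built at the default width 96

try:
    target[:] = pretrained  # same kind of shape mismatch as in the traceback
except ValueError as err:
    print(err)
```

This is why the width set at pretrain time has to match the width the training run builds its model with.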
Your Environment
Linux-4.9.0-7-amd64-x86_64-with-debian-buster-sid
Python 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38)
[GCC 7.3.0]
spacy 2.1.4
gensim 3.7.2
thinc 7.0.8