Question on licensing #1865
You're correct: we made a mistake on the Italian model. Huge thanks for calling our attention to this. Our process for propagating the license information is to add the license to a metadata file in our model building repository. This attribute is then read when the model files are built, and shipped with the trained model. The same metadata file is consulted when the table of model files is compiled for the website. In the case of the Italian model, I copied the wrong information into the file. I was doing a number of these models at the same time, and as you note, some of the UD corpora are licensed NC, while others are licensed SA. The question now is how to go about issuing the correction. I guess we post something to the spaCy list, and update the metadata?
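The propagation described above can be illustrated with a minimal sketch. It is not the actual spaCy build code: the metadata layout, the model names, and the helper functions are all hypothetical, and the license values are illustrative. It only shows how a single hand-edited metadata entry feeds both the shipped model and the website table, and how a simple check against the corpus license could flag a dropped NC clause.

```python
import json

# Hypothetical shared metadata file contents: one entry per model, with the
# license copied by hand from the training corpus. All names and values here
# are illustrative assumptions, not the real build configuration.
MODEL_METADATA = {
    "it_example_model": {"corpus_license": "CC BY-NC-SA 3.0", "license": "CC BY-SA 4.0"},
    "xx_example_model": {"corpus_license": "CC BY-SA 4.0", "license": "CC BY-SA 4.0"},
}


def build_model_meta(name):
    """Return the JSON text that would ship with the built model."""
    return json.dumps({"name": name, "license": MODEL_METADATA[name]["license"]})


def website_table():
    """Return (model name, license) rows, read from the same metadata."""
    return [(name, entry["license"]) for name, entry in sorted(MODEL_METADATA.items())]


def has_nc(license_name):
    """True if the license string contains the NonCommercial (NC) element."""
    return "NC" in license_name.replace("-", " ").split()


def license_mismatches():
    """List models whose corpus license is NC but whose own license is not."""
    return [
        name
        for name, entry in sorted(MODEL_METADATA.items())
        if has_nc(entry["corpus_license"]) and not has_nc(entry["license"])
    ]
```

Because both outputs read the same entry, one wrong value shows up in the model package and on the website alike; a check such as `license_mismatches()` run at build time would be one way to catch it.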
I'm not a specialist in such situations, but it seems to me that every distribution that includes the Italian model should update its license to incorporate the NC part. Users who used the model for commercial purposes, or redistributed it under the original license, would probably also need to be informed of this change.
The new models being distributed for the v2.1 release now have updated licenses.
I have a question on licensing, especially where data from universaldependencies.org is used, which is the case for the Spanish, Dutch, French, Portuguese and Italian models.
Am I correct that these have been built on the following corpora or are other corpora used?
I see that spaCy distributes the resulting models under the following licenses, respectively:
This seems to indicate that you keep the same license as the corpus the model was trained on, except for Italian.
Shouldn't the Italian model also be licensed under the CC-BY-NC-SA 3.0 Unported license?
I'm copying some comments from the Creative Commons website, https://wiki.creativecommons.org/wiki/data#Can_I_conduct_text.2Fdata_mining_on_a_CC-licensed_database.3F, although they mostly relate to CC-BY 4.0 licenses.
Either way, the non-commercial part seems to indicate that redistributing under the less restrictive CC BY-SA 4.0 would not be allowed.
Can I conduct text/data mining on a CC-licensed database?
Or was another corpus used for creating the Italian model?