Update TextCatBOW to use the fixed SparseLinear layer #13149

Merged on Nov 29, 2023 (4 commits)

Commits on Nov 23, 2023

  1. Update TextCatBOW to use the fixed SparseLinear layer

    A while ago, we fixed the `SparseLinear` layer to use all available
    parameters: explosion/thinc#754
    
    This change updates `TextCatBOW` to `v3`, which uses the new
    `SparseLinear_v2` layer. On the text categorization task we tested,
    this gives a sizeable improvement.
    
    While at it, `spacy.TextCatBOW.v3` also adds the `length_exponent`
    option, which makes it possible to change the hidden size. Ideally,
    we'd just have an option called `length`, but the way `TextCatBOW`
    uses hashes results in a non-uniform distribution of parameters when
    the length is not a power of two.
    danieldk committed Nov 23, 2023 (0f4920d)
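
The power-of-two concern in the commit message above can be illustrated with a small sketch. It assumes, purely for illustration, that a hashed feature is reduced to a table index with a bitmask (`hash & (length - 1)`); the commit message does not spell out the exact reduction, and `reachable_indices` is a hypothetical helper, not part of spaCy or Thinc.

```python
# Illustrative sketch only. ASSUMPTION: hashes are mapped to table slots
# with a bitmask, i.e. `hash & (length - 1)`. The helper name is hypothetical.

def reachable_indices(length, n_hashes=10_000):
    """Return the set of table slots that masked hashes can land in."""
    mask = length - 1
    return {h & mask for h in range(n_hashes)}


# Power of two: the mask is 0b111, so every slot 0..7 is reachable.
print(sorted(reachable_indices(8)))  # [0, 1, 2, 3, 4, 5, 6, 7]

# Not a power of two: the mask is 0b101, so slots 2 and 3 are never used.
print(sorted(reachable_indices(6)))  # [0, 1, 4, 5]
```

Under this assumption, a length that is not a power of two leaves some parameters unused and crowds the hashes into the rest, which is one way to end up with a non-uniform distribution.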

Commits on Nov 27, 2023

  1. Replace TextCatBOW length_exponent parameter with length

    We now round up the length to the next power of two if it isn't
    a power of two.
    danieldk committed Nov 27, 2023 (d865f9b)
  2. 7d23caf
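
The rounding described in the `length` commit above can be sketched as follows. This is a minimal illustration of rounding up to the next power of two, not the code from the pull request; the function name `round_up_to_power_of_two` is hypothetical.

```python
# Minimal sketch, not the pull request's implementation.
# `round_up_to_power_of_two` is a hypothetical name.

def round_up_to_power_of_two(length: int) -> int:
    """Return `length` if it is a power of two, else the next one above it."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    # (length - 1).bit_length() is the number of bits needed to represent
    # length - 1; shifting 1 by that amount gives the smallest power of two
    # that is >= length.
    return 1 << (length - 1).bit_length()


print(round_up_to_power_of_two(262144))  # 262144 (already 2**18)
print(round_up_to_power_of_two(300000))  # 524288 (2**19)
```

Rounding up rather than down means a user-supplied `length` never yields fewer parameters than requested.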

Commits on Nov 28, 2023

  1. Fix missing import

    danieldk committed Nov 28, 2023 (4f18f31)