
Add ParametricAttention.v2 #913

Merged
danieldk merged 7 commits into explosion:master from feature/parametric-attention-v2 on Dec 14, 2023

Conversation

danieldk (Contributor) commented:

Description

This layer extends the existing `ParametricAttention` layer with support for transforming the key representation (for example, with a non-linear layer). This brings the model closer to the paper that proposed the mechanism (Yang et al., 2016) and gave slightly better results in experiments.
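For illustration, a minimal sketch of how the new layer might be wired up. It assumes the layer is exposed from `thinc.api` as `ParametricAttention_v2` with an optional `key_transform` sublayer and an `nO` width, names inferred from the files touched in this PR rather than a confirmed API:

```python
# Sketch only: ParametricAttention_v2, key_transform and nO are assumed from
# this PR's description and file names, not a verified signature.
from thinc.api import ParametricAttention_v2, Relu, chain, list2ragged, reduce_sum

width = 64

# Transform the keys with a non-linear layer before computing the attention
# weights, as in Yang et al. (2016). Omitting key_transform should behave
# like the original ParametricAttention layer.
attention = ParametricAttention_v2(key_transform=Relu(width), nO=width)

# Example pipeline over variable-length sequences:
# list of 2d arrays -> ragged -> attention-weighted -> summed representation.
model = chain(list2ragged(), attention, reduce_sum())
```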

Types of change

Feature

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

danieldk force-pushed the feature/parametric-attention-v2 branch from 72eca0d to a2e178f on December 12, 2023 09:33
danieldk added the enhancement (feature requests and improvements) and feat / layers (weights layers, transforms, combinators, wrappers) labels on Dec 12, 2023
danieldk marked this pull request as ready for review on December 12, 2023 10:29
svlandeg (Member) left a comment:

Looks good! I only had a few concerns about robustness & tests.

Review threads (all resolved):
  • thinc/tests/layers/test_layers_api.py (outdated)
  • thinc/tests/layers/test_layers_api.py
  • thinc/layers/parametricattention_v2.py (outdated)
  • thinc/layers/parametricattention_v2.py (outdated)
Co-authored-by: Adriane Boyd <[email protected]>
svlandeg (Member) left a comment:

Definitely much cleaner implementation with the noop layer. Looks good to merge!
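For context, the "noop layer" remark likely refers to substituting thinc's `noop()` combinator when no key transform is configured, so the forward pass does not need to branch on `None`. A rough sketch of that pattern (the helper name here is hypothetical):

```python
from typing import Optional

from thinc.api import Model, noop


def with_default_key_transform(key_transform: Optional[Model] = None) -> Model:
    # Hypothetical helper: fall back to a pass-through noop() layer so the
    # same code path handles "no key transform" and "with key transform".
    return key_transform if key_transform is not None else noop()
```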

danieldk merged commit 88dc49d into explosion:master on Dec 14, 2023
10 checks passed
danieldk deleted the feature/parametric-attention-v2 branch on December 14, 2023 10:08