
Subtracting mean embeddings #2

Open
bwang482 opened this issue Oct 30, 2019 · 2 comments


bwang482 commented Oct 30, 2019

Are you sure this line is correct?
X_train = X_train - np.mean(X_train)

np.mean(X_train) gives a single value. Shouldn't it be np.mean(X_train, 0)?
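
For anyone comparing the two calls, here is a minimal sketch of the difference (the shapes are illustrative, not taken from this repo):

import numpy as np

X_train = np.random.rand(5, 300)    # toy matrix: 5 word vectors, 300 dims
scalar_mean = np.mean(X_train)      # a single number: mean over all entries
vector_mean = np.mean(X_train, 0)   # shape (300,): per-dimension mean
print(np.shape(scalar_mean))        # () -- a scalar
print(vector_mean.shape)            # (300,)
shifted = X_train - scalar_mean     # shifts every entry by the same value
centered = X_train - vector_mean    # zero-centers each embedding dimension

Both subtractions broadcast without error, so the choice of axis silently changes what the preprocessing does.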

@GuilhermeZaniniMoreira

I am getting this error:
ValueError: operands could not be broadcast together with shapes (237191,) (300,)
There are 237,191 words with an embedding dimension of 300. How did you solve that?
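
That error suggests X_train is not a 2-D (237191, 300) matrix but a 1-D array of length 237191 (e.g. an object array holding one vector per word), so it cannot broadcast against the (300,)-dimensional mean. A hedged sketch of one possible fix, with stand-in data since the loading code isn't shown here:

import numpy as np

vectors = [np.random.rand(300) for _ in range(1000)]  # stand-in vocabulary
X_train = np.stack(vectors)              # proper 2-D matrix, here (1000, 300)
X_train = X_train - np.mean(X_train, 0)  # (300,) mean broadcasts over rows

np.stack forces the per-word vectors into one contiguous 2-D matrix, after which the axis-0 mean broadcasts row by row.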


iR00i commented Sep 1, 2021

Shouldn't it be np.mean(X_train, 0)?

If you go back to the original paper that proposed the "Post-Processing Algorithm" (All-but-the-Top: Simple and Effective Postprocessing for Word Representations), the authors describe computing the mean as follows:

[images from the paper's Algorithm 1: mu <- (1/|V|) * sum_{w in V} v(w), then v~(w) <- v(w) - mu]

So I imagine the resulting mean should be a scalar computed from the entire matrix.
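
For completeness, here is a minimal sketch of the whole post-processing step as I read the paper (mean removal followed by removing the top-D principal components); this is my own reading, not necessarily this repo's code, and sklearn is assumed for the PCA:

import numpy as np
from sklearn.decomposition import PCA

def all_but_the_top(X, D=3):
    # X: (vocab_size, dim) embedding matrix; D: number of directions to drop
    mu = np.mean(X, axis=0)          # mean word vector, shape (dim,)
    X_tilde = X - mu                 # v~(w) = v(w) - mu
    U = PCA(n_components=D).fit(X_tilde).components_  # (D, dim)
    return X_tilde - X_tilde @ U.T @ U  # remove top-D projections

Under this reading, mu is the average word vector, i.e. the axis-0 mean.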
