ValueError: X has 70 features per sample; expecting 49793 #1

Open
jyotirani3020 opened this issue Aug 25, 2020 · 0 comments

jyotirani3020 commented Aug 25, 2020

I saved the classifier model created by your code. I'm trying to predict the classes for other text, but I'm a little stuck. Please help.

Saving the model

import pickle
with open('/content/drive/My Drive/Text-Sum/classification_model.pkl', 'wb') as fid:
    pickle.dump(clf, fid)
text = '''
Good morning everyone. The financial results for Q1 have been reviewed by you
during the weekend and since we had a couple of days I will take a bit of a time
to explain the results also a little bit more in detail because you might have
already gone through that. So starting with the ECD segment, which has grown
by 24%, which is a followup on 30% growth in the financial year 2019. In fact in
the first quarter it was 40%. The compounded growth is thus close to 26%,
which is higher than industry. Fans have grown mid-teens while small domestic
appliances, water heaters, water purifiers have performed significantly better.
We have established a clear leadership in water heaters with impressive market
gains in small domestic appliances as well. Fans continue to grow and
consolidate its premium positioning. We feel that ECD would anchor superior
growth mantle for Havells.
'''
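import pickle
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer

# Load the classifier that was pickled above
with open('/content/drive/My Drive/Text-Sum/classification_model.pkl', 'rb') as fid:
    loaded_model = pickle.load(fid)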

tfidf_vec = TfidfVectorizer(tokenizer=preprocessing,
                            stop_words=stopwords.words('english'),
                            sublinear_tf=True,
                            use_idf=True,
                            max_df = 1, min_df = 1
                            )


X_test = tfidf_vec.fit_transform([text])
prediction = loaded_model.predict(X_test)

This gives the following error:

 ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-44-3b2e1324941c> in <module>()
----> 1 prediction = loaded_model.predict(X_test)

1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_base.py in decision_function(self, X)
    271         if X.shape[1] != n_features:
    272             raise ValueError("X has %d features per sample; expecting %d"
--> 273                              % (X.shape[1], n_features))
    274 
    275         scores = safe_sparse_dot(X, self.coef_.T,

ValueError: X has 70 features per sample; expecting 49793
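
A likely reading of the numbers in this error: the classifier was trained on a matrix with 49793 TF-IDF columns, while the new `TfidfVectorizer` fitted on the single test text learned a vocabulary of only 70 terms. The sketch below shows the usual way around this, which is to keep the vectorizer that was fitted at training time and call `transform()` on new text instead of `fit_transform()`. The toy corpus, the labels, and the choice of `LogisticRegression` are assumptions made for illustration only; the traceback shows a linear model, but not which one the repository trains.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for the repository's real corpus (hypothetical).
train_texts = [
    "revenue grew strongly this quarter",
    "fans and water heaters sold well",
    "the weather was pleasant today",
    "we went hiking over the weekend",
]
train_labels = [1, 1, 0, 0]

# Fit the vectorizer ONCE, on the training texts, and train on its output.
vectorizer = TfidfVectorizer(sublinear_tf=True, use_idf=True)
X_train = vectorizer.fit_transform(train_texts)
clf = LogisticRegression().fit(X_train, train_labels)  # stand-in linear classifier

# Persist the fitted vectorizer together with the classifier
# (pickled to bytes here; a file on disk works the same way).
blob = pickle.dumps({"vectorizer": vectorizer, "clf": clf})

# Later, at prediction time: reload both and call transform(), not fit_transform().
loaded = pickle.loads(blob)
new_text = "water heaters revenue grew in the first quarter"
X_new = loaded["vectorizer"].transform([new_text])
prediction = loaded["clf"].predict(X_new)

# The column count now matches what the classifier was trained on.
n_expected = loaded["clf"].coef_.shape[1]
print(X_new.shape[1] == n_expected, prediction.shape)
```

If the vectorizer is built with a custom `tokenizer` such as `preprocessing`, pickle stores that function by reference, so it has to be importable under the same name when the vectorizer is loaded.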