Contextual Word Embeddings Augmenter (BERT) error #161

rajae-Bens · 2020-10-07T17:36:10Z

Hi,

I am getting this error
NameError: name 'AutoTokenizer' is not defined

when trying this code

import nlpaug.augmenter.word as naw

# Augment French by BERT
aug = naw.ContextualWordEmbsAug(model_path='bert-base-multilingual-uncased', aug_p=0.1)
text = "Bonjour, J'aimerais une attestation de l'employeur certifiant que je suis en CDI."
augmented_text = aug.augment(text)
print("Original:")
print(text)
print("Augmented Text:")
print(augmented_text)

I installed transformers and imported AutoTokenizer myself,
but I am still getting the same error.
Any ideas, please?
Thank you.

makcedward · 2020-10-08T02:22:15Z

Could you share your versions of Python, transformers and nlpaug?

rajae-Bens · 2020-10-08T06:29:28Z

Hi,

Python 3.6.9
nlpaug-1.0.1
transformers-3.3.1

makcedward · 2020-10-09T02:49:05Z

What about your PyTorch version? I suggest installing version 1.6.
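
For anyone answering the version questions above, here is a minimal sketch that collects all four versions in one go. The helper name `report_versions` is hypothetical, and the sketch assumes Python 3.8 or newer for `importlib.metadata`; on Python 3.6, as used in this thread, `pip list` gives the same information.

```python
import platform
from importlib import metadata  # standard library from Python 3.8 onward


def report_versions(packages=("nlpaug", "transformers", "torch")):
    """Return a dict mapping 'python' and each package name to a version string."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the interpreter running this code.
            versions[name] = "not installed"
    return versions


for name, version in report_versions().items():
    print(f"{name}: {version}")
```

Running this in the same notebook or interpreter that raises the error matters, because a different kernel may see a different set of installed packages.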

narayanacharya6 · 2020-10-21T23:47:52Z

Getting the same error for:
Python 3.6.9
nlpaug 1.0.1
transformers 3.4.0
torch 1.6.0+cu101

Code Snippet:

text = 'The quick brown fox jumps over the lazy dog'
augInsert = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', action="insert")

Error trace:

NameError                                 Traceback (most recent call last)
<ipython-input-11-6c148101e32b> in <module>()
      1 text = 'The quick brown fox jumps over the lazy dog'
----> 2 augInsert = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', action="insert")
      3 augSubstitute = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', action="substitute")
      4 augmented_text1 = augInsert.augment(text)
      5 augmented_text2 = augSubstitute.augment(text)

3 frames
/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in __init__(self, model_path, action, temperature, top_k, top_p, name, aug_min, aug_max, aug_p, stopwords, device, force_reload, optimize, stopwords_regex, verbose, silence)
     97         self.model = self.get_model(
     98             model_path=model_path, device=device, force_reload=force_reload, temperature=temperature, top_k=top_k,
---> 99             top_p=top_p, optimize=optimize, silence=silence)
    100         # Override stopwords
    101         if stopwords is not None and self.model_type in ['xlnet', 'roberta']:

/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in get_model(cls, model_path, device, force_reload, temperature, top_k, top_p, optimize, silence)
    423     def get_model(cls, model_path, device='cuda', force_reload=False, temperature=1.0, top_k=None, top_p=0.0,
    424                   optimize=None, silence=True):
--> 425         return init_context_word_embs_model(model_path, device, force_reload, temperature, top_k, top_p, optimize, silence)

/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in init_context_word_embs_model(model_path, device, force_reload, temperature, top_k, top_p, optimize, silence)
     31         model = nml.Roberta(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p, silence=silence)
     32     elif 'bert' in model_path.lower():
---> 33         model = nml.Bert(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p, silence=silence)
     34     elif 'xlnet' in model_path.lower():
     35         model = nml.XlNet(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p, optimize=optimize,

/usr/local/lib/python3.6/dist-packages/nlpaug/model/lang_models/bert.py in __init__(self, model_path, temperature, top_k, top_p, device, silence)
     30         self.model_path = model_path
     31 
---> 32         self.tokenizer = AutoTokenizer.from_pretrained(model_path)
     33         self.mask_id = self.token2id(self.MASK_TOKEN)
     34         self.pad_id = self.token2id(self.PAD_TOKEN)

NameError: name 'AutoTokenizer' is not defined

narayanacharya6 · 2020-10-22T00:03:16Z

If it helps, doing this in the same notebook works:

from transformers import AutoModelForMaskedLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

Not sure what the underlying problem could be.
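
One failure mode that would produce exactly this kind of trace is an optional import that fails silently. The sketch below is an assumption about the mechanism, not something confirmed in this thread: if a module wraps its import in `try`/`except ImportError: pass` and the import fails when the module is first loaded, the name is never bound, and the error only appears later as a `NameError` at the point of use. The package name `hypothetical_missing_package` and the function `load_tokenizer` are made up for illustration.

```python
# Guarded import: the failure is swallowed, so nothing is reported here.
try:
    from hypothetical_missing_package import AutoTokenizer
except ImportError:
    pass


def load_tokenizer(model_path):
    # AutoTokenizer was never bound, so this lookup fails at call time.
    return AutoTokenizer.from_pretrained(model_path)


message = None
try:
    load_tokenizer("bert-base-uncased")
except NameError as err:
    # The original ImportError is gone; only the NameError remains.
    message = str(err)

print(message)
```

Under this assumption, importing `AutoTokenizer` in the notebook afterwards does not help, because the name is bound in the notebook's namespace and not in the module that failed to import it.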

makcedward · 2020-10-24T06:20:17Z

@narayanacharya6
I cannot reproduce this issue. I changed the library import statements; please check whether that helps. The change has not been pushed to pip yet and is only available on the master branch.
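
The commit itself is not shown in this thread, so the following is not the actual change. It is only a sketch of one common way to make an import failure visible immediately, instead of surfacing later as a `NameError`. The helper `require` and its parameters are hypothetical.

```python
import importlib


def require(module_name, purpose):
    """Import module_name, or raise an ImportError that says what needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{module_name} is required for {purpose}; "
            "install it and restart the Python session"
        ) from err
```

A caller would use it as `transformers = require("transformers", "ContextualWordEmbsAug")`, so a missing or broken installation is reported at the point of import with the name of the feature that needs it.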

narayanacharya6 · 2020-10-24T18:29:04Z

Probably something off with Google Colab. I tried to run it locally on my machine and it works just fine.
I think this issue can be closed.
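
For readers who hit the same error in a hosted notebook, a small diagnostic can show whether a module that is already loaded actually has the name it needs. This is a sketch with a hypothetical helper, `missing_names`; the module path `nlpaug.model.lang_models.bert` is taken from the traceback above.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that are not attributes of the module."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


# Example call (requires nlpaug to be installed):
# missing_names("nlpaug.model.lang_models.bert", ["AutoTokenizer"])
```

If the call returns `["AutoTokenizer"]`, the module was loaded without that name. One possible explanation, which is an assumption and was not verified in this thread, is that transformers was installed after nlpaug had already been imported in the session. Restarting the runtime and importing again would then be worth trying.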

makcedward added a commit that referenced this issue Oct 24, 2020

re-import transformers library for #161

f1b89c8

makcedward closed this as completed Oct 24, 2020

makcedward mentioned this issue Oct 24, 2020

unbale to use Xlnet model for augment ! #164

Closed
