
Error when passing encoder_outputs as tuple to EncoderDecoder models #15536

Closed
jsnfly opened this issue Feb 6, 2022 · 7 comments · Fixed by #16814

Comments

@jsnfly
Contributor

jsnfly commented Feb 6, 2022

Environment info

  • transformers version: 4.17.0.dev0
  • Platform: Linux-5.13.0-27-generic-x86_64-with-glibc2.34
  • Python version: 3.9.7
  • PyTorch version (GPU?): 1.10.1+cu102 (True)
  • Tensorflow version (GPU?): 2.7.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.3.6 (cpu)
  • Jax version: 0.2.26
  • JaxLib version: 0.1.75
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help

@patrickvonplaten

Information

In EncoderDecoder models one can pass encoder_outputs as a tuple of Tensors. However, if you do that, this line will fail with

AttributeError: 'tuple' object has no attribute 'last_hidden_state'

since the tuple isn't converted into a ModelOutput in the forward method.
So if encoder_outputs is a tuple, it could maybe be wrapped in a ModelOutput class or something similar, or the tuple could be handled explicitly.
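For illustration, here is a minimal, self-contained sketch of the user-side workaround: wrap the tuple in a ModelOutput yourself before calling the model. The checkpoints and tensors below are just placeholders, not from the original report.

```python
# Hypothetical sketch: reproduce the issue and work around it by wrapping the
# encoder tuple in a ModelOutput before passing it to the EncoderDecoderModel.
from transformers import BertTokenizer, EncoderDecoderModel
from transformers.modeling_outputs import BaseModelOutput

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

inputs = tokenizer("some input text", return_tensors="pt")
encoder_tuple = model.encoder(**inputs, return_dict=False)  # a plain tuple

# Passing `encoder_tuple` directly raises
# AttributeError: 'tuple' object has no attribute 'last_hidden_state'.
# Wrapping it avoids the error:
encoder_outputs = BaseModelOutput(last_hidden_state=encoder_tuple[0])
outputs = model(
    encoder_outputs=encoder_outputs,
    decoder_input_ids=inputs.input_ids,
)
print(outputs.logits.shape)
```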

On a slight tangent

I made a SpeechEncoderDecoderModel for the robust speech challenge: https://huggingface.co/jsnfly/wav2vec2-large-xlsr-53-german-gpt2. I found that adding the position embeddings of the decoder model to the outputs of the encoder model improved performance significantly (basically didn't work without it).
This needs small modifications to the __init__ and forward methods of the SpeechEncoderDecoderModel.

At the moment this seems to me like too much of a "hack" to add to the SpeechEncoderDecoderModel class in general (for example via a flag), because it may differ for different decoder models and probably also needs more verification. @patrickvonplaten showed some interest in including this in Transformers nonetheless. What do you think?
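To make the idea above concrete, here is a rough, hypothetical sketch (not the actual modification used in the linked model). It assumes a GPT-2 style decoder whose position-embedding table is `decoder.transformer.wpe`, and that the encoder hidden states have already been projected to the decoder's hidden size:

```python
# Hypothetical sketch: add the decoder's learned position embeddings to the
# (already projected) encoder hidden states before they are used for cross-attention.
import torch

def add_decoder_position_embeddings(
    encoder_hidden_states: torch.Tensor,          # (batch, seq_len, hidden)
    position_embedding_table: torch.nn.Embedding,
) -> torch.Tensor:
    seq_len = encoder_hidden_states.size(1)
    # Note: truncation or interpolation would be needed if seq_len exceeded the table size.
    position_ids = torch.arange(seq_len, device=encoder_hidden_states.device)
    position_embeds = position_embedding_table(position_ids)      # (seq_len, hidden)
    return encoder_hidden_states + position_embeds.unsqueeze(0)   # broadcast over batch

# Toy usage with random tensors (hidden size 768, as in GPT-2):
hidden = torch.randn(2, 50, 768)
wpe = torch.nn.Embedding(1024, 768)
print(add_decoder_position_embeddings(hidden, wpe).shape)  # torch.Size([2, 50, 768])
```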

@patrickvonplaten
Contributor

patrickvonplaten commented Apr 6, 2022

Hey @jsnfly,

Regarding the first point - agree, it'd be good to check if the input is a tuple and if it is we can wrap it into a ModelOutput object. Would you be interested in opening a PR for this? :-)

Regarding the 2nd point - that's very interesting (cc @sanchit-gandhi). Also makes a lot of sense since ASR by itself is monotonic so knowing the order of words to transcribe together with the encoder speech frames seems like a sensible design architecture. Thanks a lot for sharing this here!

@sanchit-gandhi
Contributor

sanchit-gandhi commented Apr 6, 2022

The embedding hack is a really neat find - nice one @jsnfly! It's something we're going to look into in our ASR experiments! It seems like it could help with alignment in a much cleaner and more compact way than the encoder-decoder cross-attention mechanism.

@jsnfly
Contributor Author

jsnfly commented Apr 18, 2022

Regarding the first point - agree, it'd be good to check if the input is a tuple and if it is we can wrap it into a ModelOutput object. Would you be interested in opening a PR for this? :-)

I have opened one - feel free to take a look.

Regarding the 2nd point - that's very interesting (cc @sanchit-gandhi). Also makes a lot of sense since ASR by itself is monotonic so knowing the order of words to transcribe together with the encoder speech frames seems like a sensible design architecture. Thanks a lot for sharing this here!

The embedding hack is a really neat find - nice one @jsnfly! It's something we're going to look into in our ASR experiments! It seems like it could help with alignment in a much cleaner and more compact way than the encoder-decoder cross-attention mechanism.

Thanks for your feedback :) I will also try to experiment with this a bit more and let you know if I get some more results.

@espoirMur

espoirMur commented Jun 25, 2022

@jsnfly, thank you for this PR. Is it possible to do this fix for a T5 model as well? It is also a sequence-to-sequence model, and sometimes we may want to pass a tuple to the decoder.

If you guys don't see any issue with it, I can do that.

For context, I am playing with Fusion-in-Decoder (FiD), which is a variant of the T5 model.

The encoder output is a tuple containing the hidden states of all encoder blocks concatenated into one tensor, but the code fails because it is handed a tuple.

I am going to apply this fix to the T5 model locally and see how it behaves.

@patrickvonplaten, let me know what you think.
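For reference, a minimal sketch of the same wrapping idea applied to a plain T5 model; the checkpoint and tensors below are illustrative, not FiD's actual code:

```python
# Hypothetical sketch: hand pre-computed encoder hidden states to T5 wrapped in a
# ModelOutput instead of a raw tuple.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello world", return_tensors="pt")
encoder_states = model.encoder(**inputs).last_hidden_state  # (1, seq_len, d_model)

encoder_outputs = BaseModelOutput(last_hidden_state=encoder_states)
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
outputs = model(encoder_outputs=encoder_outputs, decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)
```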

@ZIZUN

ZIZUN commented Jul 31, 2022

@espoirMur
FiD requires transformers 3.0.2, and that version's model outputs are plain tuples.
You can fix this issue by adding the argument return_dict=False to the model call (on the latest transformers version), as follows:
train_loss = model(input_ids=context_ids.cuda(), attention_mask=context_mask.cuda(), labels=labels.cuda(), return_dict=False)[0]

@Baxkiller

Thanks for your response; it helps a lot!

@Tan-Hexiang

Thanks! This helps me a lot!
