Ran into crashes when testing LLM.int8() from transformers #18
Comments
Hi @changlan, thanks for your message!
@changlan @younesbelkada - the google versions of T5 (a la google/t5-v1_1_...) are not designed to be runnable as-is, unlike t5-large, etc.
Thanks all! It turns out that it was the GPU (V100) that is not compatible. It works after I use a T4 or A100.
It also seems that int8 **increased** the inference latency. Is this expected?
> On Sat, Aug 20, 2022 at 4:29 PM Less Wright wrote:
>
> @changlan @younesbelkada - the google versions of T5 (a la google/t5-v1_1_...) are not designed to be runnable as-is, unlike t5-large, etc.
> The v1.1 versions are set up to be used as better starting points for fine-tuning, but have no actual task training, unlike the original t5s.
> Thus, beyond your specific error, I would not recommend trying to do anything directly with these except using them as better starting points for task fine-tuning. (I used it for making a grammar checker as an example.)
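The V100 vs. T4/A100 split above lines up with CUDA compute capabilities: bitsandbytes' int8 matmul kernels historically required Turing-class hardware (compute capability 7.5) or newer, while the V100 is 7.0. A minimal plain-Python sketch of that check follows; the 7.5 cutoff and the device table are assumptions drawn from this thread, and in a real script the `(major, minor)` pair would come from `torch.cuda.get_device_capability()`:

```python
# Hedged sketch: decide whether a GPU's compute capability is new enough
# for LLM.int8() matmul kernels. The 7.5 (Turing) cutoff is an assumption
# based on this thread (V100 crashes; T4/A100 work). In practice the
# (major, minor) pair comes from torch.cuda.get_device_capability().

INT8_MIN_CAPABILITY = (7, 5)  # assumed Turing-or-newer requirement

def supports_llm_int8(major: int, minor: int) -> bool:
    """Return True if the (major, minor) compute capability meets the cutoff."""
    return (major, minor) >= INT8_MIN_CAPABILITY

# Well-known capabilities for the GPUs mentioned in the thread.
gpus = {"V100": (7, 0), "T4": (7, 5), "A100": (8, 0)}

for name, cap in gpus.items():
    print(f"{name}: int8 supported = {supports_llm_int8(*cap)}")
```

Running a guard like this before loading a model with `load_in_8bit=True` gives a clearer failure message than a kernel crash.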
good to hear! (for reference, T5 was trained in BFloat16, so that's likely why... if you are on AWS, try G5s - those are A10s, BFloat16-compatible, and work nicely). Re: increased inference latency - there's a whole separate thread on this. There are some recent improvements and more coming.
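The BFloat16 point can be checked the same way: native bf16 support arrived with Ampere-class GPUs (compute capability 8.0+), which is why A10/A100 handle bf16-trained T5 checkpoints well. In practice you would just call `torch.cuda.is_bf16_supported()`; the sketch below mirrors that check in plain Python, with the 8.0 cutoff and the per-device capabilities stated as assumptions:

```python
# Hedged sketch: native BFloat16 support is assumed to require Ampere
# (compute capability 8.0+). With PyTorch installed, the real check is
# simply torch.cuda.is_bf16_supported().

BF16_MIN_CAPABILITY = (8, 0)  # assumed Ampere-or-newer requirement

def supports_bf16(major: int, minor: int) -> bool:
    """Return True if the compute capability implies native bf16 support."""
    return (major, minor) >= BF16_MIN_CAPABILITY

# The A10 (the GPU in AWS G5 instances) is compute capability 8.6;
# the V100 from earlier in the thread is 7.0.
print(supports_bf16(8, 6))  # A10  -> True
print(supports_bf16(7, 0))  # V100 -> False
```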
Hi, I was testing LLM.int8() on the LongT5 model, but I consistently ran into the following errors:
Sample script to reproduce: