RuntimeError: mat1 and mat2 shapes cannot be multiplied (164x4096 and 1x8388608) #228

Open
adaaaaaa opened this issue Jun 13, 2023 · 3 comments

Comments

@adaaaaaa

python generate_4bit.py --model_path decapoda-research/llama-7b-hf --lora_path Facico/Chinese-Vicuna-lora-7b-3epoch-belle-and-guanaco --use_local 0
The command above fails with the following error:

/home/nano/.local/lib/python3.10/site-packages/gradio/inputs.py:27: UserWarning: Usage of gradio.inputs is deprecated, and will not be supported in the future, please import your component from gradio.components
warnings.warn(
/home/nano/.local/lib/python3.10/site-packages/gradio/inputs.py:30: UserWarning: optional parameter is deprecated, and it has no effect
super().__init__(
/home/nano/.local/lib/python3.10/site-packages/gradio/inputs.py:30: UserWarning: numeric parameter is deprecated, and it has no effect
super().__init__(
Running on local URL: http://127.0.0.1:7860
Traceback (most recent call last):
File "/home/nano/.local/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
output = await app.get_blocks().process_api(
File "/home/nano/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1346, in process_api
result = await self.call_function(
File "/home/nano/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1090, in call_function
prediction = await utils.async_iteration(iterator)
File "/home/nano/.local/lib/python3.10/site-packages/gradio/utils.py", line 341, in async_iteration
return await iterator.__anext__()
File "/home/nano/.local/lib/python3.10/site-packages/gradio/interface.py", line 633, in fn
async for output in iterator:
File "/home/nano/.local/lib/python3.10/site-packages/gradio/utils.py", line 334, in __anext__
return await anyio.to_thread.run_sync(
File "/home/nano/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/nano/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/nano/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/nano/.local/lib/python3.10/site-packages/gradio/utils.py", line 317, in run_sync_iterator_async
return next(iterator)
File "/data/Chinese-Vicuna/generate_4bit.py", line 152, in evaluate
for generation_output in model.stream_generate(
File "/data/Chinese-Vicuna/utils.py", line 657, in stream_beam_search
outputs = self(
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/peft/peft_model.py", line 678, in forward
return self.base_model(
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 688, in forward
outputs = self.model(
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 578, in forward
layer_outputs = decoder_layer(
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 194, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "/home/nano/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nano/.local/lib/python3.10/site-packages/peft/tuners/lora.py", line 565, in forward
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (164x4096 and 1x8388608)
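
One possible reading of the shapes in this error, offered as a sketch rather than a confirmed diagnosis: the second operand has 8388608 elements, which is exactly half of 4096 x 4096. That would be consistent with a 4-bit quantized weight stored in packed form (two 4-bit values per byte) being passed to `F.linear` as if it were an ordinary dense weight. The check below only verifies the arithmetic. The helper name `matches_packed_4bit` is hypothetical, and the square 4096 x 4096 shape of the `q_proj` weight is an assumption based on the 4096 input width in the message.

```python
def matches_packed_4bit(in_features: int, out_features: int, stored_elems: int) -> bool:
    """Return True if stored_elems equals the element count of an
    (out_features x in_features) weight packed two 4-bit values per byte.

    Hypothetical helper for illustration; it is not part of peft,
    bitsandbytes, or transformers.
    """
    full_elems = in_features * out_features
    return full_elems % 2 == 0 and full_elems // 2 == stored_elems


# Numbers taken from the error message:
#   mat1 is 164x4096, mat2 is 1x8388608
hidden_size = 4096        # input width seen by q_proj
packed_elems = 8_388_608  # element count of the weight actually passed in

# Assumption: the dense q_proj weight would be hidden_size x hidden_size.
print(hidden_size * hidden_size)  # 16777216
print(matches_packed_4bit(hidden_size, hidden_size, packed_elems))  # True
```

If that reading is right, the LoRA wrapper in `peft/tuners/lora.py` is taking its plain linear code path on a quantized layer, which may point to a version mismatch between the installed packages. The thread does not confirm this.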

@LGAG

LGAG commented Jun 21, 2023

I ran into the same problem, but in my case it occurred when running finetune_4bit.py.

@Facico
Owner

Facico commented Jun 29, 2023

@18065013

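I see the same problem on a dual-3090 setup during normal (non-4-bit) fine-tuning. My two 3090s are from different vendors, though. Any help would be appreciated.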
