Issue with Loading 6b Model: ZeroDivisionError #11
Comments
Hi @huoliangyu! Thanks for your valuable feedback.
Excellent work!
I've encountered a specific issue when attempting to load the 6b model using the following command:
llm = LLM(model=model_name_or_dir, tensor_parallel_size=num_gpus)
where model_name_or_dir is a local path. Unfortunately, this resulted in a ZeroDivisionError. The detailed error message is as follows:
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 231, in from_engine_args
    engine = cls(*engine_configs,
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 113, in __init__
    self._init_cache()
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 193, in _init_cache
    num_blocks = self._run_workers(
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 698, in _run_workers
    all_outputs = ray.get(all_outputs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2547, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ZeroDivisionError): ray::RayWorker.execute_method() (pid=3707, ip=10.33.79.244, actor_id=71ea7c30788a6797b487792c01000000, repr=<vllm.engine.ray_utils.RayWorker object at 0x7fb32ad01030>)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/ray_utils.py", line 32, in execute_method
    return executor(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/worker/worker.py", line 127, in profile_num_available_blocks
    (total_gpu_memory * gpu_memory_utilization - peak_memory) //
ZeroDivisionError: float floor division by zero
However, when I switch to the 13b model, the model loads without any issues and it works normally. Any guidance or insights into this matter would be greatly appreciated. Thank you for your support and dedication to maintaining this project.
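For anyone trying to narrow this down: the last frame shows a float expression being floor-divided by something that evaluated to zero. The divisor is cut off in the traceback, so the sketch below only assumes it is some per-block size in bytes; `block_bytes` and `safe_num_blocks` are hypothetical names for illustration and not part of vLLM. It shows how a guard around that division could turn the bare ZeroDivisionError into a more descriptive message.

```python
def safe_num_blocks(total_gpu_memory: float,
                    gpu_memory_utilization: float,
                    peak_memory: float,
                    block_bytes: int) -> int:
    """Floor-divide the available memory by a per-block size.

    `block_bytes` is a hypothetical stand-in for the divisor that is
    truncated in the traceback; it is an assumption, not vLLM's code.
    """
    if block_bytes <= 0:
        # A zero divisor suggests the size was derived from a model or
        # parallelism setting that evaluated to 0 for this checkpoint.
        raise ValueError(
            f"per-block size is {block_bytes}; check the model config "
            "and tensor_parallel_size before profiling memory"
        )
    available = total_gpu_memory * gpu_memory_utilization - peak_memory
    # Clamp at zero so a negative budget does not yield a negative count.
    return max(int(available // block_bytes), 0)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the report.
    print(safe_num_blocks(16e9, 0.5, 6e9, 1_000_000))
```

With a guard like this, the 6b case would fail with a message pointing at the configuration rather than at the arithmetic, which may make it easier to compare against the 13b checkpoint that loads correctly.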