Cannot load it with T5 - RTX 5000, Cuda 11.3 #16

Closed
Oxi84 opened this issue Aug 18, 2022 · 10 comments
Labels
bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

@Oxi84

Oxi84 commented Aug 18, 2022

When I try:

from transformers import T5ForConditionalGeneration, T5Tokenizer, T5TokenizerFast
model2 = T5ForConditionalGeneration.from_pretrained("3b_m1", device_map="auto", load_in_8bit=True)

I get:

TypeError: __init__() got an unexpected keyword argument 'load_in_8bit'

EDIT: This error stopped appearing after I restarted the kernel, but now I get this error:

#######################

/opt/conda/lib/python3.7/site-packages/bitsandbytes/functional.py in get_colrow_absmax(A, row_stats, col_stats, nnz_block_ptr, threshold)

1494 prev_device = pre_call(A.device)
1495 is_on_gpu([A, row_stats, col_stats, nnz_block_ptr])
-> 1496 lib.cget_col_row_stats(ptrA, ptrRowStats, ptrColStats, ptrNnzrows, ct.c_float(threshold), rows, cols)
1497 post_call(prev_device)
1498

/opt/conda/lib/python3.7/ctypes/__init__.py in __getattr__(self, name)
375 if name.startswith('__') and name.endswith('__'):
376 raise AttributeError(name)
--> 377 func = self.__getitem__(name)
378 setattr(self, name, func)
379 return func

/opt/conda/lib/python3.7/ctypes/__init__.py in __getitem__(self, name_or_ordinal)
380
381 def __getitem__(self, name_or_ordinal):
--> 382 func = self._FuncPtr((name_or_ordinal, self))
383 if not isinstance(name_or_ordinal, int):
384 func.__name__ = name_or_ordinal

AttributeError: /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats

import transformers
print(transformers.__version__)
4.22.0.dev0

GPU: RTX 5000

!conda list | grep cudatoolkit
cudatoolkit 11.3.1

@fozziethebeat
Contributor

I'm having the same problem (can't find cget_col_row_stats) when using an A6000.

@Oxi84
Author

Oxi84 commented Aug 19, 2022

On other cards it works well?

@z80maniac

I have RTX 3060 and get the same error.

@fozziethebeat
Contributor

I tried upgrading CUDA to 11.6 (and PyTorch to match) and I still get the same error. Having looked at the code, I'm guessing some #define didn't get included in the shared library. At some point I'll try building the package myself and installing it.

@parastooAflaki

I have got the same issue. Any solutions?

@TimDettmers
Collaborator

TimDettmers commented Sep 5, 2022

Can you please provide the output of python -m bitsandbytes? It seems that your CUDA driver is not detected, so no GPU is visible to the bnb CUDA setup. As a result, the CPU library is loaded, and it does not have the functions that you are trying to use.

In a new version of bitsandbytes the error message is a bit more meaningful, but it would still be useful to figure out what happened in your case. I suspect it is the same error as in #17.
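To illustrate the failure mode described above, here is a minimal sketch of checking whether a loaded shared library exports a given symbol. It uses only the standard ctypes module; the helper name exports_symbol and the demo symbol names are made up for illustration and are not part of bitsandbytes. It assumes a POSIX system, where ctypes.CDLL(None) opens the current process.

```python
import ctypes


def exports_symbol(lib: ctypes.CDLL, name: str) -> bool:
    """Return True if the shared library `lib` exposes a symbol called `name`.

    ctypes looks symbols up lazily on attribute access and raises
    AttributeError when the lookup fails, which is the error shown in the
    traceback at the top of this issue. hasattr() turns that into a bool.
    """
    return hasattr(lib, name)


# Demo against the current process (POSIX only), which includes libc.
process = ctypes.CDLL(None)
print("strlen exported:", exports_symbol(process, "strlen"))
# Hypothetical name, chosen so that it does not exist anywhere.
print("missing exported:", exports_symbol(process, "hypothetical_missing_symbol_for_demo"))
```

In principle the same check could be pointed at the .so path from the traceback, for example ctypes.CDLL("/opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cpu.so"), to see whether cget_col_row_stats is present in the binary that actually got loaded.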

TimDettmers added the bug and documentation labels Sep 5, 2022
@fozziethebeat
Contributor

After some fixes, my situation is a bit confusing. I'm running Jupyter in a Docker container. When I run python -m bitsandbytes in a Jupyter shell, I get the following:

CUDA SETUP: CUDA runtime path found: /opt/conda/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++ DEBUG INFORMATION +++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++ POTENTIALLY LIBRARY-PATH-LIKE ENV VARS ++++++++++
'CONDA_EXE': '/opt/conda/bin/conda'
'VIRTUAL_PATH': '/jupyter'
'SUDO_COMMAND': '/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80'
'JULIA_PKGDIR': '/opt/julia'
'GSETTINGS_SCHEMA_DIR': '/opt/conda/share/glib-2.0/schemas'
'CONDA_PREFIX': '/opt/conda'
'JUPYTER_SERVER_URL': 'http://51b96df1ea41:80/jupyter/'
'RSTUDIO_WHICH_R': '/opt/conda/bin/R'
'XDG_CACHE_HOME': '/home/jovyan/.cache'
'JUPYTER_SERVER_ROOT': '/home/jovyan'
'PYTHONPATH': '/usr/local/spark/python/lib/py4j-0.10.9.3-src.zip:/usr/local/spark/python:'
'CONDA_DIR': '/opt/conda'
'SPARK_HOME': '/usr/local/spark'
'JULIA_DEPOT_PATH': '/opt/julia'
'CONDA_PYTHON_EXE': '/opt/conda/bin/python'
'SPARK_CONF_DIR': '/usr/local/spark/conf'
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WARNING: Please be sure to sanitize sensible info from any such env vars!

++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = True
COMPUTE_CAPABILITIES_PER_GPU = ['8.6']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable

SUCCESS!
Installation was successful!

But when I run the same command in a notebook, I get the following:

WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/jupyter')}
WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80')}
WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//matplotlib_inline.backend_inline')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/opt/conda/lib/python3.10/site-packages/bitsandbytes/cextension.py:48: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn(
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++ DEBUG INFORMATION +++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++ POTENTIALLY LIBRARY-PATH-LIKE ENV VARS ++++++++++
'VIRTUAL_PATH': '/jupyter'
'SUDO_COMMAND': '/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80'
'JULIA_PKGDIR': '/opt/julia'
'XDG_CACHE_HOME': '/home/jovyan/.cache'
'PYTHONPATH': '/usr/local/spark/python/lib/py4j-0.10.9.3-src.zip:/usr/local/spark/python:'
'CONDA_DIR': '/opt/conda'
'SPARK_HOME': '/usr/local/spark'
'JULIA_DEPOT_PATH': '/opt/julia'
'MPLBACKEND': 'module://matplotlib_inline.backend_inline'
'SPARK_CONF_DIR': '/usr/local/spark/conf'
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WARNING: Please be sure to sanitize sensible info from any such env vars!

++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = False
COMPUTE_CAPABILITIES_PER_GPU = ['8.6']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable

name 'str2optimizer32bit' is not defined

I'm still trying to debug why the notebook version isn't picking up the same libraries as the shell version.
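One visible difference between the two dumps above is that CONDA_PREFIX appears only in the shell output. A rough way to compare the two environments is to run the same small search for libcudart.so in both. The sketch below is only a diagnostic under assumptions: the function names are made up, the list of directories ($CONDA_PREFIX/lib, LD_LIBRARY_PATH entries, /usr/local/cuda/lib64) is my guess at reasonable places to look, and it is not how bitsandbytes itself performs the search.

```python
import os
from pathlib import Path


def candidate_dirs(env, extra_dirs=("/usr/local/cuda/lib64",)):
    """Collect directories that might contain libcudart (assumed list)."""
    dirs = []
    prefix = env.get("CONDA_PREFIX")
    if prefix:
        dirs.append(Path(prefix) / "lib")
    for part in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if part:
            dirs.append(Path(part))
    dirs.extend(Path(d) for d in extra_dirs)
    return dirs


def find_libcudart(env=None, extra_dirs=("/usr/local/cuda/lib64",)):
    """Return sorted paths of libcudart.so* files found in the candidate dirs."""
    env = dict(os.environ) if env is None else env
    hits = []
    for directory in candidate_dirs(env, extra_dirs):
        if directory.is_dir():
            hits.extend(str(p) for p in directory.glob("libcudart.so*"))
    return sorted(hits)


if __name__ == "__main__":
    # Run this once in the shell and once in a notebook cell, then compare.
    print("CONDA_PREFIX =", os.environ.get("CONDA_PREFIX"))
    for path in find_libcudart():
        print(path)
```

If the shell prints a path under /opt/conda/lib and the notebook prints nothing, the notebook kernel is probably being started without the environment variables that the shell has.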

@fozziethebeat
Contributor

fozziethebeat commented Sep 6, 2022

I managed to fix my personal situation. I'm pretty sure it's something weird with how I'm custom-building my GPU-enabled Jupyter image. For reasons unknown to me, I have libcudart libraries installed in two places:

/opt/conda/lib/libcudart.so -> libcudart.so.11.7.60

And

/usr/local/cuda/lib64/libcudart.so.11.0 -> libcudart.so.11.6.55

I made a symlink from

/usr/local/cuda/lib64/libcudart.so -> libcudart.so.11.6.55
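For anyone who wants to script the symlink step, here is a minimal sketch in Python. The helper name ensure_symlink is made up for illustration, and the paths in the usage comment are the ones from my image; yours may differ, and writing under /usr/local usually needs root.

```python
from pathlib import Path


def ensure_symlink(link: Path, target_name: str) -> bool:
    """Create `link` pointing at `target_name` in the same directory.

    Returns True if the link was created, False if something already
    exists at `link` (including a dangling symlink), so nothing is
    overwritten.
    """
    if link.exists() or link.is_symlink():
        return False
    if not (link.parent / target_name).exists():
        raise FileNotFoundError(f"{target_name} not found in {link.parent}")
    # Relative link, like `ln -s libcudart.so.11.6.55 libcudart.so`.
    link.symlink_to(target_name)
    return True


# Usage with the paths from this comment (assumption: same layout as mine):
# ensure_symlink(Path("/usr/local/cuda/lib64/libcudart.so"), "libcudart.so.11.6.55")
```

The existence check is there so that rerunning the image build does not fail or clobber a link that is already in place.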

Then I had to make a few changes. Looking through the transformers history, I made sure to install the right version with

pip install transformers==4.21.3

Then I changed the original Jupyter notebook from Google Colab to load the pipeline with the following lines:

from transformers import pipeline

# `name` and `max_new_tokens` are assumed to be defined earlier in the notebook.
pipe = pipeline(model=name,
                load_in_8bit=True,
                model_kwargs={"device_map": "auto"},
                max_new_tokens=max_new_tokens)

Leaving load_in_8bit inside model_kwargs broke due to some change deep in transformers, so it is now passed as a top-level argument.

@z80maniac

"It seems that your CUDA driver is not detected"

Yes, after I installed the CUDA Toolkit the error went away (in my case). Thank you!

@TimDettmers
Collaborator

I believe this is fixed in the latest version. It prints instructions on how to debug the situation and alternatively prints out compilation instructions which should fix the issue.
