ValueError: Unrecognized configuration class <class 'transformers_modules.configuration_deepseek.DeepseekV3Config'> to build an AutoTokenizer.
Has anyone encountered a similar problem with the tokenizer? How do you deal with it? I am running:
Singularity> python3 /work/L/huggingface.co/deepseek-ai/DeepSeek-V3/resolve/main/inference/generate.py --ckpt-path /work/L/huggingface.co/deepseek-ai/DeepSeek-V3/resolve/main/ --config /work/L/huggingface.co/deepseek-ai/DeepSeek-V3/resolve/main/inference/configs/config_16B.json --input-file /work/L/input.txt
with trust_remote_code enabled.
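One thing worth checking (a minimal sanity check on my part, not a confirmed fix): whether the tokenizer loads on its own when pointed at the checkpoint directory. Assuming the directory contains the repo's tokenizer.json and tokenizer_config.json, AutoTokenizer should be able to build a tokenizer from them directly:
# Standalone tokenizer check: load only the tokenizer from the local
# checkpoint directory, allowing the repo's custom code to run.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/work/L/huggingface.co/deepseek-ai/DeepSeek-V3/resolve/main/",
    trust_remote_code=True,
)
print(tokenizer("Hello, world!")["input_ids"])
If this raises the same ValueError, the problem is in the tokenizer mapping itself rather than in generate.py.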
Hello, I'm having the same issue on a macOS M2 machine with Python 3.9.10 (installed through pyenv), using the following dependencies (which include TensorFlow):
absl-py==2.1.0
astunparse==1.6.3
certifi==2024.12.14
charset-normalizer==3.4.1
filelock==3.17.0
flatbuffers==25.1.24
fsspec==2024.12.0
gast==0.6.0
google-pasta==0.2.0
grpcio==1.70.0
h5py==3.12.1
huggingface-hub==0.27.1
idna==3.10
importlib_metadata==8.6.1
keras==3.8.0
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
numpy==2.0.2
opt_einsum==3.4.0
optree==0.14.0
packaging==24.2
protobuf==5.29.3
Pygments==2.19.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
safetensors==0.5.2
six==1.17.0
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.37.1
termcolor==2.5.0
tokenizers==0.21.0
tqdm==4.67.1
transformers==4.48.1
typing_extensions==4.12.2
urllib3==2.3.0
Werkzeug==3.1.3
wrapt==1.17.2
zipp==3.21.0
My program has the following (copy-pasted) code:
# Use a pipeline as a high-level helper
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1", trust_remote_code=True)
pipe(messages)
And here is the exact error message:
Traceback (most recent call last):
File "/Users/USER/projects/misc/deepseek-testing/main.py", line 7, in <module>
pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1", trust_remote_code=True)
File "/Users/USER/projects/misc/deepseek-testing/venv/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 940, in pipeline
framework, model = infer_framework_load_model(
File "/Users/USER/projects/misc/deepseek-testing/venv/lib/python3.9/site-packages/transformers/pipelines/base.py", line 302, in infer_framework_load_model
raise ValueError(
ValueError: Could not load model deepseek-ai/DeepSeek-R1 with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForCausalLM'>,). See the original errors:
while loading with TFAutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
File "/Users/USER/projects/misc/deepseek-testing/venv/lib/python3.9/site-packages/transformers/pipelines/base.py", line 289, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
File "/Users/USER/projects/misc/deepseek-testing/venv/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.deepseek-ai.DeepSeek-R1.5dde110d1a9ee857b90a6710b7138f9130ce6fa0.configuration_deepseek.DeepseekV3Config'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.
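Note that every class listed in that error is a TensorFlow one, which suggests transformers never tried a PyTorch model class at all; torch is indeed absent from my dependency list above. A quick way to confirm which backends transformers detects (these availability helpers live in transformers.utils):
# The pipeline only tries model classes for frameworks it can import, so if
# torch is missing it falls back to TF, which DeepSeek's remote code lacks.
from transformers.utils import is_tf_available, is_torch_available

print("PyTorch available:", is_torch_available())
print("TensorFlow available:", is_tf_available())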
This seems to be dependency-related: with only TensorFlow installed, the pipeline falls back to TFAutoModelForCausalLM, and the DeepSeek remote code doesn't register any TensorFlow model class. I'm going to test with PyTorch (instead of TensorFlow) and report back.
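For reference, here is a sketch of what that PyTorch test would look like. The torch_dtype and device_map arguments are my additions (device_map="auto" also needs accelerate installed), and the full R1 weights are far too large to actually run on a MacBook, so treat this only as a backend-selection test:
# With torch installed, transformers should pick the PyTorch backend, which
# is the only one the DeepSeek remote code actually implements.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # assumption: lower memory than float32
    device_map="auto",           # assumption: requires accelerate
)
print(pipe([{"role": "user", "content": "Who are you?"}]))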
Edit: After searching around, it might come from a combination of an outdated Python 3 version (the minimum supported is apparently 3.10) and macOS (the model's inference code only targets Linux/Windows right now). Though I'm unsure about this.
Second edit: I realize now that I posted about R1 in the V3 community. But I don't think this is V3-specific, so this should remain relevant.