ValueError - Could not load Qwen3-Coder (FP8)
Running the standard text-generation pipeline example for Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8 in a notebook fails with the following traceback:
ValueError Traceback (most recent call last)
/tmp/ipykernel_36/2043839279.py in <cell line: 0>()
2 from transformers import pipeline
3
----> 4 pipe = pipeline("text-generation", model="Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8")
5 messages = [
6 {"role": "user", "content": "Who are you?"},
/usr/local/lib/python3.11/dist-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
1006 if isinstance(model, str) or framework is None:
1007 model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
-> 1008 framework, model = infer_framework_load_model(
1009 adapter_path if adapter_path is not None else model,
1010 model_classes=model_classes,
/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
330 for class_name, trace in all_traceback.items():
331 error += f"while loading with {class_name}, an error is thrown:\n{trace}\n"
--> 332 raise ValueError(
333 f"Could not load model {model} with any of the following classes: {class_tuple}. See the original errors:\n\n{error}\n"
334 )
ValueError: Could not load model Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8 with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForCausalLM'>, <class 'transformers.models.qwen3_moe.modeling_qwen3_moe.Qwen3MoeForCausalLM'>). See the original errors:
while loading with AutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 317, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 4892, in from_pretrained
hf_quantizer.validate_environment(
File "/usr/local/lib/python3.11/dist-packages/transformers/quantizers/quantizer_finegrained_fp8.py", line 54, in validate_environment
raise ValueError(
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 7.5
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
model = model_class.from_pretrained(model, **fp32_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 317, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 4892, in from_pretrained
hf_quantizer.validate_environment(
File "/usr/local/lib/python3.11/dist-packages/transformers/quantizers/quantizer_finegrained_fp8.py", line 54, in validate_environment
raise ValueError(
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 7.5
while loading with TFAutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/auto_factory.py", line 603, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_moe.configuration_qwen3_moe.Qwen3MoeConfig'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
model = model_class.from_pretrained(model, **fp32_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/auto_factory.py", line 603, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_moe.configuration_qwen3_moe.Qwen3MoeConfig'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.
while loading with Qwen3MoeForCausalLM, an error is thrown:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 317, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 4892, in from_pretrained
hf_quantizer.validate_environment(
File "/usr/local/lib/python3.11/dist-packages/transformers/quantizers/quantizer_finegrained_fp8.py", line 54, in validate_environment
raise ValueError(
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 7.5
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
model = model_class.from_pretrained(model, **fp32_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 317, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py", line 4892, in from_pretrained
hf_quantizer.validate_environment(
File "/usr/local/lib/python3.11/dist-packages/transformers/quantizers/quantizer_finegrained_fp8.py", line 54, in validate_environment
raise ValueError(
ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual = 7.5
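The repeated ValueError is the actual root cause: before loading, the finegrained FP8 quantizer in transformers validates the GPU's compute capability and refuses anything below 8.9 (Ada Lovelace / Hopper, e.g. RTX 4090 or H100). A compute capability of 7.5 corresponds to a Turing card such as a T4, so this FP8 checkpoint cannot be loaded there no matter which pipeline arguments are passed. A minimal sketch to confirm what the attached GPU reports, using the standard PyTorch call torch.cuda.get_device_capability:

import torch

# Print the compute capability of the attached GPU.
# The FP8 quantizer in transformers requires >= (8, 9),
# i.e. Ada Lovelace (e.g. RTX 4090, L4) or Hopper (H100) hardware.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
    print("FP8 checkpoint loadable here:", (major, minor) >= (8, 9))
else:
    print("No CUDA GPU visible to PyTorch.")

If hardware with compute capability >= 8.9 is not available, one option is to load a checkpoint that is not FP8-quantized. The sketch below assumes the unquantized repository id is Qwen/Qwen3-Coder-30B-A3B-Instruct and that enough GPU/CPU memory is available; a 30B-parameter MoE model still needs far more memory than a single 16 GB T4 provides.

from transformers import pipeline

# Hedged workaround sketch: use a non-FP8 checkpoint instead of the FP8 one.
# "Qwen/Qwen3-Coder-30B-A3B-Instruct" is assumed to be the unquantized repo id.
pipe = pipeline(
    "text-generation",
    model="Qwen/Qwen3-Coder-30B-A3B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
messages = [{"role": "user", "content": "Who are you?"}]
print(pipe(messages, max_new_tokens=64))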