Simple model fails to run with AttributeError: 'super' object has no attribute '_extract_past_from_model_output'
I'm running the following code:
import os
import time
import warnings
from huggingface_hub import configure_http_backend
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig
from tqdm.notebook import tqdm
# load the processor
model_path = "allenai/Molmo-7B-O-0924"
processor = AutoProcessor.from_pretrained(
# load the model
model = AutoModelForCausalLM.from_pretrained(
# prepare image and text prompt, using the appropriate prompt template
cropped_image_folder = './data/output_image_jpg_cropped/'
image_files = [cropped_image_folder + f for f in os.listdir(cropped_image_folder) if f.lower().endswith(('.jpg', '.jpeg'))]
for image_file in tqdm(image_files[:1]):
# Record the start time
start_time = time.time()
# process the image and text
inputs = processor.process(
text="Describe this image."
# move inputs to the correct device and make a batch of size 1
inputs = {k: for k, v in inputs.items()}
# generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer
)
It produces the following error message:
AttributeError Traceback (most recent call last)
Cell In[24], line 57
54 inputs = {k: for k, v in inputs.items()}
56 # generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
---> 57 output = model.generate_from_batch(
58 inputs,
59 GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
60 tokenizer=processor.tokenizer
61 )
77 # process the image and text
78 # inputs = processor.process(
79 # images=[],
(...) 158 # pd.set_option('display.max_colwidth', None)
159 # df
File c:\Users\613186\AppData\Local\anaconda3\envs\vlm_ocr_pipeline\Lib\site-packages\torch\utils\, in context_decorator.<locals>.decorate_context(*args, **kwargs)
113 @functools.wraps(func)
114 def decorate_context(*args, **kwargs):
115 with ctx_factory():
--> 116 return func(*args, **kwargs)
File ~\.cache\huggingface\modules\transformers_modules\allenai\Molmo-7B-O-0924\0e727957abd46f3ef741ddbda3452db1df873a6e\, in MolmoForCausalLM.generate_from_batch(self, batch, generation_config, **kwargs)
2209 if attention_mask is not None:
2210 assert attention_mask.shape == (batch_size, mask_len)
-> 2275 cache_name, cache = super()._extract_past_from_model_output(outputs)
2276 model_kwargs[cache_name] = cache
2277 model_kwargs["cache_position"] = model_kwargs["cache_position"][-1:] + num_new_tokens
AttributeError: 'super' object has no attribute '_extract_past_from_model_output'
Here's my virtual environment:
I also had this issue, and I think it's a problem with huggingface/transformers. Release 4.49.0 removes _extract_past_from_model_output() from src/transformers/generation/; see the diff between 4.48.3 and 4.49.0. Looks like this was the specific commit.
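One way to confirm this diagnosis locally is to check whether the installed library still exposes the helper. The snippet below is only a rough probe and was not part of the original report: the function name has_attribute is hypothetical, and it assumes GenerationMixin can be imported from transformers.generation.utils. It returns None when the module cannot be imported at all.

```python
import importlib


def has_attribute(module_name, class_name, attr_name):
    """Return True/False if the class has the attribute, or None if the lookup is impossible."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package (or one of its dependencies) is not installed.
        return None
    cls = getattr(module, class_name, None)
    if cls is None:
        # The module exists but does not define the class we expected.
        return None
    return hasattr(cls, attr_name)


if __name__ == "__main__":
    # Assumption: GenerationMixin is defined in transformers.generation.utils.
    present = has_attribute(
        "transformers.generation.utils",
        "GenerationMixin",
        "_extract_past_from_model_output",
    )
    print("helper present:", present)
```

Going by the behaviour reported in this thread, this should print True on 4.48.3 and False on 4.49.0.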
Potential fix
For now, you can work around the issue by pinning huggingface/transformers to a release from before the removal, such as 4.48.3. To reproduce, we can use the example from the model card:
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig
from PIL import Image
import requests
# load the processor
processor = AutoProcessor.from_pretrained(
# load the model
model = AutoModelForCausalLM.from_pretrained(
# process the image and text
inputs = processor.process(
images=["", stream=True).raw)],
text="Describe this image."
# move inputs to the correct device and make a batch of size 1
inputs = {k: for k, v in inputs.items()}
# generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>", do_sample=False),
    tokenizer=processor.tokenizer
)
# only get generated tokens; decode them to text
generated_tokens = output[0,inputs['input_ids'].size(1):]
generated_text = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True)
# print the generated text
print(generated_text)
Broken version
Here's a Pipfile to reproduce the error:
[[source]]
url = ""
verify_ssl = true
name = "pypi"

[packages]
torch = "*"
torchvision = "*"
torchaudio = "*"
transformers = "==4.49.0"
einops = "*"
accelerate = "*"

[requires]
python_version = "3.10"
$ python
Loading checkpoint shards: 100%|████████████████████████████| 7/7 [04:52<00:00, 41.75s/it]
Traceback (most recent call last):
File "/home/kgarg0/projects/testing-molmo-broken/", line 31, in <module>
output = model.generate_from_batch(
File "/home/kgarg0/.local/share/virtualenvs/testing-molmo-broken-msdNl6Us/lib/python3.10/site-packages/torch/utils/", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/kgarg0/.cache/huggingface/modules/transformers_modules/allenai/Molmo-7B-D-0924/1721478b71306fb7dc671176d5c204dc7a4d27d7/", line 2212, in generate_from_batch
out = super().generate(
File "/home/kgarg0/.local/share/virtualenvs/testing-molmo-broken-msdNl6Us/lib/python3.10/site-packages/torch/utils/", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/kgarg0/.local/share/virtualenvs/testing-molmo-broken-msdNl6Us/lib/python3.10/site-packages/transformers/generation/", line 2223, in generate
result = self._sample(
File "/home/kgarg0/.local/share/virtualenvs/testing-molmo-broken-msdNl6Us/lib/python3.10/site-packages/transformers/generation/", line 3217, in _sample
model_kwargs = self._update_model_kwargs_for_generation(
File "/home/kgarg0/.cache/huggingface/modules/transformers_modules/allenai/Molmo-7B-D-0924/1721478b71306fb7dc671176d5c204dc7a4d27d7/", line 2275, in _update_model_kwargs_for_generation
cache_name, cache = super()._extract_past_from_model_output(outputs)
AttributeError: 'super' object has no attribute '_extract_past_from_model_output'
Fixed version
To fix, change transformers = "==4.49.0" to transformers = "==4.48.3" in the Pipfile and reinstall.
$ python
Loading checkpoint shards: 100%|████████████████████████████| 7/7 [04:52<00:00, 41.75s/it]
This image captures a young black Labrador puppy, likely around six months old, sitting on a weathered wooden deck. The puppy's sleek, short fur is entirely black, including its nose, eyes, and ears, which are slightly floppy. The dog is positioned in the center of the frame, looking up directly at the camera with a curious and attentive expression. Its front paws are visible, with one slightly tucked under its body, while its back paws are hidden from view. The wooden deck beneath the puppy is made of light brown planks with visible knots and signs of wear, adding a rustic charm to the scene. The overall composition is simple yet striking, with the puppy's glossy black coat contrasting beautifully against the light wooden background.
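If you want a script to fail early instead of partway through generation, a small version guard can help. This is a rough sketch and not something from the Molmo or transformers code: the helper names release_tuple, is_affected, and check_installed are hypothetical, and it assumes every release from 4.49.0 onward lacks the helper (this thread only confirms 4.49.0 itself).

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading numeric parts of a version, e.g. '4.49.0.dev0' -> (4, 49, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
        if len(parts) == 3:
            break
    # Pad short versions such as '4.49' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version_string, first_broken=(4, 49, 0)):
    """Assumption: first_broken and everything after it lack the removed helper."""
    return release_tuple(version_string) >= first_broken


def check_installed(package="transformers"):
    """Return (installed_version, affected), or None if the package is not installed."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return installed, is_affected(installed)


if __name__ == "__main__":
    print(check_installed())
```

Outside of pipenv, the same pin can be applied with pip by installing transformers==4.48.3.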
@kgarg0 thanks for your help! That fixed my problem!
