Inference is failing after the model deployment

#6
by Keertiraj - opened

Hello Community,

I have deployed this model on an AWS SageMaker notebook, but inference is throwing an error. I want to execute the 'Quick Start' steps from the model card for 'DAMO-NLP-SG/VideoLLaMA3-2B-Image'. In the initial steps I ran into version and dependency problems, which I have since resolved. Here is the notebook:

https://colab.research.google.com/drive/1M4dd0Wsme9WDsCRMlHksM92T4gAxbXq7?usp=sharing

However, I am getting the error below:


```
Error                                     Traceback (most recent call last)
Cell In[10], line 1
----> 1 inputs = processor(conversation=conversation, return_tensors="pt")
      2 inputs = {k: v.cuda() if isinstance(v, torch.Tensor) else v for k, v in inputs.items()}
      3 if "pixel_values" in inputs:

File ~/.cache/huggingface/modules/transformers_modules/DAMO-NLP-SG/VideoLLaMA3-7B/a498675483e2be8e98d092a2cb11a608c2caa8dd/processing_videollama3.py:708, in Videollama3Qwen2Processor.__call__(self, text, conversation, images, return_labels, **kwargs)
    706 if text is not None:
    707     raise ValueError("You cannot provide 'message' with 'text'.")
--> 708 return self._process_conversation(conversation, images, return_labels, **kwargs)
    709 return self._process_plain(text, images, return_labels, **kwargs)

File ~/.cache/huggingface/modules/transformers_modules/DAMO-NLP-SG/VideoLLaMA3-7B/a498675483e2be8e98d092a2cb11a608c2caa8dd/processing_videollama3.py:614, in Videollama3Qwen2Processor._process_conversation(self, conversation, images, return_labels, **kwargs)
    611 assert isinstance(conversation, list), "Conversation must be a list of messages."
    613 if images is None:
--> 614     conversation = self._load_multimodal_data(conversation)
    615     images = self._gather_multimodal_data(conversation)
    617 output_kwargs = self._merge_kwargs(
    618     Videollama3Qwen2ProcessorKwargs,
    619     tokenizer_init_kwargs=self.tokenizer.init_kwargs,
    620     **kwargs,
    621 )

File ~/.cache/huggingface/modules/transformers_modules/DAMO-NLP-SG/VideoLLaMA3-7B/a498675483e2be8e98d092a2cb11a608c2caa8dd/processing_videollama3.py:473, in Videollama3Qwen2Processor._load_multimodal_data(self, conversation)
    471 if end_time < float("inf"):
    472     load_args["end_time"] = end_time
--> 473 images, timestamps = self.load_video(**load_args)
    475 for content, start_time, end_time in zip(contents, start_times, end_times):
    476     cur_images, cur_timestamps = [], []

File ~/.cache/huggingface/modules/transformers_modules/DAMO-NLP-SG/VideoLLaMA3-7B/a498675483e2be8e98d092a2cb11a608c2caa8dd/processing_videollama3.py:359, in Videollama3Qwen2Processor.load_video(self, video_path, start_time, end_time, fps, max_frames, size, size_divisible, precise_time, verbose, temporal_factor)
    357 if video_path.endswith('.gif'):
    358     return load_video_from_ids(video_path, start_time, end_time, fps=fps, max_frames=max_frames)
--> 359 probe = ffmpeg.probe(video_path)
    360 duration = float(probe['format']['duration'])
    361 video_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'video'), None)

File /opt/conda/lib/python3.12/site-packages/ffmpeg/_probe.py:23, in probe(filename, cmd, **kwargs)
     21 out, err = p.communicate()
     22 if p.returncode != 0:
---> 23     raise Error('ffprobe', out, err)
     24 return json.loads(out.decode('utf-8'))

Error: ffprobe error (see stderr output for detail)
```
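For reference, `ffmpeg-python` is only a wrapper around the `ffprobe`/`ffmpeg` command-line tools, so this error usually means either the binary is missing from the instance or the video path inside `conversation` cannot be read. A minimal diagnostic sketch I can run to see the real stderr that the library error hides (`video_path` here is a hypothetical placeholder; substitute the path from the conversation):

```python
import os
import shutil
import subprocess

# Hypothetical path -- replace with the actual video referenced in `conversation`.
video_path = "assets/example_video.mp4"

# ffmpeg-python does not bundle the binaries, so first check that ffprobe
# is actually installed and visible on PATH.
ffprobe_bin = shutil.which("ffprobe")
print("ffprobe binary:", ffprobe_bin)

# Also check that the file the processor is trying to probe exists.
print("video exists:", os.path.exists(video_path))

# If the binary is present, invoke it directly to surface the real error
# message (missing file, permission problem, unsupported container, ...).
if ffprobe_bin is not None:
    result = subprocess.run(
        [ffprobe_bin, "-v", "error", "-show_format", video_path],
        capture_output=True,
        text=True,
    )
    print(result.stderr or result.stdout)
```

Either check failing would explain the `ffprobe error (see stderr output for detail)` above.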

The error is raised when executing this code:

```python
inputs = processor(conversation=conversation, return_tensors="pt")
inputs = {k: v.cuda() if isinstance(v, torch.Tensor) else v for k, v in inputs.items()}
if "pixel_values" in inputs:
    inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)
output_ids = model.generate(**inputs, max_new_tokens=128)
response = processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
print(response)
```

Can anyone please help me fix the issue? Thank you for your support.
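One avenue I am considering, since `ffmpeg-python` only wraps the CLI tools rather than bundling them: installing `ffmpeg`/`ffprobe` on the instance itself. A sketch, with the package-manager commands left as comments because which one applies depends on the SageMaker image (these are assumptions, not verified against any particular image):

```shell
# The ffprobe error often just means the ffmpeg CLI tools are missing.
# Pick whichever applies to your environment:
#
#   conda install -y -c conda-forge ffmpeg                 # conda-based image
#   sudo apt-get update && sudo apt-get install -y ffmpeg  # Debian/Ubuntu image
#
# Then confirm the binary is visible to Python's subprocess calls:
command -v ffprobe && ffprobe -version || echo "ffprobe not found on PATH"
```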

Keertiraj changed discussion status to closed
Keertiraj changed discussion status to open

The issue still persists and remains open. Can anyone please help me fix it? Thank you for your support.
