[BUG] Trying to run vLLM inference crashes with an AttributeError
Running the vLLM command provided in the model card (vLLM v0.6.6) results in: AttributeError: 'Qwen2VLProcessor' object has no attribute 'image_token'
The version I used:
vllm 0.6.5
I'm not sure whether this is caused by the vLLM version I'm using.
Could you check if you can run the original Qwen2-VL-Instruct successfully?
It's probably due to the version of transformers. Did you follow the instructions in Qwen2-VL's docs?
The version in my env:
transformers 4.47.1
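To help pin down whether transformers is the culprit, here is a minimal sanity-check sketch. It assumes the Qwen/Qwen2-VL-7B-Instruct checkpoint (substitute whichever Qwen2-VL model you are actually serving); it just loads the processor and reports whether it exposes the image_token attribute that vLLM is reaching for.

# Check whether the installed transformers build exposes the attribute
# vLLM is failing on. Model name is an assumption; use your own checkpoint.
import transformers
from transformers import AutoProcessor

print("transformers", transformers.__version__)

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
print(type(processor).__name__)  # expected: Qwen2VLProcessor
print("has image_token:", hasattr(processor, "image_token"))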
https://github.com/QwenLM/Qwen2-VL
Deployment
We recommend using vLLM for fast Qwen2-VL deployment and inference. You need to use vllm>=0.6.1 to enable Qwen2-VL support. You can also use our official docker image.
Installation
pip install git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830
pip install accelerate
pip install qwen-vl-utils
# Change to your CUDA version
CUDA_VERSION=cu121
pip install 'vllm==0.6.1' --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
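Once the pinned transformers commit and vllm are installed, a minimal offline-inference sketch along the following lines should exercise the same processor path that was crashing. The model name, image path, and prompt template below are assumptions for illustration, not taken from the model card:

from vllm import LLM, SamplingParams
from PIL import Image

# Assumed model and image file; replace with the checkpoint and image you use.
llm = LLM(model="Qwen/Qwen2-VL-7B-Instruct", limit_mm_per_prompt={"image": 1})

# Qwen2-VL chat-style prompt with the image placeholder tokens.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": Image.open("demo.jpg")}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)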
OK, after setting up an environment following the instructions in your last post (and vLLM 0.6.6.post1), it worked. Strange that it needed a specific transformers version. Thank you for the assistance.