VRAM usage
#5 opened by marcoaleixo
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    embed_batch_size=1,
    model_name=model_path,
    device="cuda",
    trust_remote_code=True,
)
The model is using 9.4GB of VRAM.
Using transformers==4.49.0 and latest llama-index-embeddings-huggingface.
According to the model card, this model should use only ~4.4 GB of VRAM.
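One possibility I considered: the weights may be loading in float32 by default, which would roughly double the footprint versus fp16. A quick back-of-the-envelope check, assuming ~2.2B parameters (inferred from the 4.4 GB figure — the exact count is not stated here):

```python
# Rough VRAM estimate for the model weights alone.
# Activations, KV/feature buffers, and the CUDA context add more on top.
# ~2.2B parameters is an assumption inferred from the 4.4 GB fp16 figure.

def weight_vram_gb(n_params: float, bytes_per_param: int) -> float:
    """Weight memory in GB for a given parameter count and dtype width."""
    return n_params * bytes_per_param / 1e9

params = 2.2e9
fp32 = weight_vram_gb(params, 4)  # float32: 4 bytes per parameter
fp16 = weight_vram_gb(params, 2)  # float16/bfloat16: 2 bytes per parameter
print(f"fp32: {fp32:.1f} GB, fp16: {fp16:.1f} GB")
```

Under that assumption, fp32 weights alone would come to ~8.8 GB, which is close to the 9.4 GB I'm seeing once overhead is included — so the gap might just be the load dtype rather than anything wrong with the model.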
I only need to run the image part of the model in this API.
Am I getting something wrong?