Weight Error in Notebook

#5
by atharvanighot - opened

Getting this error:

ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.

while running the default script:

import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the tokenizer and model
model_path = "nvidia/Llama3.1-Minitron-4B-Width-Base"
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = 'cuda'
dtype = torch.bfloat16
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

# Generate the output
outputs = model.generate(inputs, max_length=20)

# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)

Probably an issue with grouped-query attention (GQA).
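
For what it's worth, here is a back-of-the-envelope sketch of where the 1024 vs. 768 mismatch could come from, assuming the config values on the model card (hidden_size=3072, num_attention_heads=32, num_key_value_heads=8, explicit head_dim=128): older transformers builds derive head_dim as hidden_size // num_attention_heads instead of reading the explicit value.

hidden_size = 3072
num_attention_heads = 32          # assumed from the model card
num_key_value_heads = 8           # assumed from the model card
head_dim_explicit = 128           # stored explicitly in the checkpoint config

# What older transformers builds compute when they ignore the explicit head_dim:
head_dim_derived = hidden_size // num_attention_heads   # 96

# k_proj / v_proj weights have num_key_value_heads * head_dim output rows:
print(num_key_value_heads * head_dim_explicit)  # 1024 -> shape stored in the checkpoint
print(num_key_value_heads * head_dim_derived)   #  768 -> shape the old code allocates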

@atharvanighot

The pull requests to support this model in Hugging Face Transformers are currently under review.
Follow the installation instructions below:

Fetch PR 32502

$ git clone -b suhara/llama-kv-channels --single-branch https://github.com/suhara/transformers.git && cd transformers

Fetch changes from PR 32495

$ git fetch https://github.com/suiyoubi/transformers.git aot/head_dim_rope && git cherry-pick FETCH_HEAD --strategy-option theirs

Install transformers

$ pip install -e .
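
Once installed, a quick sanity check that Python is picking up the patched checkout (an explicit head_dim on LlamaConfig is what the PR is expected to add; treat this as a rough check, not a guarantee):

import transformers
from transformers import LlamaConfig

print(transformers.__version__)
print(transformers.__file__)               # should point inside the cloned transformers/ directory
print(hasattr(LlamaConfig(), "head_dim"))  # expected True on the patched build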

Will subscribe to that

NVIDIA org

@atharvanighot are you still getting this error? The installation instructions have been updated - you no longer need to fetch these PRs manually:

pip install git+https://github.com/huggingface/transformers
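
You can also inspect the config before building the model; with the fix in place the explicit head_dim should be honored (the values in the comment assume the model card: num_key_value_heads=8, head_dim=128):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("nvidia/Llama3.1-Minitron-4B-Width-Base")
print(cfg.hidden_size, cfg.num_attention_heads, cfg.num_key_value_heads)
print(getattr(cfg, "head_dim", None))
# With head_dim honored, k_proj/v_proj are built with
# num_key_value_heads * head_dim = 8 * 128 = 1024 rows, matching the checkpoint.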

@srvm

I tried it once again without manually fetching the PRs. I made sure to upgrade transformers, but I'm still getting this error:

ValueError: Trying to set a tensor of shape torch.Size([1024, 3072]) in "weight" (which has shape torch.Size([768, 3072])), this look incorrect.

I'll try it again later by fetching PRs manually.

same error

atharvanighot changed discussion status to closed
atharvanighot changed discussion status to open

@atharvanighot, @Tiz01
The depth-pruned model works fine, though. You could try using it instead; a minimal example is sketched below.
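
Something like this should work for the depth-pruned variant (the repo id below is a guess by analogy with the width model; double-check the exact name on the Hub):

import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# Repo id assumed by analogy with the width-pruned model; verify it on the Hub.
model_path = "nvidia/Llama3.1-Minitron-4B-Depth-Base"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="cuda")

inputs = tokenizer.encode("Complete the paragraph: our solar system is", return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_length=20)
print(tokenizer.decode(outputs[0]))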

Still the same issue today with the latest transformers package.
