Issue with Model Loading: Size Mismatch in classifier Layer

#5
by Tomz08 - opened

Hello

I hope this message finds you well. I am using your model, [Model Name], and ran into an error when loading it with the BertForSequenceClassification class. The full error message is:

Error(s) in loading state_dict for BertForSequenceClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([3, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
size mismatch for classifier.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.

It seems the model’s classifier layer is configured for 3 classes in the checkpoint, whereas I am working with a setup expecting 2 classes.
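For context, the two loading paths I am aware of can be sketched as below. This is only a minimal illustration, not the model's official usage: to keep it runnable offline it builds a tiny stand-in checkpoint with a hypothetical config (hidden size 32 instead of 768) rather than downloading the real weights, and the variable names are my own.

```python
import tempfile

from transformers import BertConfig, BertForSequenceClassification

# Hypothetical tiny config so the sketch runs offline and quickly.
# The real checkpoint has hidden_size=768; only num_labels=3 mirrors it.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,
)

with tempfile.TemporaryDirectory() as checkpoint_dir:
    # Stand-in for the published checkpoint: a 3-class classifier head.
    BertForSequenceClassification(config).save_pretrained(checkpoint_dir)

    # Option 1: load with the checkpoint's own config, keeping the 3-class head.
    model_3 = BertForSequenceClassification.from_pretrained(checkpoint_dir)

    # Option 2: request 2 classes and skip the mismatched classifier weights.
    # The 2-class head is newly initialized, not loaded from the checkpoint.
    model_2 = BertForSequenceClassification.from_pretrained(
        checkpoint_dir, num_labels=2, ignore_mismatched_sizes=True
    )

print(tuple(model_3.classifier.weight.shape))  # (3, 32)
print(tuple(model_2.classifier.weight.shape))  # (2, 32)
```

As far as I understand, the second option leaves the new classifier head randomly initialized, so it would need fine-tuning on 2-class data before its predictions mean anything.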

Could you please clarify:
1. What is the intended number of classes for this model, and what does each class represent?
2. Is there a recommended way to adapt the model or checkpoint to a classification task with a different number of classes?

I would appreciate your guidance on resolving this issue. Thank you for sharing your work and making it accessible to the community!
