Model Card for gonggongjohn/llama3.2-1b-zh-pt-culturax-10b

Training

Training loss performance:

Evaluation

| Model | Belebele | MGSM Direct | Global MMLU | XCopa | XNLI | XStoryCloze |
|---|---|---|---|---|---|---|
| LLaMA3.2-1B-RandomInit | 0.2667 | 0.0040 | 0.2504 | 0.4700 | 0.3361 | 0.4672 |
| LLaMA3.2-1B-zh-pt-CulturaX-10B | 0.2311 | 0.0160 | 0.2285 | 0.5200 | 0.3361 | 0.4851 |
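To make the comparison between the two rows easier to read, the per-benchmark differences can be computed directly from the reported scores. The following is a minimal sketch; the names `BENCHMARKS`, `SCORES`, and `score_deltas` are illustrative and not part of the model repository, and the numbers are copied from the evaluation results above.

```python
# Scores copied verbatim from the evaluation results above.
BENCHMARKS = ["Belebele", "MGSM Direct", "Global MMLU", "XCopa", "XNLI", "XStoryCloze"]
SCORES = {
    "LLaMA3.2-1B-RandomInit": [0.2667, 0.0040, 0.2504, 0.4700, 0.3361, 0.4672],
    "LLaMA3.2-1B-zh-pt-CulturaX-10B": [0.2311, 0.0160, 0.2285, 0.5200, 0.3361, 0.4851],
}


def score_deltas(baseline: str, candidate: str) -> dict:
    """Return {benchmark: candidate score - baseline score}, rounded to 4 decimals."""
    return {
        name: round(cand - base, 4)
        for name, base, cand in zip(BENCHMARKS, SCORES[baseline], SCORES[candidate])
    }


deltas = score_deltas("LLaMA3.2-1B-RandomInit", "LLaMA3.2-1B-zh-pt-CulturaX-10B")
for name, delta in deltas.items():
    # A positive value means the trained model scored higher than the random-init baseline.
    print(f"{name:<12} {delta:+.4f}")
```

A positive delta means the CulturaX-trained model scored higher than the randomly initialized baseline on that benchmark; a negative delta means it scored lower.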
Model size: 1.24B params (Safetensors, tensor type F32)
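Since the weights are published as Safetensors under the repository ID shown on this card, loading them might look like the sketch below. It is not taken from the model authors: it assumes the repository ships a tokenizer and a config compatible with the `transformers` Auto classes, and that `transformers` and PyTorch are installed. `generate_text` is a hypothetical helper name.

```python
MODEL_ID = "gonggongjohn/llama3.2-1b-zh-pt-culturax-10b"


def generate_text(prompt: str, max_new_tokens: int = 32) -> str:
    """Load the checkpoint from the Hub and return a short continuation of `prompt`.

    Assumption: the repository loads through the standard Auto classes.
    """
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use; the prompt is arbitrary):
# print(generate_text("Lisboa é"))
```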

Dataset used to train gonggongjohn/llama3.2-1b-zh-pt-culturax-10b