
	---
	license: apache-2.0
	datasets:
	- gair-prox/open-web-math-pro
	language:
	- en
	base_model:
	- TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
	---



	# TinyLlama-1.1B-ProXMath

	<p align="center">
	<img src="prox-teaser.png">
	</p>

	[ArXiv](https://arxiv.org/abs/2409.17115) | [Data: OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

	TinyLlama-1.1B-ProXMath is a math-adapted TinyLlama-1.1B model, continually pre-trained for 15B tokens on [OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro), a version of OpenWebMath refined by ProX.
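
As a minimal sketch of how the checkpoint might be loaded for greedy generation with the `transformers` library: the repo id below is assumed from this page's path on the Hub, and `build_prompt` with its "Question/Answer" template is a hypothetical helper, not a format documented by the authors. Since this is a base (not instruction-tuned) model, a plain completion-style prompt is assumed.

```python
MODEL_ID = "gair-prox/TinyLlama-1.1B-ProXMath"  # assumed repo id, taken from the page path


def build_prompt(question: str) -> str:
    """Format a question as a completion prompt (hypothetical template)."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Download the model on first use and greedily decode a continuation."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 7?")` would fetch the weights and return the model's continuation; the output depends on the checkpoint and is not shown here.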

	## Evaluations

	ProX models are evaluated on 9 common math reasoning benchmarks.

	| Model                   | asdiv | gsm8k | mathqa | mawps | minerva_math | mmlu_stem | sat_math | svamp | tabmwp | average |
	|-------------------------|:-----:|:-----:|:------:|:-----:|:------------:|:---------:|:--------:|:-----:|:------:|:-------:|
	| TinyLlama-1.1B          | 18.0  | 2.8   | 14.6   | 20.2  | 3.2          | 16.3      | 21.9     | 10.9  | 12.5   | 13.4    |
	| TinyLlama-1.1B-ProXMath | 41.9  | 9.0   | 15.6   | 56.9  | 5.6          | 26.8      | 31.2     | 23.8  | 22.2   | 25.7    |
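
To make the per-benchmark differences easier to read, the short sketch below computes the absolute gain of TinyLlama-1.1B-ProXMath over TinyLlama-1.1B. The scores are copied from the table above; the names `SCORES` and `absolute_gains` are illustrative and do not come from the ProX codebase.

```python
# (TinyLlama-1.1B, TinyLlama-1.1B-ProXMath) scores copied from the table above.
SCORES = {
    "asdiv": (18.0, 41.9),
    "gsm8k": (2.8, 9.0),
    "mathqa": (14.6, 15.6),
    "mawps": (20.2, 56.9),
    "minerva_math": (3.2, 5.6),
    "mmlu_stem": (16.3, 26.8),
    "sat_math": (21.9, 31.2),
    "svamp": (10.9, 23.8),
    "tabmwp": (12.5, 22.2),
}


def absolute_gains(scores):
    """Return ProXMath minus base score per benchmark, rounded to one decimal."""
    return {name: round(prox - base, 1) for name, (base, prox) in scores.items()}


gains = absolute_gains(SCORES)
largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)

# Print benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:>12}: +{gain:.1f}")
```

On these numbers the largest gain is on mawps (+36.7) and the smallest on mathqa (+1.0); every benchmark improves.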


	### Citation
	```
	@article{zhou2024programming,
	title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
	author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
	journal={arXiv preprint arXiv:2409.17115},
	year={2024}
	}
	```