IgnoraZ
/

llama3_synthquestions_1m

Text Generation

text-generation-inference

Model card Files Files and versions

IgnoraZ commited on Jun 11

Commit

b870ba8

·

verified ·

1 Parent(s): bf67e70

Update README.md

Files changed (1) hide show

README.md +76 -3

README.md CHANGED Viewed

@@ -1,3 +1,76 @@
----
-license: cc-by-4.0
----

+---
+license: cc-by-4.0
+datasets:
+- IgnoraZ/SynthQuestions
+language:
+- en
+base_model:
+- meta-llama/Meta-Llama-3-8B
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+This is the model from the paper **From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding**.
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Model type:** Chat Model
+- **Language(s) (NLP):** English
+- **License:** CC-BY-4.0
+- **Finetuned from model:** LLaMA-3-8B
+- **Finetuned with data:**  1M dataset from `IgnoraZ/SynthQuestions`
+For more details like hyper-parameters, please refer to our paper.
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** https://github.com/Ignoramus0817/SynthQuestions
+- **Paper:** https://www.arxiv.org/abs/2506.03968
+## How to Get Started with the Model
+This is a model in HF format, which can be deployed with common inference frameworks like Transformers, vLLM, SGLang and so on.
+We finetuned it with custom chat template instead of the default one from LLaMA. **Please make sure to use the chat template in the `tokenizer_config.json` when inferring.**
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Alignment Benchmarks
+|     Model      | Arena Hard (WR%) | Alpaca Eval 2.0 (LC) |
+| :------------: | :--------------: | :------------------: |
+| SynthQuestions |       15.4       |        18.87         |
+### Closed-form Benchmarks
+|     Model      | IFEVAL | MMLU  | ARC-C | GPQA | GSM8K | MATH  |
+| :------------: | :----: | :---: | :---: | :--: | :---: | :---: |
+| SynthQuestions | 57.05  | 65.79 | 63.92 | 30.3 | 70.53 | 22.71 |
+## Citation
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+```
+@misc{zhu2025realsyntheticsynthesizingmillions,
+      title={From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding},
+      author={Chiwei Zhu and Benfeng Xu and Xiaorui Wang and Zhendong Mao},
+      year={2025},
+      eprint={2506.03968},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2506.03968},
+}
+```
+## Model Card Contact
+Please contact [email protected].