SynthQuestions
Collection
Data and models for the paper From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding
•
4 items
•
Updated
This is the model from the paper From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding.
IgnoraZ/llama3_synthquestions_1m
IgnoraZ/SynthQuestions
For more details like hyper-parameters, please refer to our paper.
This is a model in HF format, which can be deployed with common inference frameworks like Transformers, vLLM, SGLang and so on.
We finetuned it with custom chat template instead of the default one from LLaMA. Please make sure to use the chat template in the tokenizer_config.json
when inferring.
Model | Arena Hard (WR%) | Alpaca Eval 2.0 (LC) |
---|---|---|
SynthQuestions | 24.8 | 30.16 |
@misc{zhu2025realsyntheticsynthesizingmillions,
title={From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding},
author={Chiwei Zhu and Benfeng Xu and Xiaorui Wang and Zhendong Mao},
year={2025},
eprint={2506.03968},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.03968},
}
Please contact [email protected].
Base model
meta-llama/Meta-Llama-3-8B