Submitted by akhaliq · 6 · PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization · 13 authors
Submitted by akhaliq · 6 · INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models · 4 authors
Submitted by akhaliq · 5 · How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources · 11 authors
Submitted by akhaliq · 4 · ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections · 7 authors
Submitted by akhaliq · 3 · PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts · 11 authors
Submitted by akhaliq · 2 · Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks · 16 authors