FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models Paper • 2310.20410 • Published Oct 31, 2023 • 1
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models Paper • 2401.16745 • Published Jan 30, 2024
Learning to Edit: Aligning LLMs with Knowledge Editing Paper • 2402.11905 • Published Feb 19, 2024 • 1
Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning Paper • 2203.06875 • Published Mar 14, 2022
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization Paper • 2408.07471 • Published Aug 14, 2024
Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge Paper • 2502.12501 • Published 24 days ago • 6