Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14 • 88
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 43
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 53
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1, 2024 • 28
ReFT: Representation Finetuning for Language Models Paper • 2404.03592 • Published Apr 4, 2024 • 100
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15, 2024 • 60
Open Ko-LLM Leaderboard: Explore and filter language model benchmark results • 562