CoTaEval Leaderboard 🚀 View and filter a leaderboard of language model evaluation results
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors Paper • 2406.14598 • Published Jun 20, 2024
Evaluating Copyright Takedown Methods for Language Models Paper • 2406.18664 • Published Jun 26, 2024 • 1
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications Paper • 2402.05162 • Published Feb 7, 2024 • 1