MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering • Paper • 2406.06573 • Published Jun 3, 2024
Evaluating Cognitive Maps and Planning in Large Language Models with CogEval • Paper • 2309.15129 • Published Sep 25, 2023