Salesforce/UniDoc-Bench
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion