Great Models Think Alike and this Undermines AI Oversight • arXiv:2502.04313 • Published 6 days ago
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking • arXiv:2106.06052 • Published May 21, 2021
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents • arXiv:2306.16527 • Published Jun 21, 2023
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language • arXiv:2306.16410 • Published Jun 28, 2023
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants • arXiv:2112.09062 • Published Dec 16, 2021
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks • arXiv:2204.01906 • Published Apr 5, 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality • arXiv:2204.03162 • Published Apr 7, 2022
Investigating Multi-source Active Learning for Natural Language Inference • arXiv:2302.06976 • Published Feb 14, 2023
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks • arXiv:2005.11401 • Published May 22, 2020
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model • arXiv:2211.05100 • Published Nov 9, 2022