Fine-grained Hallucination Detection and Editing for Language Models Paper • 2401.06855 • Published Jan 12, 2024 • 4
Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 101 items • Updated 3 days ago • 97
LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models Paper • 2305.13711 • Published May 23, 2023 • 2
We don't need no labels: Estimating post-deployment model performance under covariate shift without ground truth Paper • 2401.08348 • Published Jan 16, 2024 • 1