Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) (Dec 9, 2022)
Collection: 🔍 Interpretability & Analysis of LMs. Outstanding research in LM interpretability and evaluation, summarized (105 items)