Akbir Khan
akbir
Recent Activity
Authored a paper about 1 month ago: Alignment faking in large language models
Authored a paper 2 months ago: BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Authored a paper 4 months ago: Language Models Learn to Mislead Humans via RLHF