Adrià Garriga-Alonso

agaralon

AI & ML interests

AI safety, interpretability

Recent Activity

authored a paper about 1 month ago
Open Problems in Mechanistic Interpretability
updated a dataset 3 months ago
agaralon/ACDC-Runs

Organizations

FAR AI