aisi-whitebox-red-team/odran_benign_training_dataset_post_processed • Updated Jul 29 • 7.7k • 40
aisi-whitebox-red-team/hovik_bigcodebench_weak_model_llama_31_8b_instruct_bigcodebench • Updated Jul 22 • 2.82k • 16
aisi-whitebox-red-team/bigcodebench_sandbagging_llama_distillation • Updated Jul 21 • 8.46k • 11
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench_sandbagging • Updated Jul 21 • 1.35k • 3
aisi-whitebox-red-team/hovik_bigcodebench_sandbagging_llama_31_8b_instruct_bigcodebench • Updated Jul 21 • 1.48k • 21
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench_benign • Updated Jul 21 • 1.09k • 2
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench • Updated Jul 21 • 3.76k • 30
aisi-whitebox-red-team/hovik_bigcodebench_llama_33_70b_instruct_bigcodebench • Updated Jul 21 • 3.76k • 14
aisi-whitebox-red-team/hovik_apps_sandbagging_llama_31_8b_instruct_apps_interview_level • Updated Jul 20 • 2.99k • 25
aisi-whitebox-red-team/hovik_apps_benign_llama_33_70b_instruct_apps_interview_level • Updated Jul 20 • 782 • 36
aisi-whitebox-red-team/hovik_apps_sandbagging_llama_31_8b_instruct_apps_introductory_level • Updated Jul 20 • 997 • 19
aisi-whitebox-red-team/hovik_apps_benign_llama_33_70b_instruct_apps_introductory_level • Updated Jul 20 • 504 • 35
aisi-whitebox-red-team/hovik_training_correct_eval_samples_Llama-3.3-70B-Instruct • Updated Jul 20 • 4.4k • 4
aisi-whitebox-red-team/hovik_training_incorrect_eval_samples_Llama-3.3-70B-Instruct • Updated Jul 20 • 657 • 6
aisi-whitebox-red-team/bigcodebench_sandbagging_gemma_distillation • Updated Jul 17 • 1.68k • 4