aisi-whitebox/non_sandbagging_llama_31_8b_instruct_CyberMetric-2000_cot • Updated Jun 6 • 299 • 5
aisi-whitebox/prompted_sandbagging_gemma_gemma_3_27b_it_swe_bench_verified_mini • Updated Jun 6 • 2 • 10
aisi-whitebox/prompted_sandbagging_athene_devstral_small_2505_swe_bench_verified_mini • Updated Jun 5 • 100 • 12
aisi-whitebox/prompted_sandbagging_nemotron_llama_3_3_nemotron_super_49b_v1_swe_bench_verified_mini • Updated Jun 5 • 20 • 3
aisi-whitebox/saving_test2_prompted_sandbagging_llama_31_8b_instruct_ARC-Challenge • Updated Jun 4 • 20 • 1
aisi-whitebox/saving_test2_prompted_sandbagging_llama_31_8b_instruct_mmlu_0_shot • Updated Jun 4 • 20 • 1
aisi-whitebox/saving_test2_prompted_sandbagging_llama_31_8b_instruct_ARC-Challenge_cot • Updated Jun 4 • 15 • 1
aisi-whitebox/saving_test2_prompted_sandbagging_llama_31_8b_instruct_mmlu_0_shot_cot • Updated Jun 4 • 18 • 1
aisi-whitebox/saving_test_prompted_sandbagging_llama_31_8b_instruct_ARC-Challenge • Updated Jun 4 • 20 • 1
aisi-whitebox/saving_test_prompted_sandbagging_llama_31_8b_instruct_mmlu_0_shot • Updated Jun 4 • 20 • 1
aisi-whitebox/saving_test_prompted_sandbagging_llama_31_8b_instruct_ARC-Challenge_cot • Updated Jun 4 • 17 • 1
aisi-whitebox/saving_test_prompted_sandbagging_llama_31_8b_instruct_mmlu_0_shot_cot • Updated Jun 4 • 12 • 1
aisi-whitebox/prompted_sandbagging_nemotron_llama_3_3_nemotron_super_49b_v1_gdm_intercode_ctf • Updated Jun 3 • 20 • 3
aisi-whitebox/mmlu_0_shot_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 1k
aisi-whitebox/mmlu_0_shot_cot_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 1k
aisi-whitebox/wmdp_chem_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 816 • 1
aisi-whitebox/wmdp_chem_cot_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 816 • 1
aisi-whitebox/wmdp_bio_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 1k • 3
aisi-whitebox/wmdp_bio_cot_new_merged_mo2_6epc_finetuned_sandbagging_follow_up_q • Updated May 27 • 1k • 2