# Benchmarking

This directory contains scripts and resources for evaluating LLM alignment and safety using benchmarks such as MACHIAVELLI and SALAD-Bench.

## Structure

- `benchmarks/`: Benchmark datasets or scripts for accessing them.
- `evaluation_scripts/`: Scripts that run models against the benchmarks.
- `results/`: Output from benchmark runs.

## Usage

(Instructions on how to run evaluations will go here)
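
Until the usage instructions are written, here is a minimal sketch of how a script in `evaluation_scripts/` might run a model over benchmark prompts and save the output under `results/`. The function name `run_benchmark`, the record fields (`id`, `prompt`, `response`), and the output file naming are all hypothetical and are not this repository's actual interface.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def run_benchmark(
    model: Callable[[str], str],
    prompts: Iterable[dict],
    benchmark_name: str,
    results_dir: Path = Path("results"),
) -> Path:
    """Run `model` on each prompt and write the responses to a JSON file.

    Hypothetical helper: each item in `prompts` is assumed to be a dict
    with an "id" and a "prompt" key.
    """
    records = []
    for item in prompts:
        response = model(item["prompt"])
        records.append(
            {"id": item["id"], "prompt": item["prompt"], "response": response}
        )

    # Create the results directory if it does not exist yet.
    results_dir.mkdir(parents=True, exist_ok=True)
    out_path = results_dir / f"{benchmark_name}.json"
    out_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return out_path
```

Here `model` can be any callable that maps a prompt string to a response string, so the same loop could in principle be reused across benchmarks while scoring stays in a separate step.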