Update README.md
README.md
@@ -21,7 +21,7 @@ This repository contains our comprehensive AI safety evaluation PEFT adapter mod
 
 ## Model Performance
 
-
+The GroundedAI Phi-4-Mini-Judge model achieves strong performance across all three evaluation dimensions on a balanced test set of 105 samples (35 per task):
 
 ### Overall Performance
 - **Total Accuracy: 81.90%** (86/105 correct predictions)
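
As a quick sanity check on the figures quoted in this change, the sketch below recomputes the total sample count and the overall accuracy. The variable names are illustrative only and do not come from the repository; the numbers are the ones stated in the README text.

```python
# Figures quoted in the README text of this diff.
correct = 86           # correct predictions
samples_per_task = 35  # "35 per task"
num_tasks = 3          # "all three evaluation dimensions"

# Total size of the balanced test set.
total = samples_per_task * num_tasks

# Overall accuracy as a percentage, rounded to two decimals.
accuracy_pct = round(100 * correct / total, 2)

print(f"{correct}/{total} = {accuracy_pct:.2f}%")
```

The recomputed values agree with the README: 3 × 35 gives 105 samples, and 86/105 rounds to 81.90%.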