eval
README.md
CHANGED
@@ -33,15 +33,9 @@ Previous versions remain available in the repository. New models will be release
 
 ## Evaluation
 
-| Model | Avg | ARC | HS | MMLU | TQA |
-
-| **Shining Valiant 1.
-| Llama 2 | 67.35 | 67.32 | 87.33 | 69.83 | 44.92 |
-| Llama 2 Chat | 66.80 | 64.59 | 85.88 | 63.91 | 52.80 |
-
-**Shining Valiant 1.3** is awaiting full results from the Open LLM Leaderboard.
-
-SV 1.3 outperformed SV 1.2 on our internal testing.
+| Model | Avg | ARC | HS | MMLU | TQA | WG | GSM |
+|-----------------------|--------|-------|-------|--------|-------|-------|-------|
+| **Shining Valiant 1.3** | 73.78 | 71.33 | 90.96 | 71.21 | 70.29 | 84.21 | 54.66 |
 
 ## Prompting Guide
 Shining Valiant uses the same prompt format as Llama 2 Chat - feel free to use your existing prompts and scripts!
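
Review note: the new `Avg` value appears consistent with an unweighted mean of the six benchmark columns added in this change. The sketch below is only a sanity check under that assumption. The dictionary name `sv_1_3_scores` and the helper `mean_score` are hypothetical and are not part of the repository.

```python
from statistics import fmean

# Scores copied from the added table row for Shining Valiant 1.3.
# Assumption: Avg is the plain arithmetic mean of these six columns.
sv_1_3_scores = {
    "ARC": 71.33,
    "HS": 90.96,
    "MMLU": 71.21,
    "TQA": 70.29,
    "WG": 84.21,
    "GSM": 54.66,
}


def mean_score(scores):
    """Return the unweighted mean of an iterable of benchmark scores."""
    return fmean(scores)


if __name__ == "__main__":
    # Compare against the reported Avg of 73.78.
    print(f"{mean_score(sv_1_3_scores.values()):.2f}")  # prints 73.78
```

If the leaderboard weights benchmarks differently, the check would need to be adjusted accordingly.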