appvoid committed on
Commit e5ffce7 · verified · 1 parent: 1c8a2ad

Update README.md

Files changed (1): README.md +5 -5
README.md CHANGED
@@ -12,11 +12,11 @@ tags:
 Creative writing has never been so accessible; palmer goes beyond what was thought possible for small language models. This model is a "MErging of Experts" (MEoE) that uses the internal model `palmer-003` as a base, biased toward assistant behavior with the DPO technique and without any prompts. As a result of these efforts, palmer outperforms most 1B language models on most benchmarks, despite sometimes being 40% smaller than its counterparts.
 
 ```
-MMLU ARC-C OBQA HellaSwag PIQA Winogrande Average Parameters
-tinyllama | 0.2577 | 0.3029 | 0.3600 | 0.5935 | 0.7329 | 0.5959 | 0.4738 | 1.1B |
-zyte | 0.2397 | 0.3353 | 0.3700 | 0.6086 | 0.7541 | 0.5998 | 0.4845 | 1.1B |
-palmer | 0.2523 | 0.3439 | 0.3740 | 0.6208 | 0.7524 | 0.6590 | 0.5004 | 1.1B |
-qwen | 0.4536 | 0.3490 | 0.3320 | 0.5876 | 0.7307 | 0.5896 | 0.5070 | 1.8B |
+Model | MMLU | ARC-C | OBQA | HellaSwag | PIQA | Winogrande | Average | Params |
+tinyllama | 0.2577 | 0.3029 | 0.3600 | 0.5935 | 0.7329 | 0.5959 | 0.4738 | 1.1B |
+zyte | 0.2397 | 0.3353 | 0.3700 | 0.6086 | 0.7541 | 0.5998 | 0.4845 | 1.1B |
+palmer | 0.2523 | 0.3439 | 0.3740 | 0.6208 | 0.7524 | 0.6590 | 0.5004 | 1.1B |
+qwen | 0.4536 | 0.3490 | 0.3320 | 0.5876 | 0.7307 | 0.5896 | 0.5070 | 1.8B |
 ```
 
 This work constitutes, given its compactness, an advancement toward small language models (SLMs), capable of empowering edge devices such as mobile phones, Raspberry Pis, and automated software/robots. Additionally, palmer-003 follows the same philosophy as palmer-002.5: becoming a more powerful model with more data, not less.
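As a quick sanity check, the Average column in the table above is the unweighted mean of the six benchmark scores. The sketch below recomputes it from the values in the diff; the small tolerance is an assumption, since the reported averages appear to be truncated or rounded to four decimals.

```python
# Recompute the Average column of the benchmark table.
# Score values are copied verbatim from the README diff above.
scores = {
    "tinyllama": [0.2577, 0.3029, 0.3600, 0.5935, 0.7329, 0.5959],
    "zyte":      [0.2397, 0.3353, 0.3700, 0.6086, 0.7541, 0.5998],
    "palmer":    [0.2523, 0.3439, 0.3740, 0.6208, 0.7524, 0.6590],
    "qwen":      [0.4536, 0.3490, 0.3320, 0.5876, 0.7307, 0.5896],
}
reported = {"tinyllama": 0.4738, "zyte": 0.4845, "palmer": 0.5004, "qwen": 0.5070}

for model, vals in scores.items():
    avg = sum(vals) / len(vals)
    # Allow a small tolerance: the table seems to truncate at 4 decimals
    # (e.g. zyte's exact mean is 0.48458...).
    assert abs(avg - reported[model]) < 1e-4, (model, avg)
    print(f"{model}: {avg:.4f}")
```

Running this confirms each reported average matches the mean of its row to within one unit in the fourth decimal place, with palmer's 0.5004 being exact.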