Update README.md
README.md
CHANGED
@@ -12,11 +12,11 @@ tags:
Creative writing has never been so accessible: palmer goes beyond what was thought possible for small language models. This model is a "MErging of Experts" (MEoE) that uses the internal model `palmer-003` as its base, biased toward assistant behavior via the DPO technique and without any prompt template. As a result of these efforts, palmer beats most 1B language models on most benchmarks, despite sometimes being 40% smaller than its counterparts.

```
-
-tinyllama
-zyte
-palmer
-qwen
+Model     | MMLU   | ARC-C  | OBQA   | HellaSwag | PIQA   | Winogrande | Average | Params
+tinyllama | 0.2577 | 0.3029 | 0.3600 | 0.5935    | 0.7329 | 0.5959     | 0.4738  | 1.1B
+zyte      | 0.2397 | 0.3353 | 0.3700 | 0.6086    | 0.7541 | 0.5998     | 0.4845  | 1.1B
+palmer    | 0.2523 | 0.3439 | 0.3740 | 0.6208    | 0.7524 | 0.6590     | 0.5004  | 1.1B
+qwen      | 0.4536 | 0.3490 | 0.3320 | 0.5876    | 0.7307 | 0.5896     | 0.5070  | 1.8B
```

Given its compactness, this work constitutes an advancement towards small language models (SLMs), readily empowering edge devices such as mobile phones, Raspberry Pis, and automated software/robots. Additionally, palmer-003 follows the same philosophy as palmer-002.5: becoming a more powerful model by training on more data instead of less.
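To sanity-check the table above: the scores are in the format produced by EleutherAI's lm-evaluation-harness over the six listed tasks. Below is a minimal reproduction sketch, assuming harness v0.4+ and a Hugging Face repo id of `appvoid/palmer-003` (an assumption; substitute the checkpoint's actual id):

```python
# Reproduction sketch for the benchmark table, using lm-evaluation-harness
# (pip install lm-eval, v0.4+). The repo id is an assumption -- replace it
# with the actual Hugging Face id of the palmer checkpoint.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=appvoid/palmer-003",
    tasks=["mmlu", "arc_challenge", "openbookqa",
           "hellaswag", "piqa", "winogrande"],
)

# In v0.4+ each task reports accuracy under the "acc,none" key.
for task, metrics in results["results"].items():
    print(f"{task}: {metrics.get('acc,none')}")
```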
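Because palmer is biased toward assistant behavior without any prompt template, free-form text can be passed straight to the model. A minimal generation sketch with Hugging Face Transformers, under the same repo-id assumption as above:

```python
# Usage sketch: free-form generation with palmer via Transformers.
# No chat/prompt template is applied, matching the note above that the
# model was tuned without using any prompts. Repo id is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "appvoid/palmer-003"  # hypothetical; replace with the real id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Write a short story about a lighthouse keeper.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128,
                         do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Sampling is enabled here because the card leads with creative writing; drop `do_sample=True` for greedy, deterministic output.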