marcuscedricridia committed on
Commit 7ab402c · verified · 1 Parent(s): 3787aba

Update README.md

Files changed (1):
  1. README.md +31 -27

README.md CHANGED
@@ -11,38 +11,42 @@ tags:
  - merge

  ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [marcuscedricridia/Hush-Qwen2.5-7B-RP-1M](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-RP-1M) as the base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [marcuscedricridia/Hush-Qwen2.5-7B-della1](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della1)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della2](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della2)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della3](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della3)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della4](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della4)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- merge_method: model_stock
- base_model: marcuscedricridia/Hush-Qwen2.5-7B-RP-1M
- models:
-   - model: marcuscedricridia/Hush-Qwen2.5-7B-della1
-   - model: marcuscedricridia/Hush-Qwen2.5-7B-della2
-   - model: marcuscedricridia/Hush-Qwen2.5-7B-della3
-   - model: marcuscedricridia/Hush-Qwen2.5-7B-della4
- dtype: bfloat16
- tokenizer_source: base
- int8_mask: true
- normalize: true
- name: Hush-Qwen2.5-7B-Preview
- ```
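For readers curious what the `model_stock` method in the removed configuration above actually does, here is a toy, self-contained sketch of the core idea from the Model Stock paper: interpolate between the base weights and the average of the fine-tuned weights, with a ratio derived from how strongly the fine-tuned deltas agree. This is an illustration only, not mergekit's API: `model_stock_merge` is a hypothetical helper, it operates on flat Python lists of floats rather than per-layer tensors, and it assumes nonzero deltas; the per-layer procedure in the paper and in mergekit differs in detail.

```python
import math


def model_stock_merge(w_base, w_finetuned):
    """Toy sketch of Model Stock: blend the base weights with the
    average of the fine-tuned weights, using an interpolation ratio
    derived from the agreement (cosine similarity) of the deltas.
    Hypothetical helper for illustration; assumes nonzero deltas."""
    # Delta of each fine-tuned model from the base.
    deltas = [[w - b for w, b in zip(wf, w_base)] for wf in w_finetuned]
    k = len(deltas)

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # Average pairwise cosine similarity between the deltas.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    cos_theta = sum(cos(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Interpolation ratio from the paper: t = k*cos / (1 + (k-1)*cos).
    t = k * cos_theta / (1.0 + (k - 1) * cos_theta)

    # Merged = t * (average of fine-tuned) + (1 - t) * base.
    w_avg = [sum(ws) / k for ws in zip(*w_finetuned)]
    return [t * a + (1.0 - t) * b for a, b in zip(w_avg, w_base)]


# Identical deltas agree perfectly (cos = 1, t = 1): plain average wins.
print(model_stock_merge([0.0, 0.0], [[1.0, 0.0], [1.0, 0.0]]))  # [1.0, 0.0]
# Orthogonal deltas disagree completely (cos = 0, t = 0): stay at base.
print(model_stock_merge([0.0, 0.0], [[2.0, 0.0], [0.0, 2.0]]))  # [0.0, 0.0]
```

The intuition: the more the fine-tuned models point in the same direction away from the base, the more weight the merged model gives to their average.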
+ # Model Card: Hush-Qwen2.5-7B-Preview
+
+ ## Model Details
+
+ - **Model Name:** Hush-Qwen2.5-7B-Preview
+ - **Creator:** marcuscedricridia
+ - **Merge Technique:** YoYo v3
+ - **Primary Focus:** General performance, with an emphasis on improving benchmark results, especially IFEVAL.
+
+ ## Performance Highlights
+
+ Hush-Qwen2.5-7B-Preview was created using the YoYo v3 merge technique and achieves one of the highest IFEVAL scores recorded for a 7B model: **79.62%**. That makes it the **second-best** model in the category, and since the leading model is currently unavailable, it may hold **first place** by default.
+
+ ### Strengths
+ - **High IFEVAL score:** 79.62%, among the best for 7B models.
+ - **Well-rounded performance:** Decent scores across the other benchmarks.
+
+ ### Weaknesses
+ - **Low MATH score:** 37.54%, significantly lower than our past models (which scored at least 45%). Improving it would make the model substantially better overall.
+
+ ## Benchmark Results
+
+ | Category | Score (%) |
+ |----------|-----------|
+ | Average  | 35.13     |
+ | IFEVAL   | 79.62     |
+ | BBH      | 35.33     |
+ | MATH     | 37.54     |
+ | GPQA     | 8.17      |
+ | MUSR     | 12.73     |
+ | MMLU     | 37.38     |
+
+ ## Next Steps
+
+ - **Finetune on math:** Bringing up the math score is a priority for a well-balanced model.
+ - **Explore YoYo v4:** The next step could be merging this model, via the YoYo v4 technique, with another model that is strong in math. However, YoYo v4 lacks proper documentation, which makes it a challenge to implement.
+ - **Develop a math-strong model:** An alternative is to build a new model that performs decently on all benchmarks but excels at math, then merge it with this one.
+
+ ## Conclusion
+ Hush-Qwen2.5-7B-Preview is a strong contender on IFEVAL, achieving one of the highest scores among 7B models. Improving the math benchmark is the key priority for future iterations; finetuning or newer merge techniques such as YoYo v4 could push the model further.
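As a quick sanity check on the benchmark table above, the Average row matches the unweighted mean of the six individual scores (assuming, as is conventional for this benchmark set, a plain arithmetic mean):

```python
# Sanity check: the "Average" row should be the unweighted mean
# of the six benchmark scores from the table above.
scores = {
    "IFEVAL": 79.62,
    "BBH": 35.33,
    "MATH": 37.54,
    "GPQA": 8.17,
    "MUSR": 12.73,
    "MMLU": 37.38,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 35.13
```

The computed mean agrees with the reported Average of 35.13, so the table is internally consistent.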