mkurman committed on
Commit 6d4a92e · verified · 1 Parent(s): 44d22fc

Update README.md

Files changed (1)
  1. README.md +0 -109
README.md CHANGED
@@ -16,102 +16,6 @@ datasets:
 - ProlificAI/social-reasoning-rlhf
 - allenai/tulu-3-sft-mixture
 - allenai/llama-3.1-tulu-3-8b-preference-mixture
-pipeline_tag: text-generation
-model-index:
-- name: Llama-3.2-SUN-1B-Instruct
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: IFEval (0-Shot)
-      type: HuggingFaceH4/ifeval
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: inst_level_strict_acc and prompt_level_strict_acc
-      value: 64.13
-      name: strict accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: BBH (3-Shot)
-      type: BBH
-      args:
-        num_few_shot: 3
-    metrics:
-    - type: acc_norm
-      value: 9.18
-      name: normalized accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MATH Lvl 5 (4-Shot)
-      type: hendrycks/competition_math
-      args:
-        num_few_shot: 4
-    metrics:
-    - type: exact_match
-      value: 4.61
-      name: exact match
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: GPQA (0-shot)
-      type: Idavidrein/gpqa
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 0.0
-      name: acc_norm
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MuSR (0-shot)
-      type: TAUR-Lab/MuSR
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 4.05
-      name: acc_norm
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU-PRO (5-shot)
-      type: TIGER-Lab/MMLU-Pro
-      config: main
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 8.68
-      name: accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=meditsolutions/Llama-3.2-SUN-1B-Instruct
-      name: Open LLM Leaderboard
 ---
 
 # MedIT SUN HDIC 1B Instruct
@@ -154,16 +58,3 @@ As the model is still in training, performance and capabilities may vary. Users
 
 **Disclaimer and Safety Considerations**
 The Model is designed to be used as a smart assistant but not as a knowledge source within your applications, systems, or environments. It is not intended to provide 100% accurate answers, especially in scenarios where high precision and accuracy are
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_meditsolutions__Llama-3.2-SUN-1B-Instruct)
-
-|      Metric       |Value|
-|-------------------|----:|
-|Avg.               |15.11|
-|IFEval (0-Shot)    |64.13|
-|BBH (3-Shot)       | 9.18|
-|MATH Lvl 5 (4-Shot)| 4.61|
-|GPQA (0-shot)      | 0.00|
-|MuSR (0-shot)      | 4.05|
-|MMLU-PRO (5-shot)  | 8.68|
-
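The removed table does not state how its `Avg.` row was computed. Assuming it is the unweighted mean of the six benchmark scores, the reported 15.11 can be checked with a short sketch. The names `scores` and `mean_score` are illustrative and do not appear in the README.

```python
# Scores copied from the leaderboard table removed in this commit.
scores = {
    "IFEval (0-Shot)": 64.13,
    "BBH (3-Shot)": 9.18,
    "MATH Lvl 5 (4-Shot)": 4.61,
    "GPQA (0-shot)": 0.00,
    "MuSR (0-shot)": 4.05,
    "MMLU-PRO (5-shot)": 8.68,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals like the table.

    Assumption: the leaderboard average weights every benchmark equally.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. {average:.2f}")
```

Under that assumption the result matches the `Avg.` row, which suggests the table and the deleted `model-index` entries carried the same six values.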