Update README.md

---

## Introduction

N3N_gemma-2-9b-it_20241029_1532 is a 10.2-billion-parameter open-source model built upon Gemma2-9B-Instruct through additional training. What sets this model apart is fine-tuning on a high-quality dataset derived from 1.6 million arXiv papers.

- **High-quality Dataset**: The model was fine-tuned on a comprehensive dataset compiled from 1.6 million arXiv papers, supporting robust performance across a range of real-world applications.

- **Superior Reasoning Capabilities**: The model performs strongly on mathematical reasoning and complex problem-solving tasks, outperforming comparable models in these areas.

This model represents our commitment to advancing language model capabilities through meticulous dataset preparation and continuous model enhancement.

---

# nhyha/N3N_gemma-2-9b-it_20241029_1532

- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-9b-it

This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth).

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
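
Since the model was trained with Unsloth, it can likely also be loaded through Unsloth for faster inference. The snippet below is a minimal sketch, not part of the original card: it assumes a recent `unsloth` release, and the `max_seq_length` and `load_in_4bit` values are illustrative choices rather than settings published for this model.

```python
# Hedged sketch: loading the fine-tune through Unsloth for faster inference.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nhyha/N3N_gemma-2-9b-it_20241029_1532",
    max_seq_length=4096,   # illustrative value, not from the model card
    load_in_4bit=True,     # quantized loading to fit a single consumer GPU
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path
```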

**Ranked the #1 LLM among 7B and 12B models as of 08.11.2024.**

## Quickstart

Here is a code snippet using `apply_chat_template` that shows how to load the tokenizer and model and how to generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "nhyha/N3N_gemma-2-9b-it_20241029_1532",
    torch_dtype="auto",
    device_map="auto",  # automatically place the weights on the available device(s)
)
tokenizer = AutoTokenizer.from_pretrained("nhyha/N3N_gemma-2-9b-it_20241029_1532")

# Gemma-2 chat templates do not accept a separate "system" role, so put any
# system-style instruction at the start of the user message instead.
prompt = "You are a helpful assistant. Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated completion remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
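
For interactive use, tokens can be printed as they are generated rather than decoded at the end. This is a small optional variant of the snippet above using `transformers`' `TextStreamer`; it reuses `model`, `tokenizer`, and `model_inputs` from the Quickstart.

```python
from transformers import TextStreamer

# Streams decoded tokens to stdout as they are produced, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    streamer=streamer,
)
```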

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 32.02 |
| IFEval (0-shot)     | 67.52 |
| BBH (3-shot)        | 40.99 |
| MATH Lvl 5 (4-shot) | 20.47 |
| GPQA (0-shot)       | 12.08 |
| MuSR (0-shot)       | 16.39 |
| MMLU-PRO (5-shot)   | 34.69 |
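
The six metrics above correspond to the Open LLM Leaderboard v2 tasks in EleutherAI's `lm-evaluation-harness`, so they can in principle be re-run locally. The snippet below is a hedged sketch, not an official reproduction recipe: it assumes a recent `lm-eval` (0.4.x) install, and the task name and arguments may differ from the leaderboard's pinned configuration.

```python
import lm_eval

# Re-run one leaderboard task locally; "leaderboard_ifeval" is the harness's
# IFEval task name in recent releases (an assumption, not from the card).
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nhyha/N3N_gemma-2-9b-it_20241029_1532,dtype=auto",
    tasks=["leaderboard_ifeval"],
    batch_size="auto",
)
print(results["results"])
```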

## Contact

If you are interested in customized LLMs for business applications powered by Jikji Labs' cutting-edge infrastructure, please contact us via our website. Jikji Labs is built to support large-scale data processing and model training, ensuring solutions tailored to your business needs. We are also grateful for your feedback and suggestions as we strive to improve and innovate continuously.

## Collaborations

We are also keenly seeking support and investment as we advance the development of robust language models, with a strong emphasis on creating high-quality, specialized datasets for a diverse range of purposes and requirements. Our expertise in dataset generation enables us to build models that are more accurate and better tailored to specific business needs. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out through our website: https://www.n3n.ai/

## Acknowledgement

Many thanks to [google](https://huggingface.co/google) for providing such a valuable model to the open-source community.