suayptalha committed
Commit c8e2f4a · verified · 1 Parent(s): b050a1c

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -14,7 +14,6 @@ tags:
 - sft
 ---
 
-
 <!DOCTYPE html>
 <style>
 ebody {
@@ -607,6 +606,8 @@ a:hover .link-arrow {
 Maestro-R1-Llama-8B is a powerful language model fine-tuned from DeepSeek-R1-Distill-Llama-8B, a distilled model based on the Llama-3 architecture. DeepSeek-R1-Distill-Llama-8B itself is distilled from DeepSeek-R1 using a large corpus of diverse data, which enables it to retain strong reasoning capabilities while maintaining a smaller parameter count.
 <br>
 Maestro-R1-Llama-8B builds on this foundation, further enhancing its performance through fine-tuning on the ServiceNow-AI/R1-Distill-SFT dataset. This fine-tuning step sharpens the model's ability to handle specialized tasks and improves its reasoning, problem-solving, and code generation capabilities. The combination of the distilled base model and domain-specific fine-tuning makes Maestro-R1-Llama-8B an efficient and robust model, excelling across a wide range of language tasks.
+<br>
+DeepSeek-R1 Paper Link: https://arxiv.org/abs/2501.12948
 </div>
 <div class="model-composition">
 <h4>Loss Graph</h4>
 
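For context, since the model card describes a standard Llama-3-based chat model, a minimal inference sketch with Hugging Face Transformers is shown below. The repository id `suayptalha/Maestro-R1-Llama-8B` is an assumption based on the committer's namespace and is not confirmed by this diff; any Llama-3-derived chat checkpoint on the Hub would load the same way.

```python
# Minimal inference sketch for Maestro-R1-Llama-8B.
# Assumption: the model is published as "suayptalha/Maestro-R1-Llama-8B"
# (repo id inferred from the committer's namespace, not stated in this commit).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Maestro-R1-Llama-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 8B model on a single GPU
    device_map="auto",
)

# DeepSeek-R1 distills are chat models, so format the prompt with the chat template.
messages = [{"role": "user", "content": "Solve step by step: what is 17 * 23?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```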