Triangle104 committed (verified) · Commit ff73d27 · Parent: 280278f

Update README.md

Files changed: README.md (+86 −0)
This model was converted to GGUF format from [`SakanaAI/TAID-LLM-1.5B`](https://huggingface.co/SakanaAI/TAID-LLM-1.5B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/SakanaAI/TAID-LLM-1.5B) for more details on the model.
---

TAID-LLM-1.5B is an English language model created through TAID (Temporally Adaptive Interpolated Distillation), Sakana AI's new knowledge distillation method. Qwen2-72B-Instruct was used as the teacher model and Qwen2-1.5B-Instruct as the student model.
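The idea of a temporally interpolated distillation target can be sketched in plain NumPy. This is an illustrative toy, not Sakana AI's implementation: the linear mixing schedule and the direction of the KL term here are assumptions for illustration; the exact TAID objective is defined in the paper (arXiv:2501.16937).

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def interpolated_distill_loss(student_logits, teacher_logits, t):
    """Toy time-interpolated distillation loss.

    The target distribution slides from the student's own output
    (t = 0) toward the teacher's output (t = 1) as training
    progresses, so early training is not dominated by a target the
    small student cannot yet match.
    """
    q = softmax(student_logits)      # student distribution
    p = softmax(teacher_logits)      # teacher distribution
    target = (1.0 - t) * q + t * p   # interpolated target (assumed linear schedule)
    # KL(target || student); zero when t = 0, since target == q.
    return float(np.sum(target * (np.log(target) - np.log(q))))

# As t grows, the target moves toward the teacher, so the loss
# against a fixed student rises from zero.
student_logits = np.array([2.0, 0.5, -1.0])
teacher_logits = np.array([0.0, 2.0, 1.0])
losses = [interpolated_distill_loss(student_logits, teacher_logits, t)
          for t in (0.0, 0.5, 1.0)]
```

In an actual training loop, `t` would follow an adaptive schedule over training steps rather than the fixed grid used here.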
## Model Details

- **Developed by:** Sakana AI
- **Model type:** Autoregressive Language Model
- **Language(s):** English
- **License:** Apache License, Version 2.0
- **Paper:** https://arxiv.org/abs/2501.16937
- **Blog:** https://sakana.ai/taid/

## Uses

This model is provided for research and development purposes only and should be considered an experimental prototype. It is not intended for commercial use or deployment in mission-critical environments. Use of this model is at the user's own risk, and its performance and outcomes are not guaranteed. Sakana AI shall not be liable for any direct, indirect, special, incidental, or consequential damages, or any loss arising from the use of this model, regardless of the results obtained. Users must fully understand the risks associated with the use of this model and use it at their own discretion.

## Acknowledgement

We would like to thank the developers of the source models for their contributions and for making their work available.

This model is based on results obtained from a project, JPNP20017, subsidized by the New Energy and Industrial Technology Development Organization (NEDO).

## Citation

    @misc{sakana2025taid,
      title = {TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models},
      author = {Makoto Shing and Kou Misaki and Han Bao and Sho Yokoi and Takuya Akiba},
      year = {2025},
      eprint = {2501.16937},
      archivePrefix = {arXiv},
      primaryClass = {cs.LG},
      url = {https://arxiv.org/abs/2501.16937}
    }

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)