Update README.md
README.md
CHANGED
@@ -2,13 +2,29 @@
|
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
# LLM360 Research Suite: K2 Loss Spike 1
|
5 |
-
During the first K2 training phase, we encountered two loss spikes.
|
6 |
|
7 |
<img src="k2_spike_1.png" alt="k2 spike 1"/>
|
8 |
|
9 |
# Purpose
|
10 |
Loss spikes are still a poorly understood phenomenon. By making these spikes and the associated training details available, we hope others will use these artifacts to further the world's knowledge on this topic.
|
11 |
12 |
## About the LLM360 Research Suite
|
13 |
The LLM360 Research Suite is a comprehensive set of large language model (LLM) artifacts from Amber, CrystalCoder, and K2 for academic and industry researchers to explore LLM training dynamics. Additional resources can be found at llm360.ai.
|
14 |
|
|
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
# LLM360 Research Suite: K2 Loss Spike 1
|
5 |
+
During the first K2 training phase, we encountered two loss spikes. This repo contains 34 checkpoints that capture the training dynamics during the loss spikes.
|
6 |
|
7 |
<img src="k2_spike_1.png" alt="k2 spike 1"/>
|
8 |
|
9 |
# Purpose
|
10 |
Loss spikes are still a poorly understood phenomenon. By making these spikes and the associated training details available, we hope others will use these artifacts to further the world's knowledge on this topic.
|
11 |
|
12 |
+
## First 10 Checkpoints
|
13 |
+
| Checkpoints | |
|
14 |
+
| ----------- | ----------- |
|
15 |
+
| [Checkpoint 160](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_160) | [Checkpoint 170](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_170) |
|
16 |
+
| [Checkpoint 162](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_162) | [Checkpoint 172](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_172) |
|
17 |
+
| [Checkpoint 164](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_164) | [Checkpoint 174](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_174) |
|
18 |
+
| [Checkpoint 166](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_166) | [Checkpoint 176](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_176) |
|
19 |
+
| [Checkpoint 168](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_168) | [Checkpoint 178](https://huggingface.co/LLM360/K2-Spike-1/tree/spike_ckpt_178) |
|
20 |
+
|
21 |
+
Each checkpoint is stored on its own branch. To list all branches, run `git branch -a` in a clone of this repo.
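
As an alternative to `git branch -a`, the checkpoint branches can also be enumerated and fetched from Python. The following is a minimal sketch, not an official loader: it assumes the `huggingface_hub` package is installed, that the repo id is `LLM360/K2-Spike-1` (taken from the links in the table), and that every checkpoint branch follows the `spike_ckpt_<N>` pattern seen in those links. The helper names are hypothetical.

```python
from typing import List

# Assumption: repo id and branch pattern are inferred from the table links.
REPO_ID = "LLM360/K2-Spike-1"
# Checkpoint numbers linked in the table: 160, 162, ..., 178.
FIRST_TEN = list(range(160, 180, 2))


def branch_name(ckpt: int) -> str:
    """Return the branch name for a checkpoint number (spike_ckpt_<N>)."""
    return f"spike_ckpt_{ckpt}"


def branch_url(ckpt: int) -> str:
    """Return the browser URL of a checkpoint branch on the Hub."""
    return f"https://huggingface.co/{REPO_ID}/tree/{branch_name(ckpt)}"


def list_checkpoint_branches() -> List[str]:
    """List remote branches that look like checkpoints (needs network access)."""
    from huggingface_hub import list_repo_refs  # imported lazily

    refs = list_repo_refs(REPO_ID)
    return [b.name for b in refs.branches if b.name.startswith("spike_ckpt_")]


def download_checkpoint(ckpt: int) -> str:
    """Download one checkpoint branch and return the local directory path."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=REPO_ID, revision=branch_name(ckpt))


if __name__ == "__main__":
    # Print the branch names and URLs for the ten checkpoints in the table.
    for ckpt in FIRST_TEN:
        print(branch_name(ckpt), branch_url(ckpt))
```

Calling `download_checkpoint(160)` would fetch the files on the `spike_ckpt_160` branch into the local Hugging Face cache. Checkpoints of this size can be large, so it may be worth checking free disk space first.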
|
22 |
+
|
23 |
+
## Loss Spikes on the LLM360 Evaluation Suite
|
24 |
+
|
25 |
+
something here
|
26 |
+
|
27 |
+
|
28 |
## About the LLM360 Research Suite
|
29 |
The LLM360 Research Suite is a comprehensive set of large language model (LLM) artifacts from Amber, CrystalCoder, and K2 for academic and industry researchers to explore LLM training dynamics. Additional resources can be found at llm360.ai.
|
30 |
|