Update README.md
README.md
CHANGED
@@ -28,6 +28,7 @@ Below we
 
 ## Distance in y Between Fine-Tuning and Training from Scratch
 <img src="figures/tllama_test_distance.png" width="900"/>
+The distance |x1-x2| with same function value f1(x1)=f2(x2) grows with more steps. On convergence, it starts to rapidly increase (perhaps exponentially).
 
 ## Training parameters
 Not mentioned parameters are the same as for [TinyLLama-2.5T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T).
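
The quantity described in the added README line, the gap |x1-x2| between the points where two curves reach the same value f1(x1)=f2(x2), could be measured roughly as sketched below. This is only an illustration under stated assumptions: x is taken to be the training step, f1 and f2 the test loss of the two runs, and both curves are assumed to be strictly decreasing (a real, noisy loss curve would need smoothing first). The function names `step_at_loss` and `step_gaps` are hypothetical and do not come from the repository.

```python
from bisect import bisect_left


def step_at_loss(steps, losses, level):
    """Interpolate the step at which a strictly decreasing loss curve reaches `level`."""
    if not (losses[-1] <= level <= losses[0]):
        raise ValueError("level is outside the range covered by the curve")
    # Negate the losses so the sequence is increasing and bisect can be used.
    negated = [-value for value in losses]
    i = bisect_left(negated, -level)
    if i == 0:
        # Only possible when level equals the first loss value.
        return float(steps[0])
    x0, x1 = steps[i - 1], steps[i]
    y0, y1 = losses[i - 1], losses[i]
    # Linear interpolation between the two samples that bracket `level`.
    return x0 + (y0 - level) / (y0 - y1) * (x1 - x0)


def step_gaps(steps1, losses1, steps2, losses2, num_levels=5):
    """Return (loss level, |x1 - x2|) pairs over the loss range shared by both curves."""
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    high = min(losses1[0], losses2[0])
    low = max(losses1[-1], losses2[-1])
    if low >= high:
        raise ValueError("the two curves share no loss range")
    result = []
    for k in range(num_levels):
        level = high - k * (high - low) / (num_levels - 1)
        # Clamp to guard against floating-point drift at the ends of the range.
        level = max(low, min(high, level))
        x1 = step_at_loss(steps1, losses1, level)
        x2 = step_at_loss(steps2, losses2, level)
        result.append((level, abs(x1 - x2)))
    return result
```

Calling `step_gaps` with the logged steps and losses of the fine-tuned run and the from-scratch run gives one gap per shared loss level. Plotting those gaps against the level would show whether the distance grows toward convergence, as the README suggests.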