Update README.md
README.md CHANGED
@@ -88,17 +88,6 @@ This model was developed based on [NVIDIA-Nemotron-Nano-12B-v2-Base](https://hug

*The model was pruned and distilled from [NVIDIA-Nemotron-Nano-12B-v2-Base](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-Base) with \~480B tokens.*

-## Computational load
-
-Cumulative compute : 1.48E+24 FLOPS
-
-Estimate energy and emissions for model training: 724.0 MWh
-
-| | \# of tokens | Compute \[FLOPS\] | Energy \[MWh\] |
-| :---- | :---- | :---- | :---- |
-| 12B Base Pre-training | 20T | 1.45E+24 | 708.3 |
-| 9B Pruning & Distillation | 480B | 3.28E+22 | 15.7 |
-| Total | 20.6T | 1.48E+24 | 724.0 |

## Input
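For reference, a minimal Python sketch (not part of the commit) that re-derives the totals row of the removed "Computational load" table from its two stage rows. The compute and energy totals match the table exactly; the token total comes out slightly lower because the 20T and 480B inputs are themselves rounded.

```python
# Sanity check of the removed "Computational load" table:
# the Total row should equal the sum of the two training stages.

stages = {
    # stage: (tokens, compute in FLOPS, energy in MWh) -- values from the removed table
    "12B Base Pre-training":     (20e12, 1.45e24, 708.3),
    "9B Pruning & Distillation": (480e9, 3.28e22, 15.7),
}

total_tokens  = sum(v[0] for v in stages.values())
total_compute = sum(v[1] for v in stages.values())
total_energy  = sum(v[2] for v in stages.values())

print(f"Tokens : {total_tokens / 1e12:.1f}T")  # ~20.5T with rounded inputs; table reports 20.6T
print(f"Compute: {total_compute:.2e} FLOPS")   # ~1.48e+24, matching the table
print(f"Energy : {total_energy:.1f} MWh")      # 724.0, matching the table
```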