Modeling code for Mistral to use with [Nanotron](https://github.com/huggingface/nanotron/)

Also contains converted pretrained weights for Mistral-7B-0.1: https://huggingface.co/mistralai/Mistral-7B-v0.1

## 🚀 Quickstart

```bash
# Generate a config file
python config_tiny_mistral.py

export CUDA_DEVICE_MAX_CONNECTIONS=1 # important for some distributed operations
torchrun --nproc_per_node=8 run_train.py --config-file config_tiny_mistral.yaml
```
## 🚀 Run generation with pretrained Mistral-7B-0.1

```bash
export CUDA_DEVICE_MAX_CONNECTIONS=1
torchrun --nproc_per_node=1 run_generate.py --ckpt-path ./pretrained/Mistral-7B-v0.1
```
## 🚀 Use your custom model

- Update the `MistralConfig` class in `config_tiny_mistral.py` to match your model's configuration
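As an illustrative sketch of what that update looks like: the authoritative field list lives in `config_tiny_mistral.py` itself, but a Mistral-style config is typically a dataclass carrying the hyperparameters below. The field names here are assumptions based on common Mistral/Llama conventions, not a copy of the file; the override values are Mistral-7B-v0.1's published hyperparameters.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the MistralConfig dataclass in config_tiny_mistral.py.
# Field names are illustrative; check the real file for the exact schema.
@dataclass
class MistralConfig:
    hidden_size: int = 512            # model width
    intermediate_size: int = 1536     # MLP inner dimension
    num_hidden_layers: int = 8        # transformer blocks
    num_attention_heads: int = 8      # query heads
    num_key_value_heads: int = 4      # KV heads (grouped-query attention)
    vocab_size: int = 32000
    max_position_embeddings: int = 4096
    rms_norm_eps: float = 1e-5


def mistral_7b_config() -> MistralConfig:
    """Override the tiny defaults with Mistral-7B-v0.1's published sizes."""
    return MistralConfig(
        hidden_size=4096,
        intermediate_size=14336,
        num_hidden_layers=32,
        num_attention_heads=32,
        num_key_value_heads=8,   # 4 query heads share each KV head
        vocab_size=32000,
        max_position_embeddings=32768,
        rms_norm_eps=1e-5,
    )
```

After editing the config class, regenerate the YAML (`python config_tiny_mistral.py`) so the training run picks up the new shapes.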