Modeling code for Mistral to use with [Nanotron](https://github.com/huggingface/nanotron/)

Also contains converted pretrained weights for Mistral-7B-0.1: https://huggingface.co/mistralai/Mistral-7B-v0.1

## 🚀 Quickstart

```bash
# Generate a config file
python config_tiny_mistral.py

export CUDA_DEVICE_MAX_CONNECTIONS=1 # important for some distributed operations
torchrun --nproc_per_node=8 run_train.py --config-file config_tiny_mistral.yaml
```
## 🚀 Run generation with pretrained Mistral-7B-0.1

```bash
export CUDA_DEVICE_MAX_CONNECTIONS=1
torchrun --nproc_per_node=1 run_generate.py --ckpt-path ./pretrained/Mistral-7B-v0.1
```
## 🚀 Use your custom model

- Update the `MistralConfig` class in `config_tiny_mistral.py` to match your model's configuration
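As an illustrative sketch of what that update looks like: the authoritative field list lives in `config_tiny_mistral.py` itself, but a Mistral-style config is typically a dataclass carrying the hyperparameters below. The field names here are assumptions based on common Mistral/Llama conventions, not a copy of the file; the override values are Mistral-7B-v0.1's published hyperparameters.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the MistralConfig dataclass in config_tiny_mistral.py.
# Field names are illustrative; check the real file for the exact schema.
@dataclass
class MistralConfig:
    hidden_size: int = 512            # model width
    intermediate_size: int = 1536     # MLP inner dimension
    num_hidden_layers: int = 8        # transformer blocks
    num_attention_heads: int = 8      # query heads
    num_key_value_heads: int = 4      # KV heads (grouped-query attention)
    vocab_size: int = 32000
    max_position_embeddings: int = 4096
    rms_norm_eps: float = 1e-5


def mistral_7b_config() -> MistralConfig:
    """Override the tiny defaults with Mistral-7B-v0.1's published sizes."""
    return MistralConfig(
        hidden_size=4096,
        intermediate_size=14336,
        num_hidden_layers=32,
        num_attention_heads=32,
        num_key_value_heads=8,   # 4 query heads share each KV head
        vocab_size=32000,
        max_position_embeddings=32768,
        rms_norm_eps=1e-5,
    )
```

After editing the config class, regenerate the YAML (`python config_tiny_mistral.py`) so the training run picks up the new shapes.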