add simpler instructions on how to put this model together
README.md
@@ -104,6 +104,33 @@ for sentence in output:
 print(text)
 ```
 
+# To use this as a normal HuggingFace model
+
+If you want to use this model with HF Trainer, here is a quick way to do that:
+
+1. Download the NVIDIA checkpoint:
+```
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_lm_345m/versions/v0.0/zip -O megatron_lm_345m_v0.0.zip
+```
+
+2. Convert:
+```
+python /src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
+```
+
+3. Fetch the missing files:
+```
+git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
+```
+
+4. Move the converted files into the cloned model dir:
+```
+mv config.json pytorch_model.bin megatron-gpt2-345m/
+```
+
+5. The `megatron-gpt2-345m` dir should now contain all the files it needs and can be passed to HF Trainer as `--model_name_or_path megatron-gpt2-345m`.
+
+
 # Original code
 
 The original Megatron code can be found here: [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM).
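
The steps added in this change end with a `megatron-gpt2-345m` directory that holds both the cloned files and the converted `config.json` and `pytorch_model.bin`. Before passing that directory to HF Trainer, a quick sanity check can confirm the move in step 4 actually happened. The sketch below is a minimal illustration using only the Python standard library. The helper name `missing_model_files` and the `CONVERTED_FILES` tuple are hypothetical and not part of the README or of `transformers`. It checks only the two files named in step 4, and it does not validate their contents.

```python
from pathlib import Path

# Assumption: only the two files moved in step 4 are checked here.
# Whatever else the cloned repo provides is not verified.
CONVERTED_FILES = ("config.json", "pytorch_model.bin")


def missing_model_files(model_dir, required=CONVERTED_FILES):
    """Return the names in `required` that are not regular files in `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Directory name taken from the `git clone` step.
    missing = missing_model_files("megatron-gpt2-345m")
    if missing:
        print("Missing from megatron-gpt2-345m:", ", ".join(missing))
    else:
        print("megatron-gpt2-345m contains the converted files")
```

If the list comes back empty, the directory at least has the converted weights and config in place. A non-empty list most likely means the conversion output was written somewhere other than the current directory.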