nvidia/megatron-gpt2-345m

stas committed on May 11, 2021

Commit

10c4153

1 Parent(s): 0efaa23

add simpler instructions on how to put this model together

Files changed (1)

README.md

@@ -104,6 +104,33 @@ for sentence in output:
     print(text)
 ```
+# To use this as a normal HuggingFace model
+If you want to use this model with HF Trainer, here is a quick way to do that:
+1. Download the NVIDIA checkpoint:
+```
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_lm_345m/versions/v0.0/zip -O megatron_lm_345m_v0.0.zip
+```
+2. Convert:
+```
+python /src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
+```
+3. Fetch the missing files:
+```
+git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
+```
+4. Move the converted files into the cloned model dir:
+```
+mv config.json pytorch_model.bin megatron-gpt2-345m/
+```
+5. The `megatron-gpt2-345m` dir should now contain all the required files and can be passed to HF Trainer as `--model_name_or_path megatron-gpt2-345m`.
 # Original code
 The original Megatron code can be found here: [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM).
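
Before passing the directory to HF Trainer, it may help to confirm that step 4 actually placed the converted files where expected. The following is a minimal sketch, not part of the commit or of `transformers`: the helper name `missing_model_files` is hypothetical, and it checks only for the two files named in step 4 (`config.json` and `pytorch_model.bin`). The files fetched by the clone in step 3 are not listed in the instructions, so they are not checked here.

```python
from pathlib import Path

# Files that step 4 moves into the cloned model dir.
# Assumption: only these two are checked; the files that come from
# the git clone in step 3 are not enumerated in the instructions.
CONVERTED_FILES = ("config.json", "pytorch_model.bin")


def missing_model_files(model_dir, required=CONVERTED_FILES):
    """Return the names in `required` that are absent from `model_dir`.

    Hypothetical helper for illustration; an empty list means every
    required file is present.
    """
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Directory name taken from the instructions above.
    missing = missing_model_files("megatron-gpt2-345m")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All converted files are in place.")
```

An empty result only shows that the converted files are present; it does not validate their contents.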