Update README.md
README.md CHANGED

````diff
@@ -96,7 +96,7 @@ output = model(encoded_input)
 
 ## Training data
 
-CLUECorpusSmall is used as training data. We found that models pre-trained on CLUECorpusSmall outperform those pre-trained on CLUECorpus2020, although CLUECorpus2020 is much larger than CLUECorpusSmall.
+[CLUECorpusSmall](https://github.com/CLUEbenchmark/CLUECorpus2020/) is used as training data. We found that models pre-trained on CLUECorpusSmall outperform those pre-trained on CLUECorpus2020, although CLUECorpus2020 is much larger than CLUECorpusSmall.
 
 ## Training procedure
 
@@ -142,6 +142,13 @@ python3 pretrain.py --dataset_path cluecorpussmall_seq512_dataset.pt \
 --tie_weights --embedding word_pos_seg --encoder transformer --mask fully_visible --target mlm
 ```
 
+Finally, we convert the pre-trained model into Huggingface's format:
+```
+python3 scripts/convert_bert_from_uer_to_huggingface.py --input_model_path pytorch_model.bin \
+    --output_model_path huggingface_model.bin \
+    --layers_num 12 --target mlm
+```
+
 ### BibTeX entry and citation info
 
 ```
````
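At its core, a UER-to-Huggingface conversion like the one the added hunk invokes amounts to renaming the parameters in a saved state dict from one library's naming scheme to the other's. The sketch below illustrates that idea only; the parameter names in `KEY_MAP` are assumed examples, not the actual names used by `scripts/convert_bert_from_uer_to_huggingface.py`.

```python
# Illustrative sketch of a checkpoint-format conversion: rename state-dict
# keys from a source naming scheme to a target one. The key names below are
# ASSUMED for illustration and do not reproduce the real UER/Huggingface
# mapping implemented by scripts/convert_bert_from_uer_to_huggingface.py.

def rename_keys(state_dict, key_map):
    """Return a new state dict with keys renamed per key_map;
    keys without a mapping are kept unchanged."""
    return {key_map.get(k, k): v for k, v in state_dict.items()}

# Hypothetical UER-style parameter names -> Huggingface-BERT-style names.
KEY_MAP = {
    "embedding.word_embedding.weight": "bert.embeddings.word_embeddings.weight",
    "embedding.position_embedding.weight": "bert.embeddings.position_embeddings.weight",
}

uer_state = {
    "embedding.word_embedding.weight": [0.1, 0.2],
    "embedding.position_embedding.weight": [0.3],
}
hf_state = rename_keys(uer_state, KEY_MAP)
print(sorted(hf_state))
```

In the real script, the renamed tensors would then be written back out with `torch.save`, producing the `huggingface_model.bin` that `from_pretrained` can load.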