IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese

text2text-generation

dongxiaoqun committed on Jul 12, 2022

Commit b9f35e9 · 1 Parent(s): 0e66ef9

Update README.md

Files changed (1)

README.md +1 -1

@@ -5,7 +5,7 @@ tags:
 inference: False
 ---
-IDEA-CCNL/Randeng_Pegasus_238M_Summary_Chinese model (Chinese) has 238M million parameter, pretrained on 180G Chinese data with GSG task which is stochastically sample important sentences with sampled gap sentence ratios by 25%. The pretraining task just as same as the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization mentioned.
+IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese model (Chinese) has 238M million parameter, pretrained on 180G Chinese data with GSG task which is stochastically sample important sentences with sampled gap sentence ratios by 25%. The pretraining task just as same as the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization mentioned.
 Different from the English version of pegasus, considering that the Chinese sentence piece is unstable, we use jieba and Bertokenizer as the tokenizer in chinese pegasus model.
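
As a rough illustration of the GSG (gap sentence generation) objective that the README text describes, the sketch below scores each sentence by its overlap with the rest of the document, adds random noise so that the selection is stochastic, and masks the top 25%. The character-level overlap score, the `MASK_TOKEN` string, the `noise` parameter, and all function names are assumptions made for this example; they are not taken from the model's actual pretraining code.

```python
import random
from collections import Counter

# Hypothetical placeholder; the real model's mask token may differ.
MASK_TOKEN = "[MASK_SENT]"


def overlap_f1(candidate, reference):
    """ROUGE-1-style F1 between two strings, computed over characters."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def gap_sentence_example(sentences, gap_ratio=0.25, noise=0.1, seed=0):
    """Build one (source, target) pair by masking the top-scoring sentences."""
    rng = random.Random(seed)
    num_gaps = max(1, round(len(sentences) * gap_ratio))
    scores = []
    for i, sentence in enumerate(sentences):
        rest = "".join(s for j, s in enumerate(sentences) if j != i)
        # Random noise makes the selection stochastic rather than strictly top-k.
        scores.append(overlap_f1(sentence, rest) + rng.uniform(0, noise))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    gap_ids = sorted(ranked[:num_gaps])
    # The encoder input keeps unselected sentences and masks the selected ones.
    source = "".join(
        MASK_TOKEN if i in gap_ids else s for i, s in enumerate(sentences)
    )
    # The decoder target is the concatenation of the masked sentences.
    target = "".join(sentences[i] for i in gap_ids)
    return source, target, gap_ids
```

Calling `gap_sentence_example(sentences)` on a list of eight sentences masks two of them and returns the masked document, the concatenated gap sentences, and their indices. In the setup the README describes, the resulting text would then be tokenized with jieba and a BERT-style tokenizer before training.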