wanng committed
Commit cebb3f0 · 1 Parent(s): 470983a

Update README.md

Files changed (1):
  1. README.md +13 -0
README.md CHANGED
@@ -2,9 +2,22 @@
 language: zh
 tags:
 - summarization
+- chinese
 inference: False
 ---
 
+# Randeng-Pegasus-238M-Summary-Chinese
+
+- Github: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
+- Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)
+
+## 简介 Brief Introduction
+
+善于处理摘要任务的,中文版的PEGASUS-base。
+
+Good at solving text summarization tasks, Chinese PEGASUS-base.
+
+
 The IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese model has 238M parameters and was pretrained on 180 GB of Chinese data with the GSG (gap-sentence generation) task, which stochastically samples important sentences at a gap-sentence ratio of 25%. The pretraining task is the same as described in the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.
 
 Different from the English version of PEGASUS, and because SentencePiece is unstable for Chinese, we use jieba and BertTokenizer as the tokenizer in the Chinese PEGASUS model.
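
To make the GSG objective mentioned in the README above concrete, here is a minimal, self-contained sketch of gap-sentence generation. It is illustrative only, not the actual Fengshenbang pretraining code: the `make_gsg_example` helper, the `[MASK]` marker, and the uniform random sampling (standing in for the paper's importance-based sentence selection) are all assumptions.

```python
import random

# Illustrative sketch of the GSG (gap-sentence generation) objective -- NOT
# the actual Fengshenbang pretraining code. Uniform random sampling stands in
# for importance-based sentence selection.
MASK_TOKEN = "[MASK]"       # assumed mask marker
GAP_SENTENCE_RATIO = 0.25   # ratio quoted in the README

def make_gsg_example(sentences, ratio=GAP_SENTENCE_RATIO):
    """Pick ~25% of sentences as gap sentences: mask them in the encoder
    input and concatenate them as the decoder target."""
    n_gaps = max(1, round(len(sentences) * ratio))
    gap_ids = set(random.sample(range(len(sentences)), n_gaps))
    source = "".join(MASK_TOKEN if i in gap_ids else s
                     for i, s in enumerate(sentences))
    target = "".join(sentences[i] for i in sorted(gap_ids))
    return source, target

src, tgt = make_gsg_example(
    ["今天天气很好。", "我们去公园散步。", "公园里人很多。", "大家都玩得很开心。"]
)
print(src)  # three sentences kept, one replaced by [MASK]
print(tgt)  # the masked-out sentence, used as the generation target
```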
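The tokenizer note in the last paragraph can likewise be illustrated with a short usage sketch. This assumes the checkpoint loads with the standard transformers Pegasus model class and a BERT-style vocabulary; the Fengshenbang-LM repo may ship its own jieba-based Pegasus tokenizer, in which case prefer that and follow the project docs.

```python
# A minimal usage sketch, assuming the checkpoint works with the standard
# transformers Pegasus model class and a BERT-style vocabulary, per the
# "jieba + BertTokenizer" note above. The Fengshenbang-LM repo may ship its
# own jieba-based Pegasus tokenizer; prefer that if this decodes poorly.
from transformers import BertTokenizer, PegasusForConditionalGeneration

model_id = "IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = PegasusForConditionalGeneration.from_pretrained(model_id)

text = "据报道,该公司本季度营收同比增长两成,主要得益于海外市场的持续扩张。"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Pass input_ids only: Pegasus does not accept BERT's token_type_ids.
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```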