wanng committed
Commit cebb3f0 · 1 Parent(s): 470983a

Update README.md

Files changed (1):
  1. README.md +13 -0
README.md CHANGED
@@ -2,9 +2,22 @@
 language: zh
 tags:
 - summarization
+- chinese
 inference: False
 ---
 
+# Randeng-Pegasus-238M-Summary-Chinese
+
+- Github: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
+- Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)
+
+## 简介 Brief Introduction
+
+善于处理摘要任务的,中文版的PEGASUS-base。
+
+Good at solving text summarization tasks, Chinese PEGASUS-base.
+
+
 The IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese model has 238M parameters and was pretrained on 180 GB of Chinese data with the GSG (gap-sentence generation) task, which stochastically samples important sentences at a gap-sentence ratio of 25%. The pretraining task is the same as described in the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.
 
 Different from the English version of PEGASUS, and because SentencePiece is unstable for Chinese, we use jieba and BertTokenizer as the tokenizer in the Chinese PEGASUS model.
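
To make the GSG objective mentioned in the README above concrete, here is a minimal, self-contained sketch of gap-sentence generation. It is illustrative only, not the actual Fengshenbang pretraining code: the `make_gsg_example` helper, the `[MASK]` marker, and the uniform random sampling (standing in for the paper's importance-based sentence selection) are all assumptions.

```python
import random

# Illustrative sketch of the GSG (gap-sentence generation) objective -- NOT
# the actual Fengshenbang pretraining code. Uniform random sampling stands in
# for importance-based sentence selection.
MASK_TOKEN = "[MASK]"       # assumed mask marker
GAP_SENTENCE_RATIO = 0.25   # ratio quoted in the README

def make_gsg_example(sentences, ratio=GAP_SENTENCE_RATIO):
    """Pick ~25% of sentences as gap sentences: mask them in the encoder
    input and concatenate them as the decoder target."""
    n_gaps = max(1, round(len(sentences) * ratio))
    gap_ids = set(random.sample(range(len(sentences)), n_gaps))
    source = "".join(MASK_TOKEN if i in gap_ids else s
                     for i, s in enumerate(sentences))
    target = "".join(sentences[i] for i in sorted(gap_ids))
    return source, target

src, tgt = make_gsg_example(
    ["今天天气很好。", "我们去公园散步。", "公园里人很多。", "大家都玩得很开心。"]
)
print(src)  # three sentences kept, one replaced by [MASK]
print(tgt)  # the masked-out sentence, used as the generation target
```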
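The tokenizer note in the last paragraph can likewise be illustrated with a short usage sketch. This assumes the checkpoint loads with the standard transformers Pegasus model class and a BERT-style vocabulary; the Fengshenbang-LM repo may ship its own jieba-based Pegasus tokenizer, in which case prefer that and follow the project docs.

```python
# A minimal usage sketch, assuming the checkpoint works with the standard
# transformers Pegasus model class and a BERT-style vocabulary, per the
# "jieba + BertTokenizer" note above. The Fengshenbang-LM repo may ship its
# own jieba-based Pegasus tokenizer; prefer that if this decodes poorly.
from transformers import BertTokenizer, PegasusForConditionalGeneration

model_id = "IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = PegasusForConditionalGeneration.from_pretrained(model_id)

text = "据报道,该公司本季度营收同比增长两成,主要得益于海外市场的持续扩张。"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Pass input_ids only: Pegasus does not accept BERT's token_type_ids.
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```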