---
license: apache-2.0
language:
- zh
library_name: transformers
tags:
- Roberta
- Chinese Pre-trained Language Model
---

Please use the `XLMRoberta` model and tokenizer classes from `transformers` to load this model, as in the sketch below.
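
A minimal loading sketch. The `model_path` value is a placeholder, not the actual repository ID; point it at this model's Hub ID or a local checkpoint:

```python
from transformers import XLMRobertaTokenizer, XLMRobertaModel

# Placeholder path: replace with this repository's Hub ID or a local checkpoint.
model_path = "path/to/MigBERT"

tokenizer = XLMRobertaTokenizer.from_pretrained(model_path)
model = XLMRobertaModel.from_pretrained(model_path)

# Encode a Chinese sentence and extract contextual representations.
inputs = tokenizer("今天天气真好", return_tensors="pt")
outputs = model(**inputs)
hidden_states = outputs.last_hidden_state  # shape: (batch, seq_len, hidden_size)
```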

# MigBERT | A Chinese Mixed-Granularity Pre-trained Language Model
[Character, Word, or Both? Revisiting the Segmentation Granularity for Chinese Pre-trained Language Models](https://arxiv.org/abs/2303.10893)

# Demo | Usage Examples
https://github.com/xnliang98/MigBERT

# Citation
If you find our resources or paper useful, please consider citing the following in your work.

```bibtex
@misc{liang2023character,
      title={Character, Word, or Both? Revisiting the Segmentation Granularity for Chinese Pre-trained Language Models}, 
      author={Xinnian Liang and Zefan Zhou and Hui Huang and Shuangzhi Wu and Tong Xiao and Muyun Yang and Zhoujun Li and Chao Bian},
      year={2023},
      eprint={2303.10893},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```