Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ We trimmed vocabulary size to 50,589 and continually pretrained `google/mt5-base
 - Internal data (100M comments and 15M posts on Facebook)
 - UIT data[2], which is used to pretrain `uitnlp/visobert`[2]
 - MC4 ecommerce
-- 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`
+- 10.7M comments on VOZ Forum from `tarudesu/VOZ-HSD`[7]
 - 3.6M reviews from Amazon[3] translated into Vietnamese from `5CD-AI/Vietnamese-amazon_polarity-gg-translated`
 
 Here are the results on 3 downstream tasks on Vietnamese social media texts, including Hate Speech Detection(UIT-HSD), Toxic Speech Detection(ViCTSD), Hate Spans Detection(ViHOS):