Text2Text Generation · Transformers · Safetensors · mt5 · Inference Endpoints
htdung167 committed · Commit 99af8ed · verified · 1 Parent(s): ec05cf7

Update README.md

Files changed (1): README.md +15 -17
README.md CHANGED
@@ -6,10 +6,12 @@ tags: []
  # 5CD-AI/visocial-T5-base
  ## Overview
  <!-- Provide a quick summary of what the model is/does. -->
- <!-- We continually pretrain `uitnlp/visobert` on a merged 14GB dataset, the training dataset includes:
  - Internal data (100M comments and 15M posts on Facebook)
  - UIT data, which is used to pretrain `uitnlp/visobert`
- - MC4 ecommerce -->

  Here are the results on 3 downstream tasks on Vietnamese social media texts, including Hate Speech Detection (UIT-HSD), Toxic Speech Detection (ViCTSD), Hate Spans Detection (ViHOS):
  <table>
@@ -127,21 +129,17 @@ model_path = "5CD-AI/visobert-14gb-corpus"
  mask_filler = pipeline("fill-mask", model_path)

  mask_filler("shop làm ăn như cái <mask>", top_k=10)
- ```

  ## Fine-tune Configuration
- We fine-tune `5CD-AI/visobert-14gb-corpus` on 4 downstream tasks with `transformers` library with the following configuration:
  - seed: 42
- - gradient_accumulation_steps: 1
- - weight_decay: 0.01
- - optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- - training_epochs: 30
- - model_max_length: 128
- - learning_rate: 1e-5
- - metric_for_best_model: wf1
- - strategy: epoch
-
- And different additional configurations for each task:
- | Emotion Recognition | Hate Speech Detection | Spam Reviews Detection | Hate Speech Spans Detection |
- | --- | --- | --- | --- |
- | \- train_batch_size: 64<br>\- lr_scheduler_type: linear | \- train_batch_size: 32<br>\- lr_scheduler_type: linear | \- train_batch_size: 32<br>\- lr_scheduler_type: cosine | \- train_batch_size: 32<br>\- lr_scheduler_type: cosine | -->
 
  # 5CD-AI/visocial-T5-base
  ## Overview
  <!-- Provide a quick summary of what the model is/does. -->
+ We continually pretrain `google/mt5-base` on a merged 20GB dataset (see the loading sketch after this list); the training data includes:
  - Internal data (100M comments and 15M posts on Facebook)
  - UIT data, which is used to pretrain `uitnlp/visobert`
+ - MC4 ecommerce
+ - 10.7M comments on the VOZ Forum from `tarudesu/VOZ-HSD`
+ - 3.6M Amazon reviews translated into Vietnamese from `5CD-AI/Vietnamese-amazon_polarity-gg-translated`
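
Not part of the original card: a minimal sketch, assuming the checkpoint is hosted as `5CD-AI/visocial-T5-base`, of loading it for text-to-text inference with `transformers`:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_path = "5CD-AI/visocial-T5-base"  # assumed hosted name
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

# mT5 is text-to-text: encode a prompt, generate, decode.
# A continually pretrained checkpoint is intended for fine-tuning, so raw
# generations may resemble span-corruption targets rather than answers.
inputs = tokenizer("xin chào", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```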
 
  Here are the results on 3 downstream tasks on Vietnamese social media texts, including Hate Speech Detection (UIT-HSD), Toxic Speech Detection (ViCTSD), Hate Spans Detection (ViHOS):
  <table>
 
  mask_filler = pipeline("fill-mask", model_path)

  mask_filler("shop làm ăn như cái <mask>", top_k=10)
+ ``` -->

  ## Fine-tune Configuration
+ We fine-tune `5CD-AI/visocial-T5-base` on 3 downstream tasks with the `transformers` library, using the following configuration (see the sketch after this list):
  - seed: 42
+ - training_epochs: 4
+ - train_batch_size: 4
+ - gradient_accumulation_steps: 8
+ - learning_rate: 3e-4
+ - lr_scheduler_type: linear
+ - model_max_length: 256
+ - metric_for_best_model: eval_loss
+ - evaluation_strategy: steps
+ - eval_steps: 0.1
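
A hedged sketch (not the authors' training script) of how the values above map onto `Seq2SeqTrainingArguments` in `transformers`; `output_dir`, `save_strategy`, `save_steps`, and `load_best_model_at_end` are assumptions added so `metric_for_best_model` can take effect:

```python
from transformers import Seq2SeqTrainingArguments, set_seed

set_seed(42)  # seed: 42

training_args = Seq2SeqTrainingArguments(
    output_dir="visocial-t5-base-finetune",  # assumed name, not from the card
    num_train_epochs=4,                      # training_epochs: 4
    per_device_train_batch_size=4,           # train_batch_size: 4
    gradient_accumulation_steps=8,           # effective batch size 4 * 8 = 32
    learning_rate=3e-4,
    lr_scheduler_type="linear",
    metric_for_best_model="eval_loss",
    evaluation_strategy="steps",
    eval_steps=0.1,                          # float < 1: fraction of total training steps
    save_strategy="steps",                   # assumption: must match evaluation_strategy
    save_steps=0.1,                          # assumption: checkpoint at the same cadence
    load_best_model_at_end=True,             # assumption: pairs with metric_for_best_model
)

# model_max_length: 256 is a tokenizer setting, not a TrainingArguments field:
# tokenizer = AutoTokenizer.from_pretrained(model_path, model_max_length=256)
```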