DTAI-KULeuven/robbert-v2-dutch-sentiment

Text Classification


Pieter Delobelle committed on Jun 29, 2022

Commit ce4aaf6 · 1 Parent(s): a038861

Update README.md

Files changed (1)

README.md +2 -1


@@ -38,7 +38,7 @@ This is a finetuned model based on [RobBERT (v2)](https://huggingface.co/pdelobe
 # Training data and setup
 We used the [Dutch Book Reviews Dataset (DBRD)](https://huggingface.co/datasets/dbrd) from van der Burgh et al. (2019).
-Originally, these reviews got a five-star rating, but this has been converted to positive(⭐️⭐️⭐️⭐️ and ⭐️⭐️⭐️⭐️⭐️), neutral (⭐️⭐️⭐️) and negative (⭐️ and ⭐️⭐️).
 We used 19.5k reviews for the training set, 528 reviews for the validation set and 2224 to calculate the final accuracy.
 The validation set was used to evaluate a random hyperparameter search over the learning rate, weight decay and gradient accumulation steps.
@@ -47,6 +47,7 @@ The full training details are available in [`training_args.bin`](https://hugging
 # Limitations and biases
 - The domain of the reviews is limited to book reviews.
 - Most authors of the book reviews were women, which could have caused [a difference in performance for reviews written by men and women](https://www.aclweb.org/anthology/2020.findings-emnlp.292).
 ## Credits and citation

 # Training data and setup
 We used the [Dutch Book Reviews Dataset (DBRD)](https://huggingface.co/datasets/dbrd) from van der Burgh et al. (2019).
+Originally, these reviews got a five-star rating, but this has been converted to positive (⭐️⭐️⭐️⭐️ and ⭐️⭐️⭐️⭐️⭐️), neutral (⭐️⭐️⭐️) and negative (⭐️ and ⭐️⭐️).
 We used 19.5k reviews for the training set, 528 reviews for the validation set and 2224 to calculate the final accuracy.
 The validation set was used to evaluate a random hyperparameter search over the learning rate, weight decay and gradient accumulation steps.
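The star-to-label conversion described above can be sketched as follows. This is only an illustration of the mapping stated in the README, not the authors' preprocessing code; the function name `stars_to_sentiment` is hypothetical.

```python
def stars_to_sentiment(stars: int) -> str:
    """Map a 1-5 star rating to a coarse sentiment label.

    Hypothetical helper: it mirrors the mapping described in the README
    (4-5 stars positive, 3 stars neutral, 1-2 stars negative).
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"expected a rating from 1 to 5, got {stars}")
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


# Label every possible rating, from one star to five stars.
labels = [stars_to_sentiment(s) for s in range(1, 6)]
```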
 # Limitations and biases
 - The domain of the reviews is limited to book reviews.
 - Most authors of the book reviews were women, which could have caused [a difference in performance for reviews written by men and women](https://www.aclweb.org/anthology/2020.findings-emnlp.292).
+- This is _not_ the same model as the one discussed in our paper. Due to conversion issues between the original training two years ago and now, it was easier to retrain the model. The accuracy is slightly lower, but this model was trained on the beginning of the reviews instead of the end.
 ## Credits and citation