---
license: mit
datasets:
- xnli
language:
- fr
metrics:
- accuracy
pipeline_tag: zero-shot-classification
---
# XLM-ROBERTA-BASE-XNLI_FR
## Model description
This model takes the XLM-RoBERTa base model, which was further pre-trained on a large multilingual corpus of tweets.
It was developed following a strategy similar to the one introduced in the [Tweet Eval](https://github.com/cardiffnlp/tweeteval) framework.
The model was then fine-tuned on the French portion of the XNLI training dataset.
## Intended Usage
This model was developed for Zero-Shot Text Classification in the realm of Hate Speech Detection. It is focused on French, as it was fine-tuned on data in that language. Since the base model was pre-trained on 100 different languages, it has also shown some effectiveness in other languages. Please refer to the list of languages in the [XLM Roberta paper](https://arxiv.org/abs/1911.02116).
### Usage with Zero-Shot Classification pipeline
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="morit/french_xlm_xnli")
```
After loading the model you can classify sequences in the languages mentioned above. Specify the sequence to classify, a set of candidate labels, and a hypothesis template that turns each label into an NLI hypothesis.
```python
# "I think Macron will win the elections?"
sequence_to_classify = "Je pense que Macron va gagner les élections?"
# we can specify candidate labels and a hypothesis template:
candidate_labels = ["politique", "sport"]
hypothesis_template = "Cet exemple est {}"  # "This example is {}"
# classify using the information provided
classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)
# Output
# {'sequence': 'Je pense que Macron va gagner les élections?',
#  'labels': ['politique', 'sport'],
#  'scores': [0.8195879459381104, 0.18041200935840607]}
```
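Under the hood, the zero-shot pipeline plugs each candidate label into the hypothesis template and runs a standard NLI forward pass, then softmaxes the entailment logits across labels. A minimal sketch of that equivalent raw call; reading the entailment index from `label2id` (with `-1` as fallback) mirrors the pipeline's behavior but is an assumption about this checkpoint's label config:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("morit/french_xlm_xnli")
model = AutoModelForSequenceClassification.from_pretrained("morit/french_xlm_xnli")

premise = "Je pense que Macron va gagner les élections?"
hypothesis = "Cet exemple est politique"  # one label plugged into the template

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0]

# The pipeline keeps the entailment logit for every candidate label and
# softmaxes those logits across labels to produce the final scores.
entailment_id = model.config.label2id.get("entailment", -1)
print(logits[entailment_id])
```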
## Training
The base model was pre-trained on a set of 100 languages and then further trained on 198M multilingual tweets, as described in the original [paper](https://arxiv.org/abs/2104.12250). It was then fine-tuned on the French part of the XNLI training set, which is a machine-translated version of the MNLI dataset. The model was trained for 5 epochs on the XNLI train set and evaluated on the XNLI eval set at the end of every epoch; the checkpoint with the highest accuracy on the eval set was selected.
![Training Charts from wandb](screen_wandb.png)
- learning rate: 2e-5
- batch size: 32
- max sequence length: 128
Training was run on a single GPU (NVIDIA GeForce RTX 3090) and took 1 h 47 min.
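The exact training script was not published; the following is a minimal sketch of the described setup using the Hugging Face `Trainer`, with the hyperparameters listed above. The base checkpoint name `cardiffnlp/twitter-xlm-roberta-base` is an assumption based on the Tweet Eval reference:

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "cardiffnlp/twitter-xlm-roberta-base"  # assumed Twitter-pre-trained base
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

xnli = load_dataset("xnli", "fr")  # French XNLI: premise / hypothesis / label

def tokenize(batch):
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

xnli = xnli.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="xlm_xnli_fr",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=5,
    evaluation_strategy="epoch",   # evaluate at the end of every epoch
    save_strategy="epoch",
    load_best_model_at_end=True,   # keep the checkpoint with the best eval accuracy
    metric_for_best_model="accuracy",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=xnli["train"],
                  eval_dataset=xnli["validation"],
                  tokenizer=tokenizer,
                  compute_metrics=compute_metrics)
trainer.train()
```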
## Evaluation
The best performing model was evaluated on the XNLI test set to obtain a comparable result:
```
predict_accuracy = 78.02 %
```
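Assuming the `Trainer` sketch from the training section above, that number could be reproduced along these lines:

```python
# Hypothetical continuation of the fine-tuning sketch above.
preds = trainer.predict(xnli["test"])
print(f"predict_accuracy = {preds.metrics['test_accuracy'] * 100:.2f} %")
```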