---
library_name: transformers
tags:
- detoxification
- text_style_transfer
license: openrail++
datasets:
- s-nlp/synthdetoxm
language:
- de
- es
- fr
- ru
base_model:
- bigscience/mt0-xl
pipeline_tag: text2text-generation
---
# mT0-XL (SynthDetoxM Full)

This is a fine-tune of the [`bigscience/mt0-xl`](https://huggingface.co/bigscience/mt0-xl) model on the multilingual text detoxification dataset [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm), introduced in the NAACL 2025 Main Track paper *SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators* by Daniil Moskovskiy et al.
## Usage
The usage is similar to that of the base [`bigscience/mt0-xl`](https://huggingface.co/bigscience/mt0-xl) model: prepend the `Detoxify:` prefix to the toxic input and run it through a `text2text-generation` pipeline.
```python
from transformers import pipeline

# Load the detoxification model as a text2text-generation pipeline
pipe = pipeline("text2text-generation", model="s-nlp/mt0-xl-detox-sdm-full")

# The model expects the "Detoxify: " prefix before the toxic input
toxic_text = "Your toxic text goes here."
print(pipe(f"Detoxify: {toxic_text}"))
```
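For more control over decoding, the model can also be loaded explicitly. The sketch below uses standard Transformers APIs; the generation parameters shown are illustrative choices, not the settings used in the paper.
```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "s-nlp/mt0-xl-detox-sdm-full"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.float32)

toxic_text = "Your toxic text goes here."
inputs = tokenizer(f"Detoxify: {toxic_text}", return_tensors="pt")

# Illustrative decoding settings; adjust beam size and max length for your data
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```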
## Training Details
The model was fine-tuned for 2 epochs on the [`s-nlp/synthdetoxm`](https://huggingface.co/datasets/s-nlp/synthdetoxm) dataset in full precision (FP32) using the Adafactor optimizer with a learning rate of `1e-4` and a per-device batch size of `4`, with gradient checkpointing enabled. The full training configuration is given below:
```json
{
  "do_train": true,
  "do_eval": true,
  "per_device_train_batch_size": 4,
  "per_device_eval_batch_size": 4,
  "learning_rate": 1e-4,
  "weight_decay": 0,
  "num_train_epochs": 2,
  "gradient_accumulation_steps": 1,
  "logging_strategy": "steps",
  "logging_steps": 1,
  "save_strategy": "epoch",
  "save_total_limit": 1,
  "warmup_steps": 1,
  "report_to": "wandb",
  "optim": "adafactor",
  "lr_scheduler_type": "linear",
  "predict_with_generate": true,
  "bf16": false,
  "gradient_checkpointing": true,
  "output_dir": "/path/",
  "seed": 42
}
```
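The configuration above maps onto the standard `Seq2SeqTrainer` API. The following is a minimal sketch of such a run; the dataset column names (`toxic_sentence`, `neutral_sentence`) and the preprocessing are assumptions, and the exact training script is available in the GitHub repository linked below.
```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-xl")

dataset = load_dataset("s-nlp/synthdetoxm", split="train")

# Assumed column names for the toxic/neutral pair; adjust to the actual dataset schema
def preprocess(example):
    model_inputs = tokenizer(f"Detoxify: {example['toxic_sentence']}", truncation=True, max_length=128)
    labels = tokenizer(example["neutral_sentence"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, remove_columns=dataset.column_names)

# Settings mirror the JSON configuration above
args = Seq2SeqTrainingArguments(
    output_dir="mt0-xl-detox-sdm-full",
    per_device_train_batch_size=4,
    learning_rate=1e-4,
    num_train_epochs=2,
    optim="adafactor",
    lr_scheduler_type="linear",
    gradient_checkpointing=True,
    save_strategy="epoch",
    save_total_limit=1,
    seed=42,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```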
### Metrics
We use the multilingual detoxification evaluation setup from [TextDetox 2024 Multilingual Text Detoxification Shared Task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html).
Specifically, we use the following metrics:
- **Style Transfer Accuracy** (**STA**) is calculated with a [`textdetox/xlmr-large-toxicity-classifier`](https://huggingface.co/textdetox/xlmr-large-toxicity-classifier).
- **Text Similarity** (**SIM**) is calculated as a similarity of text embeddings given by a [`sentence-transformers/LaBSE`](https://huggingface.co/sentence-transformers/LaBSE) encoder.
- **Fluency** (**FL**) is calculated as the character n-gram F-score, [ChrF1](https://github.com/m-popovic/chrF).
These metrics are aggregated into a final **Joint** metric (**J**):

$$\textbf{J} = \frac{1}{n}\sum\limits_{i=1}^{n}\textbf{STA}(y_i) \cdot \textbf{SIM}(x_i, y_i) \cdot \textbf{FL}(x_i, y_i)$$

where $x_i$ is the toxic source sentence and $y_i$ the detoxified output.
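As a reference for the aggregation, here is a minimal sketch of the **J** computation; the `sta`, `sim`, and `fl` arrays are assumed to hold per-sample scores in [0, 1] produced by the models listed above.
```python
import numpy as np

def joint_score(sta, sim, fl):
    """Aggregate per-sample STA, SIM, and FL scores into the Joint (J) metric."""
    sta, sim, fl = map(np.asarray, (sta, sim, fl))
    return float(np.mean(sta * sim * fl))

# Toy example with three detoxified outputs (scores in [0, 1])
print(joint_score(sta=[1.0, 0.9, 0.8], sim=[0.95, 0.85, 0.90], fl=[0.90, 0.80, 0.85]))
```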
### Evaluation Results
This model was evaluated on the test set of [`textdetox/multilingual_paradetox`](https://huggingface.co/datasets/textdetox/multilingual_paradetox) dataset from [TextDetox 2024 Multilingual Text Detoxification Shared Task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html).
The results of the evaluation (the **J** metric described above) are presented below.
| **Approach** | **German** | **Spanish** | **Russian** |
|----------------|------------|-------------|-------------|
| **Human References** | 0.733 | 0.709 | 0.732 |
| **Baselines** | | | |
| Duplicate | 0.287 | 0.090 | 0.048 |
| Delete | 0.362 | 0.319 | 0.255 |
| Backtranslation| 0.233 | 0.275 | 0.223 |
| **mT0-XL supervised fine-tuning** | | | |
| [MultiParaDetox](https://huggingface.co/datasets/textdetox/multilingual_paradetox) [`s-nlp/mt0-xl-detox-mpd`](https://huggingface.co/s-nlp/mt0-xl-detox-mpd) | 0.446 | 0.344 | 0.472 |
| [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm) (Subset AVG) | 0.460 | 0.402 | 0.475 |
| [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm) (this model) | **0.482** | **0.470** | **0.546** |
#### Software
Code for replicating the results from the paper can be found on [GitHub](https://github.com/s-nlp/synthdetoxm).
## Citation
**BibTeX:**
```bibtex
@misc{moskovskiy2025synthdetoxmmodernllmsfewshot,
  title={SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators},
  author={Daniil Moskovskiy and Nikita Sushko and Sergey Pletenev and Elena Tutubalina and Alexander Panchenko},
  year={2025},
  eprint={2502.06394},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.06394},
}
```
## License
This model is licensed under the OpenRAIL++ License, which supports the development of various technologies—both industrial and academic—that serve the public good.
## Model Card Authors
[Daniil Moskovskiy](https://huggingface.co/etomoscow)
## Model Card Contact
For any questions, please contact [Daniil Moskovskiy]([email protected]).