---
license: apache-2.0
datasets:
  - allenai/real-toxicity-prompts
language:
  - en
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - toxic_classification
---

# SegmentCNN Model for Toxic Text Classification

## Overview

The SegmentCNN model, also known as SpanCNN, is designed for toxic text classification, distinguishing between safe and toxic content. This model is part of the research presented in the paper [CMD: A Framework for Context-aware Model Self-Detoxification](https://arxiv.org/abs/2308.08295).

## Model Details

- **Input**: Text data
- **Output**: Integer
  - `0` represents **safe** content
  - `1` represents **toxic** content

## Usage

To use the SegmentCNN model for toxic text classification, follow the example below:

```python
from transformers import pipeline

# Load the SpanCNN model via its custom pipeline (requires trust_remote_code)
classifier = pipeline("spancnn-classification", model="ZetangForward/SegmentCNN", trust_remote_code=True)

# Example 1: Safe text
pos_text = "You look good today~!"
result = classifier(pos_text)
print(result)  # Output: 0 (safe)

# Example 2: Toxic text
neg_text = "You're too stupid, you're just like a fool"
result = classifier(neg_text)
print(result)  # Output: 1 (toxic)
```

## Citation

If you find this model useful, please consider citing the original paper:

```bibtex
@article{tang2023detoxify,
  title={Detoxify language model step-by-step},
  author={Tang, Zecheng and Zhou, Keyan and Wang, Pinzheng and Ding, Yuyang and Li, Juntao and others},
  journal={arXiv preprint arXiv:2308.08295},
  year={2023}
}
```

## Disclaimer

While the SegmentCNN model is effective at detecting toxic segments within text, we strongly recommend that users carefully review its results and exercise caution when applying this method in real-world scenarios. The model is not infallible, and its outputs should be validated in context-sensitive applications.
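
## Batch Usage (Sketch)

If you need to screen several texts at once, the following is a minimal sketch. It assumes the custom `spancnn-classification` pipeline accepts a list of strings (as standard `transformers` pipelines do) and returns one integer label per input, matching the `0`/`1` convention shown above; adjust if the remote code behaves differently.

```python
from transformers import pipeline

# Assumption: the custom pipeline accepts a list of strings and returns
# one integer label per input (0 = safe, 1 = toxic).
classifier = pipeline(
    "spancnn-classification",
    model="ZetangForward/SegmentCNN",
    trust_remote_code=True,
)

texts = [
    "You look good today~!",
    "You're too stupid, you're just like a fool",
]

labels = classifier(texts)

# Keep only the texts predicted as safe.
safe_texts = [text for text, label in zip(texts, labels) if label == 0]
print(safe_texts)
```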