---
license: cc-by-nc-sa-4.0
language:
- en
metrics:
- f1
- accuracy
widget:
- text: Girls like attention and they get desperate
tags:
- sexism
datasets:
- tum-nlp/sexism-socialmedia-balanced
---

# BERTweet for sexism detection

This is a fine-tuned BERTweet large ([BERTweet: A pre-trained language model for English Tweets](https://aclanthology.org/2020.emnlp-demos.2/)) model for detecting sexism.
The training dataset, [sexism-socialmedia-balanced](https://huggingface.co/datasets/tum-nlp/sexism-socialmedia-balanced), is a **new balanced** version of Explainable Detection of Online Sexism ([**EDOS**](https://github.com/rewire-online/edos)). It consists of 16,000 English entries
gathered from two social media platforms, Twitter and Gab. The model achieved a **Macro-F1** score of **0.85** and an **Accuracy** of **0.88** on the test set for the EDOS task.
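
For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and macro-F1 are computed from gold and predicted labels. It is an illustration only: the label strings and the toy data are made up for this example and are not taken from the EDOS test set or from this model's output.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (hypothetical, for illustration only)
y_true = ["sexist", "not sexist", "sexist", "not sexist", "not sexist"]
y_pred = ["sexist", "not sexist", "not sexist", "not sexist", "sexist"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"macro-F1: {macro_f1(y_true, y_pred):.2f}")
```

Macro-F1 weights both classes equally regardless of how often each occurs, which is why it is commonly reported alongside accuracy.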

## How to use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('tum-nlp/bertweet-sexism')
model = AutoModelForSequenceClassification.from_pretrained('tum-nlp/bertweet-sexism')

# Create the pipeline for classification
sexism_classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Predict
sexism_classifier("Girls like attention and they get desperate")
```
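
A `text-classification` pipeline also accepts a list of strings and returns one dict with `label` and `score` keys per input. The following is a minimal sketch of a batch helper built on that behaviour; `classify_batch` is a hypothetical name introduced here, and the label strings the model returns depend on its configuration, which this card does not state.

```python
def classify_batch(classifier, texts):
    """Pair each input text with its predicted label and score.

    `classifier` is any callable that takes a list of strings and returns
    a list of {"label": ..., "score": ...} dicts, as a Hugging Face
    text-classification pipeline does.
    """
    texts = list(texts)
    predictions = classifier(texts)
    return [(text, pred["label"], pred["score"])
            for text, pred in zip(texts, predictions)]
```

With the pipeline created above, `classify_batch(sexism_classifier, ["first post", "second post"])` would return one `(text, label, score)` tuple per input.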

## Licensing Information

This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png