horychtom committed (verified)
Commit 5ae4d3e · 1 Parent(s): 3b2aede

Update README.md

Files changed (1):
  1. README.md +31 -80
README.md CHANGED
@@ -9,101 +9,52 @@ base_model:
  pipeline_tag: text-classification
  ---
 
 
- This model is a version of [mediabiasgrouup/magpie-pt-xlm](https://huggingface.co/mediabiasgroup/magpie-pt-xlm), fine-tuned on BABE dataset for sentence-level media bias binary classification.
- The model is based on [model architecture, e.g., BERT, RoBERTa, etc.] and has been fine-tuned on [mention the dataset] for [number of epochs or other training details].
-
 
  ---
 
  ## Training details
 
- - **Base Model:** FacebookAI/xlm-roberta-base
  - **Number of Parameters:** 279M
  - **Max Sequence Length:** 128
 
- ### Training Data
-
- The model was fine-tuned on the [name of dataset] dataset. This dataset consists of [short description of dataset, e.g., number of instances, labels, any important data characteristics].
-
- You can find the dataset [here](dataset_url).
-
- ---
-
- ## Evaluation Results
-
- The model was evaluated on [mediabiasgroup/BABE] test set and achieved the following results:
-
- - **Accuracy:** [accuracy score]
- - **F1-Score:** [F1 score]
- - **Precision:** [precision score]
- - **Recall:** [recall score]
-
- For detailed evaluation results, see the corresponding paper.
-
- ---
-
- ## Usage
-
-
- ```python
- from transformers import AutoModelForSequenceClassification, AutoTokenizer
-
- tokenizer = AutoTokenizer.from_pretrained("mediabiasgroup/magpie-babe-ft-xlm")
- model = AutoModelForSequenceClassification.from_pretrained("mediabiasgroup/magpie-babe-ft-xlm")
-
- # Example input
- input_text = "Your example sentence goes here."
- inputs = tokenizer(input_text, return_tensors="pt")
- outputs = model(**inputs)
-
- # Accessing the predicted class
- predicted_class = outputs.logits.argmax(dim=-1)
- print(f"Predicted class: {predicted_class}")
- ```
-
- ---
-
- ## Example Code
-
- Here’s an example for batch classification:
-
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
- tokenizer = AutoTokenizer.from_pretrained("your_org/your_model")
- model = AutoModelForSequenceClassification.from_pretrained("your_org/your_model")
-
- # Example sentences
- sentences = ["Sentence 1", "Sentence 2", "Sentence 3"]
- inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
-
- with torch.no_grad():
-     outputs = model(**inputs)
-
- predicted_classes = outputs.logits.argmax(dim=-1)
- print(f"Predicted classes: {predicted_classes}")
- ```
-
  ---
 
  ## Citation
 
  The code for the training is available at: https://github.com/Media-Bias-Group/magpie-multi-task
 
  If you use this model, please cite the following paper(s):
 
  ```bibtex
- @article{your_citation,
- title={Your Title},
- author={Your Name and Co-authors},
- journal={Journal Name},
- year={Year},
- publisher={Publisher},
- url={paper_url}
  }
- ```
-
- ---
-
- Feel free to adapt this template to match the specific needs of each model. Let me know if you'd like to adjust any sections further!
 
  pipeline_tag: text-classification
  ---
 
+ This model is a multilingual sentence-level media bias classifier.
 
+ It is a version of [mediabiasgroup/magpie-pt-xlm](https://huggingface.co/mediabiasgroup/magpie-pt-xlm), fine-tuned for sentence-level media bias classification.
+ It was pre-trained on the LBM (Large Bias Mixture) collection of 59 bias-related tasks and then fine-tuned on the [mediabiasgroup/BABE](https://huggingface.co/mediabiasgroup/BABE) dataset.
 
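+ The underlying BABE data can be inspected with the `datasets` library. A minimal sketch, assuming the dataset id from the link above; the exact splits and column names are not documented on this card and may differ:
+
+ ```python
+ from datasets import load_dataset
+
+ # Dataset id taken from the BABE link above; check the dataset card for its splits and columns.
+ babe = load_dataset("mediabiasgroup/BABE")
+ print(babe)  # shows the available splits and column names
+ ```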
 
  ---
 
  ## Training details
 
+ - **Base Model:** mediabiasgroup/magpie-pt-xlm
  - **Number of Parameters:** 279M
  - **Max Sequence Length:** 128
 
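+ ---
+
+ ## Usage
+
+ A minimal usage sketch with the Transformers sequence-classification API. The repo id `mediabiasgroup/magpie-babe-ft-xlm` is carried over from the earlier version of this card and is an assumption, as is the meaning of the two class indices; adjust both as needed.
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ # Repo id taken from the previous version of this card (assumption).
+ model_id = "mediabiasgroup/magpie-babe-ft-xlm"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSequenceClassification.from_pretrained(model_id)
+
+ sentences = [
+     "The committee published its annual report on Tuesday.",
+     "The committee's so-called report is nothing but propaganda.",
+ ]
+
+ # Tokenize with the same max sequence length used during training (128).
+ inputs = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")
+
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # Binary sentence-level classification: pick the higher-scoring class per sentence.
+ predicted_classes = logits.argmax(dim=-1).tolist()
+ print(predicted_classes)
+ ```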
 
  ---
 
  ## Citation
 
  The code for the training is available at: https://github.com/Media-Bias-Group/magpie-multi-task
+ The paper is available at: https://aclanthology.org/2024.lrec-main.952/
  If you use this model, please cite the following paper(s):
 
  ```bibtex
+ @inproceedings{horych-etal-2024-magpie,
+     title = "{MAGPIE}: Multi-Task Analysis of Media-Bias Generalization with Pre-Trained Identification of Expressions",
+     author = "Horych, Tom{\'a}{\v{s}} and
+       Wessel, Martin Paul and
+       Wahle, Jan Philip and
+       Ruas, Terry and
+       Wa{\ss}muth, Jerome and
+       Greiner-Petter, Andr{\'e} and
+       Aizawa, Akiko and
+       Gipp, Bela and
+       Spinde, Timo",
+     editor = "Calzolari, Nicoletta and
+       Kan, Min-Yen and
+       Hoste, Veronique and
+       Lenci, Alessandro and
+       Sakti, Sakriani and
+       Xue, Nianwen",
+     booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
+     month = may,
+     year = "2024",
+     address = "Torino, Italia",
+     publisher = "ELRA and ICCL",
+     url = "https://aclanthology.org/2024.lrec-main.952",
+     pages = "10903--10920",
+     abstract = "Media bias detection poses a complex, multifaceted problem traditionally tackled using single-task models and small in-domain datasets, consequently lacking generalizability. To address this, we introduce MAGPIE, a large-scale multi-task pre-training approach explicitly tailored for media bias detection. To enable large-scale pre-training, we construct Large Bias Mixture (LBM), a compilation of 59 bias-related tasks. MAGPIE outperforms previous approaches in media bias detection on the Bias Annotation By Experts (BABE) dataset, with a relative improvement of 3.3{\%} F1-score. Furthermore, using a RoBERTa encoder, we show that MAGPIE needs only 15{\%} of fine-tuning steps compared to single-task approaches. We provide insight into task learning interference and show that sentiment analysis and emotion detection help learning of all other tasks, and scaling the number of tasks leads to the best results. MAGPIE confirms that MTL is a promising approach for addressing media bias detection, enhancing the accuracy and efficiency of existing models. Furthermore, LBM is the first available resource collection focused on media bias MTL.",
  }
+ ```