Initial commit
- README.md +246 -0
- config.json +95 -0
- merges.txt +0 -0
- model.safetensors +3 -0
- special_tokens_map.json +24 -0
- tokenizer.json +0 -0
- tokenizer_config.json +55 -0
- vocab.json +0 -0
README.md
CHANGED
@@ -1,3 +1,249 @@
---
library_name: transformers
license: mit
base_model: sdadas/polish-gpt2-small
tags:
- generated_from_trainer
- text-classification
- multi-class-classification
- multi-label-classification
- emotions
model-index:
- name: go-emotions-polish-gpt2-small-v0.0.1
  results: []
pipeline_tag: text-classification
language:
- pl
---

# go-emotions-polish-gpt2-small-v0.0.1

This model is a fine-tuned version of [sdadas/polish-gpt2-small](https://huggingface.co/sdadas/polish-gpt2-small) on a machine-translated version of the [google-research-datasets/go_emotions](https://huggingface.co/datasets/google-research-datasets/go_emotions) dataset.
It achieves the following results on the evaluation set.
Each list contains results for the thresholds [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]:
- Loss: 0.2123
- Hamming Accuracy: [0.9556733317272946, 0.9592094848057267, 0.9614034483945348, 0.9631328079292425, 0.96438895963107, 0.9652665450665933, 0.9654558281997453]
- F1 Macro: [0.4929423141608899, 0.49513962905111936, 0.48929787051637963, 0.4797491530618914, 0.4647116651469601, 0.44570651699340547, 0.40534073938214327]
- Precision Macro: [0.45136216505815147, 0.4817190614473335, 0.5085113721947904, 0.5297915362044485, 0.5555560780209522, 0.5895901759803143, 0.6380967706137726]
- Recall Macro: [0.551217261404107, 0.5185202616828293, 0.4824209900009455, 0.4506401023339946, 0.4144524017062031, 0.3745523616734555, 0.3152333101205886]

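For reference, the Hamming accuracy reported above is the complement of the Hamming loss over all $N$ evaluation examples and $K = 28$ labels, matching the `compute_metrics` code further down this card:

$$\text{Hamming accuracy} = 1 - \frac{1}{NK}\sum_{i=1}^{N}\sum_{k=1}^{K}\mathbf{1}\left[\hat{y}_{ik} \neq y_{ik}\right]$$
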
## Model description

Trained from [sdadas/polish-gpt2-small](https://huggingface.co/sdadas/polish-gpt2-small).

## Intended uses & limitations

Detecting the emotions described in the paper [2005.00547](https://arxiv.org/abs/2005.00547).

Labels:
```
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
```

### How to use

```py
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
checkpoint = "nie3e/go-emotions-polish-gpt2-small-v0.0.1"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, problem_type="multi_label_classification"
).to(device)  # move the model to the same device as the inputs

# "This is a model that detects awesome emotions in text! :D"
text = "To jest model wykrywający super emocje w tekście! :D"

input_ids = tokenizer(text, return_tensors="pt").to(device)
logits = model(**input_ids)["logits"].to("cpu")

# A label is predicted when its sigmoid probability exceeds the threshold.
threshold = 0.3
predicted_class_ids = torch.arange(
    0, logits.shape[-1]
)[torch.sigmoid(logits).squeeze(dim=0) > threshold]

percent = torch.sigmoid(logits).squeeze(dim=0)

id2class = model.config.id2label
print([id2class[c] for c in predicted_class_ids.tolist()])
print({id2class[i]: f"{(p*100):.2f}%" for i, p in enumerate(percent.tolist())})
```
```
['joy']
{'admiration': '17.75%', 'amusement': '11.22%', 'anger': '0.07%', 'annoyance': '0.36%', 'approval': '6.63%', 'caring': '0.84%', 'confusion': '0.22%', 'curiosity': '0.58%', 'desire': '0.40%', 'disappointment': '1.29%', 'disapproval': '0.26%', 'disgust': '0.11%', 'embarrassment': '0.08%', 'excitement': '25.88%', 'fear': '0.54%', 'gratitude': '0.41%', 'grief': '0.90%', 'joy': '63.62%', 'love': '11.74%', 'nervousness': '0.08%', 'optimism': '1.98%', 'pride': '0.03%', 'realization': '1.19%', 'relief': '0.53%', 'remorse': '0.02%', 'sadness': '0.75%', 'surprise': '0.58%', 'neutral': '8.93%'}
```

Or using a pipeline:
```py
from transformers import pipeline

# Set model, tokenizer, device, and text as above.

pipe = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    top_k=-1,
    device=device
)

result = pipe(text)
print(result)
```

```
[[{'label': 'joy', 'score': 0.6362035274505615}, {'label': 'excitement', 'score': 0.2588024437427521}, {'label': 'admiration', 'score': 0.17747776210308075}, {'label': 'love', 'score': 0.11739460378885269}, {'label': 'amusement', 'score': 0.11221607774496078}, {'label': 'neutral', 'score': 0.08927429467439651}, {'label': 'approval', 'score': 0.0662560984492302}, {'label': 'optimism', 'score': 0.019809801131486893}, {'label': 'disappointment', 'score': 0.012886008247733116}, {'label': 'realization', 'score': 0.011940046213567257}, {'label': 'grief', 'score': 0.009018097072839737}, {'label': 'caring', 'score': 0.008446046151220798}, {'label': 'sadness', 'score': 0.007472767494618893}, {'label': 'curiosity', 'score': 0.0058141243644058704}, {'label': 'surprise', 'score': 0.005764781963080168}, {'label': 'fear', 'score': 0.00539048807695508}, {'label': 'relief', 'score': 0.005273739341646433}, {'label': 'gratitude', 'score': 0.004061913583427668}, {'label': 'desire', 'score': 0.003967347089201212}, {'label': 'annoyance', 'score': 0.0036265170201659203}, {'label': 'disapproval', 'score': 0.0026028596330434084}, {'label': 'confusion', 'score': 0.0022179142106324434}, {'label': 'disgust', 'score': 0.0011114622466266155}, {'label': 'embarrassment', 'score': 0.0007856030715629458}, {'label': 'nervousness', 'score': 0.0007625268190167844}, {'label': 'anger', 'score': 0.0007304779137484729}, {'label': 'pride', 'score': 0.0003317077935207635}]]
```

## Training and evaluation data

Dataset: [google-research-datasets/go_emotions](https://huggingface.co/datasets/google-research-datasets/go_emotions)

Preprocessing (a sketch follows the list):
- dropping rows that exceed 256 tokens (GPT-2 tokenizer), as well as rows containing a run of 5 or more digits in the text
- machine translation into Polish
- removing 80% of the rows whose only label is `neutral`

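A minimal sketch of what these filters might look like, assuming the `datasets` library and the `simplified` config of go_emotions (the column names, the per-row random drop, and the `keep_row` helper are illustrative assumptions, not the original preprocessing script):

```py
import random
import re

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sdadas/polish-gpt2-small")
dataset = load_dataset("google-research-datasets/go_emotions", "simplified")

NEUTRAL_ID = 27
random.seed(42)

def keep_row(example) -> bool:
    # Drop rows longer than 256 tokens.
    if len(tokenizer(example["text"])["input_ids"]) > 256:
        return False
    # Drop rows containing a run of 5+ digits.
    if re.search(r"\d{5,}", example["text"]):
        return False
    # Drop ~80% of rows whose only label is `neutral`.
    if example["labels"] == [NEUTRAL_ID] and random.random() < 0.8:
        return False
    return True

dataset = dataset.filter(keep_row)
# Machine translation into Polish would follow here, via an external MT system.
```
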
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10

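As a hedged illustration, these settings correspond roughly to the following `transformers` configuration (a sketch, not the original training script; `output_dir` is an assumption):

```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="go-emotions-polish-gpt2-small-v0.0.1",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,  # effective train batch size: 8 * 8 = 64
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",  # AdamW with default betas=(0.9, 0.999), eps=1e-8
)
```
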
<details><summary>Trainer using class weights:</summary>

```py
import torch
from transformers import Trainer

class WeightedTrainer(Trainer):
    def __init__(self, class_weights=None, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits

        # Weighted binary cross-entropy for multi-label classification.
        loss_fct = torch.nn.BCEWithLogitsLoss(pos_weight=self.class_weights)
        loss = loss_fct(logits, labels.float())

        return (loss, outputs) if return_outputs else loss
```
</details>

<details><summary>Class weights:</summary>

```py
import numpy as np

df = tokenized_dataset["train"].to_pandas()
label_counts = df["labels_h1"].explode().value_counts().sort_index()
total_samples = len(df)
# Inverse-frequency weights, clipped to [0.1, 10.0].
class_weights = [(total_samples - count) / count for count in label_counts]
class_weights = np.clip(class_weights, 0.1, 10.0)
```
</details>

<details><summary>Metrics computation:</summary>

```py
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    probabilities = sigmoid(predictions)
    thresholds = np.arange(0.3, 0.91, 0.1)

    computed_metrics = {
        "hamming_accuracy": [],
        "f1_macro": [],
        "precision_macro": [],
        "recall_macro": []
    }

    for th in thresholds:
        binary_preds = (probabilities > th).astype(int)

        # Hamming accuracy (for multi-label): fraction of correct label decisions
        hamming_acc = 1 - np.mean(binary_preds != labels)

        # Macro-averaged F1/Precision/Recall
        f1 = f1_score(labels, binary_preds, average="macro", zero_division=0)
        precision = precision_score(labels, binary_preds, average="macro",
                                    zero_division=0)
        recall = recall_score(labels, binary_preds, average="macro",
                              zero_division=0)

        computed_metrics["hamming_accuracy"].append(hamming_acc)
        computed_metrics["f1_macro"].append(f1)
        computed_metrics["precision_macro"].append(precision)
        computed_metrics["recall_macro"].append(recall)

    return computed_metrics
```
</details>

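Putting the pieces above together, a minimal sketch of how the trainer might be assembled (assuming `model`, `tokenizer`, `training_args`, `class_weights`, and `tokenized_dataset` are defined as in the surrounding snippets; the eval split name and data collator are illustrative assumptions):

```py
import torch
from transformers import DataCollatorWithPadding

trainer = WeightedTrainer(
    class_weights=torch.tensor(class_weights, dtype=torch.float32).to(model.device),
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],  # assumed split name
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)
trainer.train()
```
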
### Training results

Each list contains results for the thresholds [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
<details><summary>Results:</summary>

| Epoch | Training Loss | Validation Loss | Hamming Accuracy | F1 Macro | Precision Macro | Recall Macro |
|:-----:|:-------------:|:---------------:|:----------------:|:--------:|:---------------:|:------------:|
| 1 | 0.27 | 0.1772 | [0.9512768007708986, 0.959080428124032, 0.9628144681143959, 0.9640103933647658, 0.9644922049764256, 0.9635629968682246, 0.9613690332794164] | [0.46241863342821793, 0.4532072267866339, 0.4326439737447529, 0.3910675952043853, 0.34917671255121363, 0.2895274759086305, 0.17584749438485497] | [0.4461301409377742, 0.5054615942184383, 0.5451358075233073, 0.5833130459387715, 0.612234133178668, 0.6415091314766198, 0.5974110920754876] | [0.5228623671145524, 0.44986334764237274, 0.3906724180081668, 0.32626227826074566, 0.27496335348059725, 0.21568979328206703, 0.12657202290169853] |
| 2 | 0.1671 | 0.1609 | [0.9546752933888564, 0.9605000516226727, 0.9636834497711395, 0.9650600543758819, 0.9657311491206938, 0.9651977148363561, 0.9638813366830712] | [0.47181939799853245, 0.4773149021343449, 0.47247182216640654, 0.4451366460170004, 0.42366333538332507, 0.37403397926368737, 0.30313475002048484] | [0.441959190629794, 0.5081107432668841, 0.561701577155539, 0.5801413709101412, 0.6462220107453076, 0.6764110777613557, 0.7061705442835556] | [0.580901954973524, 0.5214683734393482, 0.47275669773690804, 0.40899548004178854, 0.36579701720117563, 0.3034817689033619, 0.23056933111221325] |
| 3 | 0.1399 | 0.1606 | [0.9497109130330041, 0.9569208796503424, 0.9613518257218571, 0.9638211102316138, 0.965249337509034, 0.9654988470936435, 0.9639931858072065] | [0.4749602125981087, 0.4873423290936483, 0.48338025379077687, 0.4701566587566043, 0.4498896201717952, 0.41177613984294953, 0.32623322721112374] | [0.41117966940106204, 0.45764702886929787, 0.5046222714256859, 0.5530253088371339, 0.6005644340448025, 0.6760814655629932, 0.6800566831000542] | [0.6222726145233521, 0.5734770801943697, 0.5113684601796389, 0.4550331179550861, 0.4044620988185089, 0.34071151157255797, 0.25092165205453] |
| 4 | 0.1176 | 0.1669 | [0.95028736621124, 0.9570241249956981, 0.9612141652613828, 0.9638555253467322, 0.9651202808273394, 0.9657827717933717, 0.9645008087552053] | [0.46776823075218205, 0.4770370632194352, 0.48170399873205805, 0.4815610729662237, 0.46635824100838313, 0.4346021274443605, 0.35739954616385866] | [0.4026523166950933, 0.44682680882818887, 0.4916683449781562, 0.5367728364431649, 0.575993305278815, 0.6472053281375478, 0.6693600674340754] | [0.6123919277944883, 0.557746427766825, 0.5176178038314311, 0.47937515940889003, 0.4322437215959729, 0.36783678520926777, 0.27372379016591925] |
| 5 | 0.0992 | 0.1735 | [0.9527996696148948, 0.9587878996455244, 0.9617906184396187, 0.9635802044257838, 0.9652063186151357, 0.9660236775992016, 0.9651891110575765] | [0.48529687411311123, 0.49524580727616085, 0.4889009921348519, 0.476286545728211, 0.4606138233621902, 0.4360905909528836, 0.37325003123769596] | [0.43090457512637126, 0.4758805613570571, 0.5059212751069277, 0.5323418850235423, 0.5778588494150572, 0.6303798411419806, 0.7050763694911789] | [0.5871641620419744, 0.5468805735159331, 0.5029133763134643, 0.46169826597462366, 0.4141503796261941, 0.3657398096031768, 0.28558429402703295] |
| 6 | 0.083 | 0.1841 | [0.9520941597549644, 0.9572736345803077, 0.9603795987197578, 0.9626337887600234, 0.9642771105069347, 0.9654042055270675, 0.9656279037753381] | [0.4848110691063849, 0.4902428690172167, 0.48618324094573484, 0.478250083701593, 0.4677017272145706, 0.44389902987164165, 0.39728758295841066] | [0.4282820729807268, 0.4617066856103255, 0.48889885714082243, 0.515822475491864, 0.548112314645949, 0.5864861700775268, 0.642991780443598] | [0.5872655526119982, 0.5489591883468488, 0.5083264155214335, 0.47033662001581716, 0.43490169281095825, 0.384219870731881, 0.3108363940394387] |
| 7 | 0.0715 | 0.1922 | [0.9541934817771965, 0.958366314485322, 0.960921636782875, 0.9629607323536498, 0.964363148294731, 0.9655160546512028, 0.9651460921636783] | [0.49327999654639215, 0.4942623403104429, 0.48696556713303135, 0.4804445324053421, 0.46939295694544736, 0.4475119371916684, 0.3881489924589002] | [0.4442084347600441, 0.47508959657212724, 0.5001365624277175, 0.5315837721216776, 0.5635172346685284, 0.6060793900700022, 0.6547980634968182] | [0.5749401619525247, 0.5335821079528872, 0.4937842962870135, 0.45744546851621865, 0.4230868747861999, 0.37702257236227427, 0.29712029562165154] |
| 8 | 0.0609 | 0.2025 | [0.9547011047251953, 0.9585469938396944, 0.961377637058196, 0.9629951474687682, 0.9646040541005609, 0.9654644319785249, 0.9653095639604914] | [0.4929084055191263, 0.4966642184262872, 0.49509852255405307, 0.47826643819598524, 0.4664438311012158, 0.4433651185391054, 0.38896780252101004] | [0.45156391170754395, 0.48340644816595113, 0.5119596021038765, 0.5318185543351281, 0.5669432431065436, 0.6023885415673448, 0.640319754629273] | [0.5590029919029157, 0.5266729465629012, 0.49473318351169737, 0.45252148670174985, 0.4154957108440041, 0.37123594052789305, 0.2964469621468265] |
| 9 | 0.0524 | 0.2099 | [0.9549592180885845, 0.9585986165123722, 0.9611195236948068, 0.9629435247960905, 0.9642771105069347, 0.9651030732697801, 0.9651805072787969] | [0.4920579362491578, 0.48948143585573084, 0.48373280918321976, 0.4765803308742461, 0.464925139967501, 0.44204098321531043, 0.3994121381701787] | [0.4481765728464743, 0.4727514944675648, 0.49823474126036366, 0.5251829103863094, 0.5535803816229916, 0.5863930625014495, 0.6280255957220133] | [0.5567520239062707, 0.5174123200187039, 0.4811964159610151, 0.4494146051029743, 0.41590442650848564, 0.37120008498454643, 0.30929707478684687] |
| 10 | 0.0479 | 0.2123 | [0.9556733317272946, 0.9592094848057267, 0.9614034483945348, 0.9631328079292425, 0.96438895963107, 0.9652665450665933, 0.9654558281997453] | [0.4929423141608899, 0.49513962905111936, 0.48929787051637963, 0.4797491530618914, 0.4647116651469601, 0.44570651699340547, 0.40534073938214327] | [0.45136216505815147, 0.4817190614473335, 0.5085113721947904, 0.5297915362044485, 0.5555560780209522, 0.5895901759803143, 0.6380967706137726] | [0.551217261404107, 0.5185202616828293, 0.4824209900009455, 0.4506401023339946, 0.4144524017062031, 0.3745523616734555, 0.3152333101205886] |
</details>

### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0

config.json
ADDED
@@ -0,0 +1,95 @@
{
  "_name_or_path": "sdadas/polish-gpt2-small",
  "activation_function": "gelu_fast",
  "architectures": [
    "GPT2ForSequenceClassification"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 0,
  "embd_pdrop": 0.1,
  "eos_token_id": 2,
  "id2label": {
    "0": "admiration",
    "1": "amusement",
    "2": "anger",
    "3": "annoyance",
    "4": "approval",
    "5": "caring",
    "6": "confusion",
    "7": "curiosity",
    "8": "desire",
    "9": "disappointment",
    "10": "disapproval",
    "11": "disgust",
    "12": "embarrassment",
    "13": "excitement",
    "14": "fear",
    "15": "gratitude",
    "16": "grief",
    "17": "joy",
    "18": "love",
    "19": "nervousness",
    "20": "optimism",
    "21": "pride",
    "22": "realization",
    "23": "relief",
    "24": "remorse",
    "25": "sadness",
    "26": "surprise",
    "27": "neutral"
  },
  "initializer_range": 0.02,
  "label2id": {
    "admiration": 0,
    "amusement": 1,
    "anger": 2,
    "annoyance": 3,
    "approval": 4,
    "caring": 5,
    "confusion": 6,
    "curiosity": 7,
    "desire": 8,
    "disappointment": 9,
    "disapproval": 10,
    "disgust": 11,
    "embarrassment": 12,
    "excitement": 13,
    "fear": 14,
    "gratitude": 15,
    "grief": 16,
    "joy": 17,
    "love": 18,
    "nervousness": 19,
    "neutral": 27,
    "optimism": 20,
    "pride": 21,
    "realization": 22,
    "relief": 23,
    "remorse": 24,
    "sadness": 25,
    "surprise": 26
  },
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_embd": 768,
  "n_head": 12,
  "n_inner": 3072,
  "n_layer": 12,
  "n_positions": 2048,
  "pad_token_id": 2,
  "problem_type": "multi_label_classification",
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "tokenizer_class": "GPT2TokenizerFast",
  "torch_dtype": "float32",
  "transformers_version": "4.48.3",
  "use_cache": true,
  "vocab_size": 51200
}
merges.txt
ADDED
The diff for this file is too large to render.
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d512dc8a214a135e451ad35dac68e31342b2ea7ac7da5d8329f26d5e7b632072
size 503902928
special_tokens_map.json
ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,55 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "<mask>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "errors": "replace",
  "extra_special_tokens": {},
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<unk>"
}
vocab.json
ADDED
The diff for this file is too large to render.