---
library_name: transformers
tags: []
---

# Model Card: TinyLlama-1.1B Tweet Sentiment LoRA Adapter

This model is a LoRA adapter for TinyLlama/TinyLlama-1.1B-Chat-v1.0, created for tweet classification with three classes: positive, neutral, or negative.
Unfortunately, it degrades the F1-score on the task from 0.18 to 0.12.
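
A minimal sketch of how the adapter could be loaded on top of the base model with peft; the adapter repo id below is a placeholder, since this card does not name one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach the LoRA adapter weights on top of the frozen base model.
# "your-username/tinyllama-tweet-eval-lora" is a hypothetical repo id.
model = PeftModel.from_pretrained(base_model, "your-username/tinyllama-tweet-eval-lora")
model.eval()
```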


## Training Details

### Training Data

The model was trained on cardiffnlp/tweet_eval, a dataset created for exactly this task: tweet classification.
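
The dataset can be loaded with the datasets library; the "sentiment" configuration is an assumption here, inferred from the three classes described above.

```python
from datasets import load_dataset

# Load the sentiment subset of tweet_eval (assumed config).
dataset = load_dataset("cardiffnlp/tweet_eval", "sentiment")

# Labels in this config: 0 = negative, 1 = neutral, 2 = positive.
print(dataset["train"][0])
```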

### Training Procedure

The model was trained with the TRL SFTTrainer for ~260 iterations. The learning rate followed a cosine schedule starting at 5e-5, and the effective batch size was 32.
The LoRA rank was the standard 8 with alpha 16, and AdamW was used as the optimizer. LoRA adapters were applied to the v_proj and k_proj layers. The base model was quantized with the NF4 (NormalFloat 4-bit) quantization type, with bfloat16 as the compute dtype.
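
A hedged sketch reconstructing this setup (QLoRA with NF4, LoRA on k_proj/v_proj at rank 8 / alpha 16, cosine LR from 5e-5, effective batch size 32); details such as the per-device batch size, the gradient accumulation split, and the output directory are assumptions, not values from this card.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig

# 4-bit quantization of the base model: NF4 weights, bfloat16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
)

# LoRA on the value and key projections, rank 8, alpha 16.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["k_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="tinyllama-tweet-eval-lora",  # assumed name
    max_steps=260,                           # ~260 iterations
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=4,           # assumed split of the
    gradient_accumulation_steps=8,           # effective batch size of 32
    optim="adamw_torch",
    bf16=True,
)

# NOTE: formatting the integer labels into the prompt text is omitted
# for brevity; a real run would map each example to an instruction
# string that includes the target class.
dataset = load_dataset("cardiffnlp/tweet_eval", "sentiment", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```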

## Results

The F1-score degraded from 0.18 to 0.12. I should probably try other hyperparameters and train for more epochs.