---
language:
- en
license: mit
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- matthews_correlation
model-index:
- name: deberta-v3-xsmall-CoLA
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE COLA
      type: glue
      config: cola
      split: validation
      args: cola
    metrics:
    - name: Matthews Correlation
      type: matthews_correlation
      value: 0.5894856058137782
widget:
- text: 'The cat sat on the mat.'
  example_title: Correct grammatical sentence
- text: 'Me and my friend going to the store.'
  example_title: Incorrect subject-verb agreement
- text: 'I ain''t got no money.'
  example_title: Incorrect verb conjugation and double negative
- text: 'She don''t like pizza no more.'
  example_title: Incorrect verb conjugation and double negative
- text: 'They is arriving tomorrow.'
  example_title: Incorrect verb conjugation
---
# deberta-v3-xsmall-CoLA
This model is a fine-tuned version of [microsoft/deberta-v3-xsmall](https://huggingface.co/microsoft/deberta-v3-xsmall) on the GLUE CoLA dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4237
- Matthews Correlation: 0.5895
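The Matthews correlation reported above can be recomputed with the `evaluate` library; a minimal sketch with placeholder predictions and references, not the actual evaluation run:
```python
import evaluate

# Load the same metric reported in this card.
metric = evaluate.load("matthews_correlation")
# Placeholder predictions/labels purely for illustration.
result = metric.compute(predictions=[1, 0, 1, 1], references=[1, 0, 0, 1])
print(result)  # {'matthews_correlation': 0.577...}
```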
## Model description
This checkpoint aims for a reasonable trade-off between accuracy/quality and inference speed.
```json
{
  "epoch": 3.0,
  "eval_loss": 0.423,
  "eval_matthews_correlation": 0.589,
  "eval_runtime": 5.0422,
  "eval_samples": 1043,
  "eval_samples_per_second": 206.853,
  "eval_steps_per_second": 51.763
}
```
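A minimal inference sketch using the `transformers` pipeline; the Hub repo id below is assumed from this card's name, and the label names depend on the model's `config.json`:
```python
from transformers import pipeline

# Repo id assumed from the model name in this card; adjust to the actual Hub path.
classifier = pipeline("text-classification", model="deberta-v3-xsmall-CoLA")

print(classifier("They is arriving tomorrow."))
# e.g. [{'label': 'LABEL_0', 'score': ...}] -- label mapping depends on config.json
```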
## Intended uses & limitations
This checkpoint is intended for binary grammatical-acceptability classification of English sentences, as in the widget examples above. As an xsmall variant, it trades some accuracy for inference speed.
## Training and evaluation data
The model was fine-tuned on the GLUE CoLA train split and evaluated on the 1,043-example validation split (see the evaluation stats above).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 6e-05
- train_batch_size: 32
- eval_batch_size: 4
- seed: 16105
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3.0
- mixed_precision_training: Native AMP
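These settings map roughly onto `transformers.TrainingArguments`; the sketch below assumes a `run_glue.py`-style fine-tuning setup and is not the authors' exact command:
```python
from transformers import TrainingArguments

# Sketch only: maps the listed hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="deberta-v3-xsmall-CoLA",
    learning_rate=6e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # 32 x 4 = 128 effective train batch size
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    seed=16105,
    fp16=True,  # "Native AMP" mixed precision
)
```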
### Training results
| Training Loss | Epoch | Step | Validation Loss | Matthews Correlation |
|:-------------:|:-----:|:----:|:---------------:|:--------------------:|
| 0.3945 | 1.0 | 67 | 0.4323 | 0.5778 |
| 0.3214 | 2.0 | 134 | 0.4237 | 0.5895 |
| 0.3059 | 3.0 | 201 | 0.4636 | 0.5795 |
### Framework versions
- Transformers 4.27.0.dev0
- Pytorch 1.13.1+cu117
- Datasets 2.8.0
- Tokenizers 0.13.1