IAmSkyDra/BaViT_Base_Finetune_v0

IAmSkyDra committed 5 days ago

Commit a676265 · verified · 1 Parent(s): 711c24c

End of training

Files changed (2)

README.md +76 -0
generation_config.json +6 -0

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+library_name: transformers
+license: mit
+base_model: VietAI/vit5-base
+tags:
+- generated_from_trainer
+metrics:
+- sacrebleu
+model-index:
+- name: BaViT_Base_Finetune_v0
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# BaViT_Base_Finetune_v0
+This model is a fine-tuned version of [VietAI/vit5-base](https://huggingface.co/VietAI/vit5-base) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4516
+- Sacrebleu: 7.9929
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 100
+- eval_batch_size: 100
+- seed: 42
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 15
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
+|:-------------:|:-----:|:----:|:---------------:|:---------:|
+| 0.6988        | 1.0   | 468  | 0.6266          | 2.2994    |
+| 0.6014        | 2.0   | 936  | 0.5592          | 4.0780    |
+| 0.5548        | 3.0   | 1404 | 0.5231          | 4.9356    |
+| 0.5239        | 4.0   | 1872 | 0.5022          | 5.6738    |
+| 0.5063        | 5.0   | 2340 | 0.4875          | 6.2733    |
+| 0.4849        | 6.0   | 2808 | 0.4769          | 6.7126    |
+| 0.4701        | 7.0   | 3276 | 0.4705          | 6.9856    |
+| 0.4555        | 8.0   | 3744 | 0.4651          | 7.2721    |
+| 0.4524        | 9.0   | 4212 | 0.4601          | 7.5539    |
+| 0.4388        | 10.0  | 4680 | 0.4571          | 7.6076    |
+| 0.4341        | 11.0  | 5148 | 0.4549          | 7.7267    |
+| 0.4231        | 12.0  | 5616 | 0.4536          | 7.9165    |
+| 0.4174        | 13.0  | 6084 | 0.4519          | 7.9585    |
+| 0.4209        | 14.0  | 6552 | 0.4515          | 7.9864    |
+| 0.4167        | 15.0  | 7020 | 0.4516          | 7.9929    |
+### Framework versions
+- Transformers 4.48.1
+- Pytorch 2.5.1+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0
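
Reviewer note on the README hunk above: the hyperparameters and the results table can be cross-checked with a little arithmetic. The sketch below uses only values copied from the card (learning rate, batch size, epoch count, logged steps, validation loss, SacreBLEU). Everything else is an assumption: a single device, no gradient accumulation, the last partial batch kept, and zero warmup steps, since the card lists none of these. The helper names are hypothetical and are not part of the training script.

```python
# Values copied from the model card.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 100
NUM_EPOCHS = 15
STEPS_PER_EPOCH = 468  # step count logged at epoch 1.0

# (epoch, validation_loss, sacrebleu) rows from the training results table.
RESULTS = [
    (1, 0.6266, 2.2994), (2, 0.5592, 4.0780), (3, 0.5231, 4.9356),
    (4, 0.5022, 5.6738), (5, 0.4875, 6.2733), (6, 0.4769, 6.7126),
    (7, 0.4705, 6.9856), (8, 0.4651, 7.2721), (9, 0.4601, 7.5539),
    (10, 0.4571, 7.6076), (11, 0.4549, 7.7267), (12, 0.4536, 7.9165),
    (13, 0.4519, 7.9585), (14, 0.4515, 7.9864), (15, 0.4516, 7.9929),
]


def example_count_bounds(steps_per_epoch, batch_size):
    """Training-set sizes consistent with the logged step count.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is kept, so steps_per_epoch == ceil(n / batch_size).
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
SIZE_BOUNDS = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
BEST_BY_LOSS = min(RESULTS, key=lambda row: row[1])
BEST_BY_BLEU = max(RESULTS, key=lambda row: row[2])

if __name__ == "__main__":
    print("total optimizer steps:", TOTAL_STEPS)
    print("training examples between:", SIZE_BOUNDS)
    print("lr halfway through:", linear_lr(TOTAL_STEPS // 2, TOTAL_STEPS, LEARNING_RATE))
    print("lowest validation loss at epoch:", BEST_BY_LOSS[0])
    print("highest SacreBLEU at epoch:", BEST_BY_BLEU[0])
```

Under these assumptions the total step count matches the last row of the table (7020). The headline figures in the card (loss 0.4516, SacreBLEU 7.9929) are the final-epoch values; epoch 14 logged a marginally lower validation loss (0.4515), so the two metrics disagree by one epoch on which checkpoint is best.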

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.48.1"
+}
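
Reviewer note on the generation config above: the token IDs follow the usual T5 convention, in which decoding starts from the pad token (ID 0) and ends at the end-of-sequence token (ID 1). A minimal sanity check, using only the standard library and the JSON exactly as committed, might look like the sketch below. The function name `check_t5_style_ids` is hypothetical, and the checks encode that convention as an assumption rather than anything the repository states.

```python
import json

# The file contents exactly as added in this commit.
GENERATION_CONFIG_JSON = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.48.1"
}
"""


def check_t5_style_ids(cfg):
    """Return a list of problems; empty if the IDs follow the T5 convention."""
    problems = []
    # T5-style models begin decoding from the pad token.
    if cfg["decoder_start_token_id"] != cfg["pad_token_id"]:
        problems.append("decoder_start_token_id differs from pad_token_id")
    # EOS must be distinct from padding so generation can stop cleanly.
    if cfg["eos_token_id"] == cfg["pad_token_id"]:
        problems.append("eos_token_id equals pad_token_id")
    return problems


config = json.loads(GENERATION_CONFIG_JSON)
problems = check_t5_style_ids(config)

if __name__ == "__main__":
    print("problems found:", problems)
```

The config sets no decoding parameters such as beam count or maximum length, so generation falls back to library defaults unless the caller overrides them.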