Nishant24 committed
Commit 99bb629 · 1 Parent(s): 17fb100

Nishant24/mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant

Files changed (4)
  1. README.md +142 -40
  2. config.json +1 -1
  3. generation_config.json +1 -1
  4. training_args.bin +3 -0
README.md CHANGED
@@ -1,53 +1,155 @@
  ---
- license: apache-2.0
- language:
- - en
- - hi
- metrics:
- - bleu
- library_name: transformers
- pipeline_tag: translation
  tags:
- - code
  ---

- This model is a fine-tuned checkpoint of mbart-large-50-many-to-many-mmt, adapted for Siddha Yoga Hindi-to-English translation. The base model was introduced in the paper "Multilingual Translation with Extensible Multilingual Pretraining and Finetuning": https://arxiv.org/pdf/2008.00401.pdf

- The base model can translate directly between any pair of 50 languages. To translate into a target language, the target language id must be forced as the first generated token; pass the forced_bos_token_id parameter to the generate method.

- This model was fine-tuned by Nishant Chhetri as part of an M.Tech in Data Science dissertation project at BITS Pilani.

- Code to use the model for inference:

- ```python
- from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
-
- article_hi = "इससे पूर्व हमने 'साधन योग' का वर्णन किया था और उसमें यह समझाया था कि साधना करने वाला हर एक व्यक्ति बिना किसी प्रकार की दीनता के योग के साधनों द्वारा अपने परम लक्ष्यों को प्राप्त कर सकता है।"
- article_eng = "Before this I described “Sadhan Yoga” and explained that any sadhak can reach their highest goals through practices of Yoga without any meekness."
-
- # load the fine-tuned checkpoint published in this repo
- model = MBartForConditionalGeneration.from_pretrained("Nishant24/mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant")
- tokenizer = MBart50TokenizerFast.from_pretrained("Nishant24/mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant")
-
- # translate Hindi to English
- tokenizer.src_lang = "hi_IN"
- encoded_hi = tokenizer(article_hi, return_tensors="pt")
- generated_tokens = model.generate(
-     **encoded_hi,
-     forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"]
- )
- tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
- # => "Before this I described “Sadhan Yoga” and explained that any sadhak can reach their highest goals through practices of Yoga without any meekness."
- ```

- ## BibTeX entry and citation info
- ```
- @article{tang2020multilingual,
-   title={Multilingual Translation with Extensible Multilingual Pretraining and Finetuning},
-   author={Yuqing Tang and Chau Tran and Xian Li and Peng-Jen Chen and Naman Goyal and Vishrav Chaudhary and Jiatao Gu and Angela Fan},
-   year={2020},
-   eprint={2008.00401},
-   archivePrefix={arXiv},
-   primaryClass={cs.CL}
- }
- ```
  ---
+ base_model: facebook/mbart-large-50-many-to-many-mmt
  tags:
+ - generated_from_trainer
+ model-index:
+ - name: mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant
+
+ This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on a Siddha Yoga Hindi-to-English text dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0001
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a minimal reproduction sketch follows the list):
+ - learning_rate: 0.001
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 100
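
For readers who want to reproduce this setup, the sketch below maps the listed hyperparameters onto the transformers `Seq2SeqTrainer`. It is a minimal illustration, not the author's original training script: the sentence-pair placeholders, the preprocessing function, and the `output_dir` name are assumptions, since the Siddha Yoga parallel corpus is not published in this repo.

```python
# Minimal sketch (not the author's original script): reproduces the listed
# hyperparameters with transformers' Seq2SeqTrainer. The sentence pairs, the
# preprocessing, and the output_dir are illustrative placeholders.
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(base, src_lang="hi_IN", tgt_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained(base)

# Placeholder parallel data: replace with the actual Siddha Yoga sentence pairs.
raw = Dataset.from_dict({
    "hi": ["<Hindi source sentence>"],
    "en": ["<English reference translation>"],
})

def preprocess(batch):
    # Tokenize sources and targets; text_target uses the tokenizer's tgt_lang ("en_XX").
    return tokenizer(batch["hi"], text_target=batch["en"], truncation=True, max_length=200)

tokenized = raw.map(preprocess, batched=True, remove_columns=["hi", "en"])

args = Seq2SeqTrainingArguments(
    output_dir="mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant",
    learning_rate=1e-3,               # learning_rate: 0.001
    per_device_train_batch_size=4,    # train_batch_size: 4
    per_device_eval_batch_size=4,     # eval_batch_size: 4
    seed=42,                          # seed: 42
    lr_scheduler_type="linear",       # lr_scheduler_type: linear
    num_train_epochs=100,             # num_epochs: 100
    evaluation_strategy="epoch",      # validation loss logged once per epoch
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    eval_dataset=tokenized,           # placeholder; the real eval split is not public
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The Trainer's default AdamW optimizer already uses betas=(0.9, 0.999) and epsilon=1e-08, so the optimizer entry in the list needs no explicit configuration, and evaluation_strategy="epoch" matches the once-per-epoch validation losses in the table below.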
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log | 1.0 | 2 | 9.7387 |
+ | No log | 2.0 | 4 | 7.5348 |
+ | No log | 3.0 | 6 | 3.8893 |
+ | No log | 4.0 | 8 | 1.6238 |
+ | No log | 5.0 | 10 | 0.6203 |
+ | No log | 6.0 | 12 | 0.5660 |
+ | No log | 7.0 | 14 | 0.2931 |
+ | No log | 8.0 | 16 | 0.3130 |
+ | No log | 9.0 | 18 | 0.1872 |
+ | No log | 10.0 | 20 | 0.2907 |
+ | No log | 11.0 | 22 | 0.0909 |
+ | No log | 12.0 | 24 | 0.1210 |
+ | No log | 13.0 | 26 | 0.1342 |
+ | No log | 14.0 | 28 | 0.0887 |
+ | No log | 15.0 | 30 | 0.0744 |
+ | No log | 16.0 | 32 | 0.2141 |
+ | No log | 17.0 | 34 | 0.0569 |
+ | No log | 18.0 | 36 | 0.0495 |
+ | No log | 19.0 | 38 | 0.0500 |
+ | No log | 20.0 | 40 | 0.0419 |
+ | No log | 21.0 | 42 | 0.5107 |
+ | No log | 22.0 | 44 | 0.0366 |
+ | No log | 23.0 | 46 | 0.0525 |
+ | No log | 24.0 | 48 | 0.0429 |
+ | No log | 25.0 | 50 | 0.1173 |
+ | No log | 26.0 | 52 | 0.1467 |
+ | No log | 27.0 | 54 | 0.1957 |
+ | No log | 28.0 | 56 | 0.1132 |
+ | No log | 29.0 | 58 | 0.1369 |
+ | No log | 30.0 | 60 | 0.3323 |
+ | No log | 31.0 | 62 | 0.1180 |
+ | No log | 32.0 | 64 | 0.0698 |
+ | No log | 33.0 | 66 | 0.0392 |
+ | No log | 34.0 | 68 | 0.0306 |
+ | No log | 35.0 | 70 | 0.0389 |
+ | No log | 36.0 | 72 | 0.0297 |
+ | No log | 37.0 | 74 | 0.0304 |
+ | No log | 38.0 | 76 | 0.0304 |
+ | No log | 39.0 | 78 | 0.1154 |
+ | No log | 40.0 | 80 | 0.0285 |
+ | No log | 41.0 | 82 | 0.0249 |
+ | No log | 42.0 | 84 | 0.0305 |
+ | No log | 43.0 | 86 | 0.0302 |
+ | No log | 44.0 | 88 | 0.0263 |
+ | No log | 45.0 | 90 | 0.0244 |
+ | No log | 46.0 | 92 | 0.0257 |
+ | No log | 47.0 | 94 | 0.0212 |
+ | No log | 48.0 | 96 | 0.0291 |
+ | No log | 49.0 | 98 | 0.0281 |
+ | No log | 50.0 | 100 | 0.0224 |
+ | No log | 51.0 | 102 | 0.0151 |
+ | No log | 52.0 | 104 | 0.0173 |
+ | No log | 53.0 | 106 | 0.0261 |
+ | No log | 54.0 | 108 | 0.0216 |
+ | No log | 55.0 | 110 | 0.0145 |
+ | No log | 56.0 | 112 | 0.0123 |
+ | No log | 57.0 | 114 | 0.0136 |
+ | No log | 58.0 | 116 | 0.0113 |
+ | No log | 59.0 | 118 | 0.0086 |
+ | No log | 60.0 | 120 | 0.0066 |
+ | No log | 61.0 | 122 | 0.0040 |
+ | No log | 62.0 | 124 | 0.0040 |
+ | No log | 63.0 | 126 | 0.0019 |
+ | No log | 64.0 | 128 | 0.0009 |
+ | No log | 65.0 | 130 | 0.0071 |
+ | No log | 66.0 | 132 | 0.0208 |
+ | No log | 67.0 | 134 | 0.0154 |
+ | No log | 68.0 | 136 | 0.0036 |
+ | No log | 69.0 | 138 | 0.0021 |
+ | No log | 70.0 | 140 | 0.0011 |
+ | No log | 71.0 | 142 | 0.0011 |
+ | No log | 72.0 | 144 | 0.0023 |
+ | No log | 73.0 | 146 | 0.0024 |
+ | No log | 74.0 | 148 | 0.0014 |
+ | No log | 75.0 | 150 | 0.0005 |
+ | No log | 76.0 | 152 | 0.0003 |
+ | No log | 77.0 | 154 | 0.0003 |
+ | No log | 78.0 | 156 | 0.0002 |
+ | No log | 79.0 | 158 | 0.0001 |
+ | No log | 80.0 | 160 | 0.0001 |
+ | No log | 81.0 | 162 | 0.0001 |
+ | No log | 82.0 | 164 | 0.0001 |
+ | No log | 83.0 | 166 | 0.0001 |
+ | No log | 84.0 | 168 | 0.0001 |
+ | No log | 85.0 | 170 | 0.0001 |
+ | No log | 86.0 | 172 | 0.0001 |
+ | No log | 87.0 | 174 | 0.0001 |
+ | No log | 88.0 | 176 | 0.0001 |
+ | No log | 89.0 | 178 | 0.0001 |
+ | No log | 90.0 | 180 | 0.0001 |
+ | No log | 91.0 | 182 | 0.0001 |
+ | No log | 92.0 | 184 | 0.0001 |
+ | No log | 93.0 | 186 | 0.0001 |
+ | No log | 94.0 | 188 | 0.0001 |
+ | No log | 95.0 | 190 | 0.0001 |
+ | No log | 96.0 | 192 | 0.0001 |
+ | No log | 97.0 | 194 | 0.0001 |
+ | No log | 98.0 | 196 | 0.0001 |
+ | No log | 99.0 | 198 | 0.0001 |
+ | No log | 100.0 | 200 | 0.0001 |
+
+ ### Framework versions
+
+ - Transformers 4.33.3
+ - Pytorch 2.0.1+cu118
+ - Datasets 2.14.5
+ - Tokenizers 0.13.3
config.json CHANGED
@@ -52,7 +52,7 @@
  "static_position_embeddings": false,
  "tokenizer_class": "MBart50Tokenizer",
  "torch_dtype": "float32",
- "transformers_version": "4.33.2",
+ "transformers_version": "4.33.3",
  "use_cache": true,
  "vocab_size": 250054
  }
generation_config.json CHANGED
@@ -8,5 +8,5 @@
  "max_length": 200,
  "num_beams": 5,
  "pad_token_id": 1,
- "transformers_version": "4.33.2"
+ "transformers_version": "4.33.3"
  }
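
The max_length and num_beams values above are stored in the checkpoint's generation_config.json and are applied automatically by generate(). Below is a minimal inference sketch: the repo id is this model's Hub id, the Hindi input is a placeholder, and only the target-language token needs to be forced.

```python
# Minimal inference sketch: generate() reads num_beams=5 and max_length=200
# from this checkpoint's generation_config.json, so only the target language
# is forced explicitly. The input sentence below is a placeholder.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

repo_id = "Nishant24/mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant"
tokenizer = MBart50TokenizerFast.from_pretrained(repo_id, src_lang="hi_IN")
model = MBartForConditionalGeneration.from_pretrained(repo_id)

inputs = tokenizer("<Hindi source sentence>", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],  # translate into English
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```
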
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:17a2f530a74f3cda932f2f8f0904ae85cd86c03d29effa28986a1fa69ed47b63
+ size 4219