---
license: apache-2.0
datasets:
- ai4bharat/samanantar
language:
- en
tags:
- translation
---

This repository contains the trained model files for `Ch1 - Attention is all you need`. The chapter builds a transformer from scratch for `English` to `Hindi` translation. Any of the checkpoints can be used for inference.

Loss Graph:

![image.png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/8_J-C6FItlpHxQpihw-NN.png)

Training specs:

Trained on an NVIDIA A10 GPU (24 GB) for 12 hours.

```python
# The original snippet shows only the return statement;
# the function name get_config is assumed here so the block runs.
def get_config():
    return {
        'batch_size': 85,
        'num_samples': 1000000,
        'num_epochs': 10,
        'lr': 10**-4,
        'seq_len': 128,
        'd_model': 512,
        'datasource': "runs",
        'tgt_language': 'hi',
        'model_folder': 'weights',
        'model_basename': 'tmodel_',
        'preload': None,
        'tokenizer_folder': 'tokenizer',
        'vocab_size': 52000,
    }
```
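
To put the training specs in perspective, the sketch below derives the number of optimizer steps implied by the configuration and builds a candidate checkpoint path. It is a rough illustration, not the author's training code: it assumes all `num_samples` sentence pairs are used for training (no held-out split) and that the last partial batch is kept. The helper names `steps_per_epoch` and `checkpoint_path`, the zero-padded epoch number, and the `.pt` suffix are hypothetical; check the repository's file listing for the real checkpoint names.

```python
import math
from pathlib import Path

# Values copied from the training specs in this model card.
config = {
    'batch_size': 85,
    'num_samples': 1000000,
    'num_epochs': 10,
    'model_folder': 'weights',
    'model_basename': 'tmodel_',
}

def steps_per_epoch(cfg):
    # Assumption: every sample is a training sample and the final
    # partial batch is not dropped, so we round up.
    return math.ceil(cfg['num_samples'] / cfg['batch_size'])

def checkpoint_path(cfg, epoch):
    # Hypothetical naming scheme: <model_folder>/<model_basename><epoch>.pt
    # The actual file names in the repository may differ.
    return Path(cfg['model_folder']) / f"{cfg['model_basename']}{epoch:02d}.pt"

per_epoch = steps_per_epoch(config)
total_steps = per_epoch * config['num_epochs']

print(per_epoch)    # 11765
print(total_steps)  # 117650
print(checkpoint_path(config, 9))
```

Under these assumptions, the run amounts to roughly 117,650 optimizer steps over the 12 hours of training.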