---
license: apache-2.0
datasets:
- ai4bharat/samanantar
language:
- en
- hi
tags:
- translation
---


These are the trained model files for `Ch1 - Attention is all you need`. The chapter builds a transformer from scratch for `English` to `Hindi` translation. Any of the checkpoints can be used for inference.

Loss graph:
![image.png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/8_J-C6FItlpHxQpihw-NN.png)

Training specs: trained on an NVIDIA A10 GPU (24 GB) for 12 hours.

```python
# Wrapper name is illustrative; the values are the training configuration.
def get_config():
    return {
        'batch_size': 85,
        'num_samples': 1000000,
        'num_epochs': 10,
        'lr': 10**-4,  # 1e-4
        'seq_len': 128,
        'd_model': 512,
        'datasource': "runs",
        'tgt_language': 'hi',  # Hindi
        'model_folder': 'weights',
        'model_basename': 'tmodel_',
        'preload': None,
        'tokenizer_folder': 'tokenizer',
        'vocab_size': 52000,
    }
```
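
To give a rough sense of what the configuration above implies, the sketch below estimates the number of optimizer steps. It assumes that all `num_samples` examples are used for training (no validation split) and that every sequence is padded to `seq_len`; neither is stated here, so treat the numbers as upper bounds. The helper name `training_budget` is hypothetical.

```python
import math

# Values copied from the configuration above.
config = {
    "batch_size": 85,
    "num_samples": 1_000_000,
    "num_epochs": 10,
    "seq_len": 128,
}


def training_budget(cfg):
    """Estimate steps, assuming every sample is used for training."""
    # The last batch of an epoch may be partial, hence the ceiling.
    steps_per_epoch = math.ceil(cfg["num_samples"] / cfg["batch_size"])
    total_steps = steps_per_epoch * cfg["num_epochs"]
    # Assumption: each sequence is padded to seq_len token positions.
    positions_per_batch = cfg["batch_size"] * cfg["seq_len"]
    return steps_per_epoch, total_steps, positions_per_batch


steps_per_epoch, total_steps, positions_per_batch = training_budget(config)
print(steps_per_epoch, total_steps, positions_per_batch)
# -> 11765 117650 10880
```

Under these assumptions, one epoch is about 11,765 steps and the full 10-epoch run is about 117,650 steps.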