ALBERT Tiny Spanish

This is an ALBERT model trained on a large Spanish corpus. It was trained on a single TPU v3-8 with the following hyperparameters, step counts, and training time:

  • LR: 0.00125
  • Batch Size: 2048
  • Warmup ratio: 0.0125
  • Warmup steps: 125000
  • Goal steps: 10000000
  • Total steps: 8300000
  • Total training time (approx.): 58.2 days
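
The figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the warmup ratio is expressed relative to the goal steps and that the batch size counts sequences per optimizer step. Neither is stated explicitly in this card, and the variable names are made up for the example.

```python
# Values copied from the hyperparameter list in this card.
BATCH_SIZE = 2048
WARMUP_RATIO = 0.0125
GOAL_STEPS = 10_000_000
TOTAL_STEPS = 8_300_000
TRAINING_DAYS = 58.2  # approximate, as reported

# Assumption: warmup steps = warmup ratio * goal steps.
warmup_steps = round(WARMUP_RATIO * GOAL_STEPS)

# Share of the planned schedule that was actually run.
fraction_completed = TOTAL_STEPS / GOAL_STEPS

# Assumption: one step consumes BATCH_SIZE sequences.
sequences_seen = TOTAL_STEPS * BATCH_SIZE

# Average throughput over the reported wall-clock time.
steps_per_second = TOTAL_STEPS / (TRAINING_DAYS * 24 * 60 * 60)

print(f"warmup steps:       {warmup_steps:,}")
print(f"schedule completed: {fraction_completed:.0%}")
print(f"sequences seen:     {sequences_seen:,}")
print(f"steps per second:   {steps_per_second:.2f}")
```

Under these assumptions the computed warmup matches the listed 125000 steps, and training stopped at 83% of the goal steps.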

Training loss

https://drive.google.com/uc?export=view&id=1KQc8yWZLKvDLjBtu4IOAgpTx0iLcvX_Q
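
For readers who want to try the checkpoint, here is a minimal sketch of loading it with the Hugging Face `transformers` library. It assumes the Hub identifier `dccuchile/albert-tiny-spanish` and that `transformers`, `torch`, and `sentencepiece` are installed. The helper names `load_albert_tiny_spanish` and `embed` are hypothetical and not part of any library.

```python
MODEL_ID = "dccuchile/albert-tiny-spanish"  # assumed Hub identifier


def load_albert_tiny_spanish(model_id: str = MODEL_ID):
    """Download (or load from cache) the tokenizer and the bare encoder."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the last hidden states for one piece of text."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    # Shape: (batch, sequence_length, hidden_size)
    return outputs.last_hidden_state


# Example usage (requires network access on first run):
#   tokenizer, model = load_albert_tiny_spanish()
#   hidden = embed("Santiago es la capital de Chile.", tokenizer, model)
#   print(hidden.shape)
```

`AutoModel` returns the encoder without a task head; for a downstream task, a task-specific class such as `AutoModelForSequenceClassification` would be the usual starting point for fine-tuning.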

