IMDb_data_subset-MLM_with-custom_collator-distilbert-base-uncased
This model is a fine-tuned version of distilbert-base-uncased on the IMDb dataset. It achieves the following results on the evaluation set:
- Loss: 3.2538
- Model Preparation Time: 0.0042
Model description
Intended uses & limitations
Mask filling
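A minimal sketch of mask filling with this checkpoint, assuming the `transformers` library is installed and the repository id shown in the model tree on this page can be downloaded from the Hub. The helper name `top_fills` and the `___` blank marker are hypothetical conveniences, not part of the model or the library.

```python
# Repository id taken from the model tree section of this card.
MODEL_ID = "srvmishra832/IMDb_data_subset-MLM_with-custom_collator-distilbert-base-uncased"


def top_fills(text_with_blank, top_k=5, blank="___"):
    """Return (token, score) pairs for the single blank in `text_with_blank`.

    `top_fills` and `blank` are illustrative names introduced for this sketch.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    # Use the tokenizer's own mask token instead of hard-coding it.
    prompt = text_with_blank.replace(blank, unmasker.tokenizer.mask_token, 1)
    predictions = unmasker(prompt, top_k=top_k)
    return [(p["token_str"], p["score"]) for p in predictions]
```

Calling `top_fills("This movie was an absolute ___.")` downloads the checkpoint on first use and returns the highest-scoring candidate tokens. Because the model was fine-tuned on movie reviews, predictions on unrelated domains may be less reliable.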
Training and evaluation data
The IMDb dataset is tokenized, and words are masked with probability 0.2. The masked dataset is then downsampled to 10,000 training samples and 1,000 validation samples.
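The masking step can be illustrated with a small pure-Python sketch. It is not the custom collator used for this model, whose details are not given here. It assumes one token per word, independent masking of each non-special position with probability 0.2, and replacement by `[MASK]` only (no random-token or keep-original variants). The names `mask_tokens`, `MASK_TOKEN`, and `SPECIAL_TOKENS` are hypothetical.

```python
import random

MASK_TOKEN = "[MASK]"
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}  # never masked in this sketch


def mask_tokens(tokens, mask_prob=0.2, rng=None):
    """Mask each non-special token independently with probability `mask_prob`.

    Returns (inputs, labels). `labels` holds the original token at masked
    positions and None elsewhere; Transformers collators use token ids and
    -100 for the positions that the loss should ignore.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if tok not in SPECIAL_TOKENS and rng.random() < mask_prob:
            inputs.append(MASK_TOKEN)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels
```

Passing a seeded generator, for example `mask_tokens(tokens, rng=random.Random(42))`, makes the masking reproducible, which is useful when a fixed masked validation set is wanted.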
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
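As a sanity check on these settings, the sketch below derives the number of optimizer steps and the learning rate under a linear schedule. It assumes a single device, no gradient accumulation, a kept final partial batch, and zero warmup steps; none of these are stated above, although the resulting step counts agree with the training results table. The function name `linear_lr` is hypothetical.

```python
import math

TRAIN_SAMPLES = 10_000
BATCH_SIZE = 64
NUM_EPOCHS = 5
BASE_LR = 2e-5

# 10,000 / 64 = 156.25, so the last partial batch adds a 157th step.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(steps_per_epoch, total_steps)
    for epoch in range(NUM_EPOCHS + 1):
        print(epoch, linear_lr(epoch * steps_per_epoch))
```

Under these assumptions the learning rate starts at 2e-05 and falls by one fifth of that value per epoch, reaching zero at the final step.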
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time |
|---|---|---|---|---|
| 3.5579 | 1.0 | 157 | 3.3058 | 0.0042 |
| 3.3945 | 2.0 | 314 | 3.2732 | 0.0042 |
| 3.3487 | 3.0 | 471 | 3.2542 | 0.0042 |
| 3.3088 | 4.0 | 628 | 3.2237 | 0.0042 |
| 3.2961 | 5.0 | 785 | 3.2538 | 0.0042 |
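The validation losses can be easier to interpret as perplexities. The sketch below assumes the reported loss is a mean cross-entropy in nats over the masked positions, so the exponentiated value is a perplexity over masked tokens only, not a full-sequence language-model perplexity.

```python
import math

# Validation loss per epoch, copied from the table above.
validation_loss = {1: 3.3058, 2: 3.2732, 3: 3.2542, 4: 3.2237, 5: 3.2538}

# Perplexity = exp(mean cross-entropy), assuming the loss is in nats.
perplexity = {epoch: math.exp(loss) for epoch, loss in validation_loss.items()}

# Epoch with the lowest validation loss.
best_epoch = min(validation_loss, key=validation_loss.get)

if __name__ == "__main__":
    for epoch, ppl in perplexity.items():
        marker = " (lowest)" if epoch == best_epoch else ""
        print(f"epoch {epoch}: loss {validation_loss[epoch]:.4f}, perplexity {ppl:.2f}{marker}")
```

The lowest validation loss occurs at epoch 4 (perplexity of roughly 25.1), while the headline figure of 3.2538 corresponds to the final epoch (roughly 25.9).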
Framework versions
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1
Model tree for srvmishra832/IMDb_data_subset-MLM_with-custom_collator-distilbert-base-uncased
Base model
distilbert/distilbert-base-uncased