muril-large-cased-tweet-devnagri-grouped

This model is a fine-tuned version of google/muril-large-cased on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.4110
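For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the loss is a mean token-level cross-entropy in nats, which is what the Trainer reports for language-modeling objectives; the card itself does not state the objective.

```python
import math

# Evaluation loss reported in this card.
EVAL_LOSS = 1.4110


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


print(f"{perplexity(EVAL_LOSS):.2f}")  # prints 4.10
```

If the loss was computed only over masked positions, the resulting figure is a pseudo-perplexity over those positions rather than a full-sequence perplexity.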

Model description

More information needed

Intended uses & limitations

More information needed
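Until the intended uses are documented, the sketch below shows one way the checkpoint might be tried out. It assumes the repository id Anish/muril-large-cased-tweet-devnagri-grouped and, more importantly, assumes the checkpoint carries a masked-language-modeling head; the card does not confirm the task. The helper names and the example sentence are hypothetical.

```python
# Repository id as shown on the model page.
MODEL_ID = "Anish/muril-large-cased-tweet-devnagri-grouped"


def masked_example(mask_token: str) -> str:
    """Return a hypothetical Devanagari sentence with one masked word."""
    return f"आज मौसम बहुत {mask_token} है"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline for the checkpoint.

    ASSUMPTION: the checkpoint has a masked-LM head (not stated in the card).
    Needs the `transformers` package and network access on first call.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return pipeline("fill-mask", model=model, tokenizer=tokenizer)


def demo(top_k: int = 3) -> None:
    """Print the top candidates for the masked position."""
    fill = load_fill_mask()
    text = masked_example(fill.tokenizer.mask_token)
    for candidate in fill(text, top_k=top_k):
        print(candidate["token_str"], round(candidate["score"], 4))
```

Calling demo() downloads the weights and prints the highest-scoring fillers. If the head turns out not to be a masked-LM head, transformers will warn about newly initialized weights and the predictions will not be meaningful.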

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
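The list above maps onto keyword arguments of transformers.TrainingArguments roughly as sketched here. Two things are assumptions rather than facts from the card: that the batch size of 64 is per device on a single device, and that no warmup was used (none is listed). The linear_lr helper is a hypothetical stand-alone restatement of a linear decay schedule, included only to show how the learning rate would fall over training.

```python
# Keyword names follow transformers.TrainingArguments; values come from the card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 64,  # ASSUMPTION: single device
    "per_device_eval_batch_size": 64,   # ASSUMPTION: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def linear_lr(step: int, total_steps: int,
              base_lr: float = TRAINING_KWARGS["learning_rate"],
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    ASSUMPTION: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With transformers installed, the dictionary could be passed as TrainingArguments(output_dir="out", **TRAINING_KWARGS), where "out" is a placeholder directory name.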

Training results

Training Loss	Epoch	Step	Validation Loss
No log	0.0478	5000	2.5496
No log	0.0955	10000	2.1840
No log	0.1433	15000	2.0172
No log	0.1910	20000	1.9188
No log	0.2388	25000	1.8525
No log	0.2865	30000	1.8047
No log	0.3343	35000	1.7694
No log	0.3820	40000	1.7406
No log	0.4298	45000	1.7076
No log	0.4775	50000	1.6848
No log	0.5253	55000	1.6713
No log	0.5730	60000	1.6543
No log	0.6208	65000	1.6364
No log	0.6685	70000	1.6226
No log	0.7163	75000	1.6103
No log	0.7640	80000	1.5976
No log	0.8118	85000	1.5925
No log	0.8595	90000	1.5883
No log	0.9073	95000	1.5763
No log	0.9550	100000	1.5581
1.9195	1.0028	105000	1.5774
1.9195	1.0505	110000	1.5507
1.9195	1.0983	115000	1.5728
1.9195	1.1460	120000	1.5328
1.9195	1.1938	125000	1.5265
1.9195	1.2415	130000	1.5199
1.9195	1.2893	135000	1.5216
1.9195	1.3370	140000	1.5098
1.9195	1.3848	145000	1.5061
1.9195	1.4325	150000	1.4985
1.9195	1.4803	155000	1.4943
1.9195	1.5280	160000	1.4933
1.9195	1.5758	165000	1.4853
1.9195	1.6235	170000	1.4778
1.9195	1.6713	175000	1.4797
1.9195	1.7190	180000	1.4702
1.9195	1.7668	185000	1.4958
1.9195	1.8145	190000	1.4683
1.9195	1.8623	195000	1.4748
1.9195	1.9100	200000	1.4560
1.9195	1.9578	205000	1.4553
1.5744	2.0055	210000	1.4431
1.5744	2.0533	215000	1.4432
1.5744	2.1010	220000	1.4446
1.5744	2.1488	225000	1.4407
1.5744	2.1965	230000	1.4454
1.5744	2.2443	235000	1.4371
1.5744	2.2920	240000	1.4351
1.5744	2.3398	245000	1.4291
1.5744	2.3875	250000	1.4293
1.5744	2.4353	255000	1.4245
1.5744	2.4830	260000	1.4253
1.5744	2.5308	265000	1.4305
1.5744	2.5785	270000	1.4221
1.5744	2.6263	275000	1.4181
1.5744	2.6740	280000	1.4146
1.5744	2.7218	285000	1.4149
1.5744	2.7695	290000	1.4131
1.5744	2.8173	295000	1.4155
1.5744	2.8650	300000	1.4137
1.5744	2.9128	305000	1.4119
1.5744	2.9605	310000	1.4070
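A few quantities can be read off the table with simple arithmetic. The sketch below uses a handful of rows copied from the table to estimate the number of optimizer steps per epoch, find the lowest validation loss among those rows, and flag evaluations where the loss went up. The samples-per-epoch estimate assumes an effective batch size of 64 (single device, no gradient accumulation), which the card does not confirm. Note that the lowest logged validation loss, 1.4070 at step 310000, is not the same as the 1.4110 reported at the top of the card; that figure presumably comes from a later evaluation, but the card does not say.

```python
# (step, epoch, validation_loss) rows copied from the table above.
ROWS = [
    (5000, 0.0478, 2.5496),
    (100000, 0.9550, 1.5581),
    (105000, 1.0028, 1.5774),
    (205000, 1.9578, 1.4553),
    (210000, 2.0055, 1.4431),
    (310000, 2.9605, 1.4070),
]

BATCH_SIZE = 64  # ASSUMPTION: effective batch size equals train_batch_size


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged row."""
    step, epoch, _ = rows[-1]
    return step / epoch


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def regressions(rows):
    """Return steps whose validation loss is higher than the previous row's."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[2] > prev[2]]


spe = steps_per_epoch(ROWS)
print(round(spe), round(spe * BATCH_SIZE))
print(best_row(ROWS))
print(regressions(ROWS))
```

Running the same functions over every row of the table, rather than this subset, would list all of the small upward bumps in validation loss.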

Framework versions

Transformers 4.45.0
Pytorch 2.4.1+cu121
Datasets 3.0.1
Tokenizers 0.20.0
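To reproduce the environment, the versions above can be turned into pinned requirements and compared against what is installed. This is a small sketch; the function names are hypothetical, and the PyTorch entry drops the +cu121 suffix, which denotes the CUDA 12.1 build rather than a separate release.

```python
from importlib import metadata

# Package names as published on PyPI; versions from the list above.
FRAMEWORK_VERSIONS = {
    "transformers": "4.45.0",
    "torch": "2.4.1",  # card lists 2.4.1+cu121 (CUDA 12.1 build)
    "datasets": "3.0.1",
    "tokenizers": "0.20.0",
}


def requirement_lines(versions):
    """Format the pins as requirements.txt lines."""
    return [f"{name}=={version}" for name, version in versions.items()]


def mismatches(versions=FRAMEWORK_VERSIONS):
    """Map each package whose installed version differs to what was found.

    A value of None means the package is not installed.
    """
    found = {}
    for name, wanted in versions.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
            continue
        # Ignore local build tags such as "+cu121" when comparing.
        if installed.split("+")[0] != wanted:
            found[name] = installed
    return found


print("\n".join(requirement_lines(FRAMEWORK_VERSIONS)))
```

An empty dictionary from mismatches() indicates that the installed packages match the versions listed in this card.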

Anish/muril-large-cased-tweet-devnagri-grouped
