muril-large-cased-tweet-devnagri-grouped

This model is a fine-tuned version of google/muril-large-cased on an unspecified dataset (the model name suggests a corpus of Devanagari tweets). It achieves the following result on the evaluation set:

  • Loss: 1.4110
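
If this loss is the usual mean cross-entropy over masked tokens (the card does not state the objective, but that is the default for masked-language-modeling fine-tunes of a BERT-style model like MuRIL), it corresponds to a pseudo-perplexity of roughly exp(1.4110) ≈ 4.10:

```python
import math

# Assumption: the reported evaluation loss is mean cross-entropy over
# masked tokens; the card does not state the training objective.
eval_loss = 1.4110
print(math.exp(eval_loss))  # ≈ 4.10
```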

Model description

More information needed

Intended uses & limitations

More information needed
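
Pending details from the model author, here is a minimal, hypothetical usage sketch. It assumes the checkpoint retains MuRIL's masked-language-modeling head (consistent with the loss-only evaluation) and uses the repository id Anish/muril-large-cased-tweet-devnagri-grouped; the example sentence is illustrative only:

```python
from transformers import pipeline

# Hypothetical usage sketch: assumes an MLM head on this checkpoint.
fill_mask = pipeline(
    "fill-mask",
    model="Anish/muril-large-cased-tweet-devnagri-grouped",
)

# MuRIL is BERT-style, so the mask token is [MASK].
for pred in fill_mask("मुझे चाय बहुत [MASK] है।"):
    print(pred["token_str"], pred["score"])
```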

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing them in code follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
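
A minimal sketch reconstructing this configuration with the Hugging Face Trainer. It assumes a masked-language-modeling objective (the card reports only a loss) and substitutes a two-sentence placeholder corpus for the unreleased tweet data; the `eval_steps` value is inferred from the 5000-step cadence in the results table below:

```python
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/muril-large-cased")
model = AutoModelForMaskedLM.from_pretrained("google/muril-large-cased")

# Placeholder corpus: the actual tweet dataset is not released.
texts = ["नमस्ते दुनिया", "यह एक उदाहरण ट्वीट है"]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="muril-large-cased-tweet-devnagri-grouped",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    # Adam betas/epsilon match the card (they are also the defaults).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    eval_strategy="steps",
    eval_steps=5000,  # inferred from the evaluation cadence in the results table
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,  # placeholder; the real eval split is not described
    data_collator=DataCollatorForLanguageModeling(tokenizer),
)
trainer.train()
```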

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 0.0478 | 5000 | 2.5496 |
| No log | 0.0955 | 10000 | 2.1840 |
| No log | 0.1433 | 15000 | 2.0172 |
| No log | 0.1910 | 20000 | 1.9188 |
| No log | 0.2388 | 25000 | 1.8525 |
| No log | 0.2865 | 30000 | 1.8047 |
| No log | 0.3343 | 35000 | 1.7694 |
| No log | 0.3820 | 40000 | 1.7406 |
| No log | 0.4298 | 45000 | 1.7076 |
| No log | 0.4775 | 50000 | 1.6848 |
| No log | 0.5253 | 55000 | 1.6713 |
| No log | 0.5730 | 60000 | 1.6543 |
| No log | 0.6208 | 65000 | 1.6364 |
| No log | 0.6685 | 70000 | 1.6226 |
| No log | 0.7163 | 75000 | 1.6103 |
| No log | 0.7640 | 80000 | 1.5976 |
| No log | 0.8118 | 85000 | 1.5925 |
| No log | 0.8595 | 90000 | 1.5883 |
| No log | 0.9073 | 95000 | 1.5763 |
| No log | 0.9550 | 100000 | 1.5581 |
| 1.9195 | 1.0028 | 105000 | 1.5774 |
| 1.9195 | 1.0505 | 110000 | 1.5507 |
| 1.9195 | 1.0983 | 115000 | 1.5728 |
| 1.9195 | 1.1460 | 120000 | 1.5328 |
| 1.9195 | 1.1938 | 125000 | 1.5265 |
| 1.9195 | 1.2415 | 130000 | 1.5199 |
| 1.9195 | 1.2893 | 135000 | 1.5216 |
| 1.9195 | 1.3370 | 140000 | 1.5098 |
| 1.9195 | 1.3848 | 145000 | 1.5061 |
| 1.9195 | 1.4325 | 150000 | 1.4985 |
| 1.9195 | 1.4803 | 155000 | 1.4943 |
| 1.9195 | 1.5280 | 160000 | 1.4933 |
| 1.9195 | 1.5758 | 165000 | 1.4853 |
| 1.9195 | 1.6235 | 170000 | 1.4778 |
| 1.9195 | 1.6713 | 175000 | 1.4797 |
| 1.9195 | 1.7190 | 180000 | 1.4702 |
| 1.9195 | 1.7668 | 185000 | 1.4958 |
| 1.9195 | 1.8145 | 190000 | 1.4683 |
| 1.9195 | 1.8623 | 195000 | 1.4748 |
| 1.9195 | 1.9100 | 200000 | 1.4560 |
| 1.9195 | 1.9578 | 205000 | 1.4553 |
| 1.5744 | 2.0055 | 210000 | 1.4431 |
| 1.5744 | 2.0533 | 215000 | 1.4432 |
| 1.5744 | 2.1010 | 220000 | 1.4446 |
| 1.5744 | 2.1488 | 225000 | 1.4407 |
| 1.5744 | 2.1965 | 230000 | 1.4454 |
| 1.5744 | 2.2443 | 235000 | 1.4371 |
| 1.5744 | 2.2920 | 240000 | 1.4351 |
| 1.5744 | 2.3398 | 245000 | 1.4291 |
| 1.5744 | 2.3875 | 250000 | 1.4293 |
| 1.5744 | 2.4353 | 255000 | 1.4245 |
| 1.5744 | 2.4830 | 260000 | 1.4253 |
| 1.5744 | 2.5308 | 265000 | 1.4305 |
| 1.5744 | 2.5785 | 270000 | 1.4221 |
| 1.5744 | 2.6263 | 275000 | 1.4181 |
| 1.5744 | 2.6740 | 280000 | 1.4146 |
| 1.5744 | 2.7218 | 285000 | 1.4149 |
| 1.5744 | 2.7695 | 290000 | 1.4131 |
| 1.5744 | 2.8173 | 295000 | 1.4155 |
| 1.5744 | 2.8650 | 300000 | 1.4137 |
| 1.5744 | 2.9128 | 305000 | 1.4119 |
| 1.5744 | 2.9605 | 310000 | 1.4070 |

Rows showing "No log" predate the first training-loss logging event; once a value is logged, the column repeats the most recently reported running training loss (1.9195 for evaluations during the second epoch, 1.5744 during the third).

Framework versions

  • Transformers 4.45.0
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0