line-corporation/line-distilbert-base-japanese

This model is a fine-tuned version of line-corporation/line-distilbert-base-japanese on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0615

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: tpu
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
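To make the effect of lr_scheduler_type: linear concrete, here is a minimal sketch of how the learning rate would evolve over the run. It is not taken from the training script: the helper name linear_lr is hypothetical, the total of 30600 steps is the final step reported under Training results, and a warmup of 0 steps is an assumption, since the card lists no warmup setting.

```python
# Hypothetical helper for illustration; not part of the original training code.
BASE_LR = 1e-05        # learning_rate from the hyperparameter list
TOTAL_STEPS = 30600    # final step reported in the training results
WARMUP_STEPS = 0       # assumption: the card does not list any warmup


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 306, 15300, 30600):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate is halved by the midpoint of training (step 15300) and reaches zero at the last step, which is consistent with the very small changes in validation loss over the final epochs.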

Training results

Training Loss | Epoch | Step | Validation Loss
No log | 1.0 | 306 | 0.0856
0.1041 | 2.0 | 612 | 0.0819
0.1041 | 3.0 | 918 | 0.0795
0.0919 | 4.0 | 1224 | 0.0781
0.0876 | 5.0 | 1530 | 0.0770
0.0876 | 6.0 | 1836 | 0.0758
0.0845 | 7.0 | 2142 | 0.0751
0.0845 | 8.0 | 2448 | 0.0750
0.083 | 9.0 | 2754 | 0.0737
0.0809 | 10.0 | 3060 | 0.0732
0.0809 | 11.0 | 3366 | 0.0727
0.0802 | 12.0 | 3672 | 0.0722
0.0802 | 13.0 | 3978 | 0.0717
0.0797 | 14.0 | 4284 | 0.0721
0.078 | 15.0 | 4590 | 0.0711
0.078 | 16.0 | 4896 | 0.0707
0.0765 | 17.0 | 5202 | 0.0703
0.0774 | 18.0 | 5508 | 0.0699
0.0774 | 19.0 | 5814 | 0.0698
0.0762 | 20.0 | 6120 | 0.0696
0.0762 | 21.0 | 6426 | 0.0692
0.0756 | 22.0 | 6732 | 0.0691
0.0756 | 23.0 | 7038 | 0.0688
0.0756 | 24.0 | 7344 | 0.0687
0.075 | 25.0 | 7650 | 0.0680
0.075 | 26.0 | 7956 | 0.0680
0.0742 | 27.0 | 8262 | 0.0678
0.0738 | 28.0 | 8568 | 0.0677
0.0738 | 29.0 | 8874 | 0.0672
0.0742 | 30.0 | 9180 | 0.0673
0.0742 | 31.0 | 9486 | 0.0669
0.0733 | 32.0 | 9792 | 0.0669
0.0732 | 33.0 | 10098 | 0.0667
0.0732 | 34.0 | 10404 | 0.0664
0.0722 | 35.0 | 10710 | 0.0665
0.0728 | 36.0 | 11016 | 0.0662
0.0728 | 37.0 | 11322 | 0.0660
0.0719 | 38.0 | 11628 | 0.0659
0.0719 | 39.0 | 11934 | 0.0655
0.072 | 40.0 | 12240 | 0.0655
0.0721 | 41.0 | 12546 | 0.0654
0.0721 | 42.0 | 12852 | 0.0651
0.0711 | 43.0 | 13158 | 0.0651
0.0711 | 44.0 | 13464 | 0.0649
0.0715 | 45.0 | 13770 | 0.0651
0.0709 | 46.0 | 14076 | 0.0645
0.0709 | 47.0 | 14382 | 0.0644
0.0706 | 48.0 | 14688 | 0.0644
0.0706 | 49.0 | 14994 | 0.0642
0.0703 | 50.0 | 15300 | 0.0642
0.0706 | 51.0 | 15606 | 0.0641
0.0706 | 52.0 | 15912 | 0.0641
0.07 | 53.0 | 16218 | 0.0638
0.07 | 54.0 | 16524 | 0.0635
0.07 | 55.0 | 16830 | 0.0634
0.0695 | 56.0 | 17136 | 0.0634
0.0695 | 57.0 | 17442 | 0.0634
0.0701 | 58.0 | 17748 | 0.0633
0.0696 | 59.0 | 18054 | 0.0630
0.0696 | 60.0 | 18360 | 0.0637
0.0688 | 61.0 | 18666 | 0.0630
0.0688 | 62.0 | 18972 | 0.0629
0.0691 | 63.0 | 19278 | 0.0628
0.0692 | 64.0 | 19584 | 0.0627
0.0692 | 65.0 | 19890 | 0.0630
0.0694 | 66.0 | 20196 | 0.0625
0.0687 | 67.0 | 20502 | 0.0628
0.0687 | 68.0 | 20808 | 0.0623
0.0696 | 69.0 | 21114 | 0.0625
0.0696 | 70.0 | 21420 | 0.0624
0.0675 | 71.0 | 21726 | 0.0624
0.0688 | 72.0 | 22032 | 0.0622
0.0688 | 73.0 | 22338 | 0.0622
0.0682 | 74.0 | 22644 | 0.0621
0.0682 | 75.0 | 22950 | 0.0620
0.0683 | 76.0 | 23256 | 0.0620
0.0683 | 77.0 | 23562 | 0.0620
0.0683 | 78.0 | 23868 | 0.0620
0.0679 | 79.0 | 24174 | 0.0620
0.0679 | 80.0 | 24480 | 0.0619
0.0678 | 81.0 | 24786 | 0.0619
0.0679 | 82.0 | 25092 | 0.0618
0.0679 | 83.0 | 25398 | 0.0618
0.068 | 84.0 | 25704 | 0.0618
0.0684 | 85.0 | 26010 | 0.0617
0.0684 | 86.0 | 26316 | 0.0616
0.0676 | 87.0 | 26622 | 0.0617
0.0676 | 88.0 | 26928 | 0.0617
0.0676 | 89.0 | 27234 | 0.0617
0.0679 | 90.0 | 27540 | 0.0616
0.0679 | 91.0 | 27846 | 0.0616
0.0677 | 92.0 | 28152 | 0.0616
0.0677 | 93.0 | 28458 | 0.0616
0.067 | 94.0 | 28764 | 0.0615
0.0678 | 95.0 | 29070 | 0.0615
0.0678 | 96.0 | 29376 | 0.0615
0.067 | 97.0 | 29682 | 0.0615
0.067 | 98.0 | 29988 | 0.0615
0.0682 | 99.0 | 30294 | 0.0615
0.0681 | 100.0 | 30600 | 0.0615
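
The table above can be sanity-checked with a few lines of arithmetic. The sketch below copies four rows from it (epochs 1, 50, 94 and 100), confirms that every epoch spans the same number of optimizer steps, and computes how much the validation loss dropped over the run. The names ROWS, steps_per_epoch and relative_reduction are hypothetical and exist only for this illustration.

```python
# (epoch, step, validation_loss) for a few rows copied from the results table.
ROWS = [
    (1, 306, 0.0856),
    (50, 15300, 0.0642),
    (94, 28764, 0.0615),
    (100, 30600, 0.0615),
]


def steps_per_epoch(rows):
    """Return the number of steps per epoch, checking it is constant."""
    ratios = set()
    for epoch, step, _ in rows:
        if step % epoch != 0:
            raise ValueError(f"step {step} is not a multiple of epoch {epoch}")
        ratios.add(step // epoch)
    if len(ratios) != 1:
        raise ValueError(f"inconsistent steps per epoch: {sorted(ratios)}")
    return ratios.pop()


def relative_reduction(first: float, last: float) -> float:
    """Fractional decrease from `first` to `last`."""
    return (first - last) / first


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(ROWS))
    drop = relative_reduction(ROWS[0][2], ROWS[-1][2])
    print(f"validation loss reduction: {drop:.1%}")
```

With these rows, each epoch corresponds to 306 steps, and the validation loss falls by roughly 28% between epoch 1 (0.0856) and epoch 100 (0.0615). The final value is first reached at epoch 94 and does not change afterwards, so the last few epochs bring no measurable gain at four decimal places.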

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.0.0+cu118
  • Datasets 2.14.5
  • Tokenizers 0.14.0

Model repository: liwii/factual-consistency-regression-ja (fine-tuned from line-corporation/line-distilbert-base-japanese)