binarization-segformer-b3

This model is a fine-tuned version of nvidia/segformer-b3-1024-1024, trained on the same ensemble of 13 datasets used in the SauvolaNet work, which is publicly available in the SauvolaNet GitHub repository.

It achieves the following results on the evaluation set, measured with the DIBCO metrics:

  • loss: 0.0743
  • DRD: 5.9548
  • F-measure: 0.9840
  • pseudo F-measure: 0.9740
  • PSNR: 16.0119

Here, PSNR is the peak signal-to-noise ratio and DRD is the distance reciprocal distortion.
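As a rough illustration of two of these metrics, the sketch below computes the F-measure and PSNR for a pair of binary images with NumPy. It is not the evaluation code behind the numbers above. It assumes that `True` marks a text pixel and that the foreground/background difference in the PSNR formula is 1. The function names are hypothetical, and the pseudo F-measure and DRD are left out because they need extra weighting schemes.

```python
import numpy as np


def f_measure(pred: np.ndarray, target: np.ndarray) -> float:
    """F-measure with text (True) pixels as the positive class."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()    # text predicted as text
    fp = np.logical_and(pred, ~target).sum()   # background predicted as text
    fn = np.logical_and(~pred, target).sum()   # text predicted as background
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))


def psnr(pred: np.ndarray, target: np.ndarray) -> float:
    """PSNR in dB for images with values in {0, 1} (peak difference of 1)."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10 * np.log10(1.0 / mse))


# Toy example: a 4x4 ground truth with a 2x2 text block,
# and a prediction with one extra false-positive pixel.
target = np.zeros((4, 4), dtype=bool)
target[1:3, 1:3] = True
pred = target.copy()
pred[0, 0] = True

print(f"F-measure: {f_measure(pred, target):.4f}")
print(f"PSNR: {psnr(pred, target):.4f} dB")
```

Higher is better for both metrics, whereas lower is better for DRD.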

For more information on the above DIBCO metrics, see the 2017 introductory paper.

Model description

This model is part of ongoing research on formulating document image binarization, the task evaluated in the DIBCO benchmarks, as a pure semantic segmentation problem. This contrasts with the recent trend of adapting classical binarization algorithms with neural networks, such as DeepOtsu and SauvolaNet, which extend Otsu's method and the Sauvola thresholding algorithm, respectively.
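Under the semantic segmentation formulation, binarization amounts to predicting a label for every pixel and mapping those labels to a two-level mask. The following is a minimal inference sketch under several assumptions that this card does not confirm: that the checkpoint DiTo97/binarization-segformer-b3 ships a preprocessor configuration loadable with `SegformerImageProcessor`, that a single-logit head is thresholded at zero, and that label 1 is the positive class for a multi-label head. The helper names `logits_to_mask` and `binarize` are hypothetical.

```python
import numpy as np


def logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Turn per-pixel logits of shape (num_labels, H, W) into a boolean mask."""
    if logits.shape[0] == 1:
        # Single-logit head: sigmoid(x) > 0.5 is equivalent to x > 0.
        return logits[0] > 0
    # Multi-label head: assumes (unverified) that label 1 is the positive class.
    return logits.argmax(axis=0) == 1


def binarize(image_path: str,
             checkpoint: str = "DiTo97/binarization-segformer-b3") -> np.ndarray:
    """Run the checkpoint on one image and return a boolean (H, W) mask."""
    # Heavy dependencies are imported lazily so the helper above stays lightweight.
    import torch
    from PIL import Image
    from transformers import (
        SegformerForSemanticSegmentation,
        SegformerImageProcessor,
    )

    processor = SegformerImageProcessor.from_pretrained(checkpoint)
    model = SegformerForSemanticSegmentation.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

    # SegFormer predicts at reduced resolution; upsample to the input size.
    # PIL reports size as (width, height), so reverse it to (height, width).
    logits = torch.nn.functional.interpolate(
        logits, size=image.size[::-1], mode="bilinear", align_corners=False
    )
    return logits_to_mask(logits[0].numpy())
```

Calling `binarize("page.png")` with a hypothetical file name would return a boolean array. Whether `True` corresponds to ink or to background depends on how the training labels were encoded, so it is worth checking on a known sample before relying on the output.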

Intended uses & limitations

TBC

Training and evaluation data

TBC

Training procedure

Training hyperparameters

TBC

Training results

| training loss | epoch | step | validation loss | DRD | F-measure | pseudo F-measure | PSNR |
|---|---|---|---|---|---|---|---|
| 0.6983 | 0.26 | 10 | 0.7079 | 199.5096 | 0.5945 | 0.5801 | 3.4552 |
| 0.6657 | 0.52 | 20 | 0.6755 | 149.2346 | 0.7006 | 0.6165 | 4.6752 |
| 0.6145 | 0.77 | 30 | 0.6433 | 109.7298 | 0.7831 | 0.6520 | 5.5489 |
| 0.5553 | 1.03 | 40 | 0.5443 | 53.7149 | 0.8952 | 0.8000 | 8.1736 |
| 0.4627 | 1.29 | 50 | 0.4896 | 32.7649 | 0.9321 | 0.8603 | 9.8706 |
| 0.3969 | 1.55 | 60 | 0.4327 | 21.5508 | 0.9526 | 0.8985 | 11.3400 |
| 0.3414 | 1.81 | 70 | 0.3002 | 11.0094 | 0.9732 | 0.9462 | 13.5901 |
| 0.2898 | 2.06 | 80 | 0.2839 | 10.1064 | 0.9748 | 0.9563 | 13.9796 |
| 0.2292 | 2.32 | 90 | 0.2427 | 9.4437 | 0.9761 | 0.9584 | 14.2161 |
| 0.2153 | 2.58 | 100 | 0.2095 | 8.8696 | 0.9771 | 0.9621 | 14.4319 |
| 0.1767 | 2.84 | 110 | 0.1916 | 8.6152 | 0.9776 | 0.9646 | 14.5528 |
| 0.1509 | 3.1 | 120 | 0.1704 | 8.0761 | 0.9791 | 0.9632 | 14.7961 |
| 0.1265 | 3.35 | 130 | 0.1561 | 8.5627 | 0.9784 | 0.9655 | 14.7400 |
| 0.132 | 3.61 | 140 | 0.1318 | 8.1849 | 0.9788 | 0.9670 | 14.8469 |
| 0.1115 | 3.87 | 150 | 0.1317 | 7.8438 | 0.9790 | 0.9657 | 14.9072 |
| 0.0983 | 4.13 | 160 | 0.1273 | 7.9405 | 0.9791 | 0.9673 | 14.9701 |
| 0.1001 | 4.39 | 170 | 0.1234 | 8.4132 | 0.9788 | 0.9691 | 14.8573 |
| 0.0862 | 4.65 | 180 | 0.1147 | 8.0838 | 0.9797 | 0.9678 | 15.0433 |
| 0.0713 | 4.9 | 190 | 0.1134 | 7.6027 | 0.9806 | 0.9687 | 15.2235 |
| 0.0905 | 5.16 | 200 | 0.1061 | 7.2973 | 0.9803 | 0.9699 | 15.1646 |
| 0.0902 | 5.42 | 210 | 0.1061 | 8.4049 | 0.9787 | 0.9699 | 14.8460 |
| 0.0759 | 5.68 | 220 | 0.1062 | 7.7147 | 0.9809 | 0.9695 | 15.2426 |
| 0.0638 | 5.94 | 230 | 0.1019 | 7.7449 | 0.9806 | 0.9695 | 15.2195 |
| 0.0852 | 6.19 | 240 | 0.0962 | 7.0221 | 0.9817 | 0.9693 | 15.4730 |
| 0.0677 | 6.45 | 250 | 0.0961 | 7.2520 | 0.9814 | 0.9710 | 15.3878 |
| 0.0668 | 6.71 | 260 | 0.0972 | 6.6658 | 0.9823 | 0.9689 | 15.6106 |
| 0.0701 | 6.97 | 270 | 0.0909 | 6.9454 | 0.9820 | 0.9713 | 15.5458 |
| 0.0567 | 7.23 | 280 | 0.0925 | 6.5498 | 0.9824 | 0.9718 | 15.5965 |
| 0.0624 | 7.48 | 290 | 0.0899 | 7.3125 | 0.9813 | 0.9717 | 15.3255 |
| 0.0649 | 7.74 | 300 | 0.0932 | 7.4915 | 0.9816 | 0.9684 | 15.5666 |
| 0.0524 | 8.0 | 310 | 0.0905 | 7.1666 | 0.9815 | 0.9711 | 15.4526 |
| 0.0693 | 8.26 | 320 | 0.0901 | 6.5627 | 0.9827 | 0.9704 | 15.7335 |
| 0.0528 | 8.52 | 330 | 0.0845 | 6.6690 | 0.9826 | 0.9734 | 15.5950 |
| 0.0632 | 8.77 | 340 | 0.0822 | 6.2661 | 0.9833 | 0.9723 | 15.8631 |
| 0.0522 | 9.03 | 350 | 0.0844 | 6.0073 | 0.9836 | 0.9715 | 15.9393 |
| 0.0568 | 9.29 | 360 | 0.0817 | 5.9460 | 0.9837 | 0.9721 | 15.9523 |
| 0.057 | 9.55 | 370 | 0.0900 | 7.9726 | 0.9812 | 0.9730 | 15.1229 |
| 0.052 | 9.81 | 380 | 0.0836 | 6.5444 | 0.9822 | 0.9712 | 15.6388 |
| 0.0568 | 10.06 | 390 | 0.0810 | 6.0359 | 0.9836 | 0.9714 | 15.9796 |
| 0.0481 | 10.32 | 400 | 0.0784 | 6.2110 | 0.9835 | 0.9724 | 15.9235 |
| 0.0513 | 10.58 | 410 | 0.0803 | 6.0990 | 0.9835 | 0.9715 | 15.9502 |
| 0.0595 | 10.84 | 420 | 0.0798 | 6.0829 | 0.9835 | 0.9720 | 15.9052 |
| 0.047 | 11.1 | 430 | 0.0779 | 5.8847 | 0.9838 | 0.9725 | 16.0043 |
| 0.0406 | 11.35 | 440 | 0.0802 | 5.7944 | 0.9838 | 0.9713 | 16.0620 |
| 0.0493 | 11.61 | 450 | 0.0781 | 6.0947 | 0.9836 | 0.9731 | 15.9033 |
| 0.064 | 11.87 | 460 | 0.0769 | 6.1257 | 0.9837 | 0.9736 | 15.9080 |
| 0.0622 | 12.13 | 470 | 0.0765 | 6.2964 | 0.9835 | 0.9739 | 15.8188 |
| 0.0457 | 12.39 | 480 | 0.0773 | 5.9826 | 0.9838 | 0.9728 | 16.0119 |
| 0.0447 | 12.65 | 490 | 0.0761 | 5.7977 | 0.9841 | 0.9728 | 16.0900 |
| 0.0515 | 12.9 | 500 | 0.0750 | 5.8569 | 0.9840 | 0.9729 | 16.0633 |
| 0.0357 | 13.16 | 510 | 0.0796 | 5.7990 | 0.9837 | 0.9713 | 16.0818 |
| 0.0503 | 13.42 | 520 | 0.0749 | 5.8323 | 0.9841 | 0.9736 | 16.0510 |
| 0.0508 | 13.68 | 530 | 0.0746 | 6.0361 | 0.9839 | 0.9735 | 15.9709 |
| 0.0533 | 13.94 | 540 | 0.0768 | 6.1596 | 0.9836 | 0.9740 | 15.9193 |
| 0.0503 | 14.19 | 550 | 0.0739 | 5.5900 | 0.9843 | 0.9723 | 16.1883 |
| 0.0515 | 14.45 | 560 | 0.0740 | 5.4660 | 0.9845 | 0.9727 | 16.2745 |
| 0.0502 | 14.71 | 570 | 0.0740 | 5.5895 | 0.9844 | 0.9736 | 16.2054 |
| 0.0401 | 14.97 | 580 | 0.0741 | 5.9694 | 0.9840 | 0.9747 | 15.9603 |
| 0.0495 | 15.23 | 590 | 0.0745 | 5.9136 | 0.9841 | 0.9740 | 16.0458 |
| 0.0413 | 15.48 | 600 | 0.0743 | 5.9548 | 0.9840 | 0.9740 | 16.0119 |

Framework versions

  • transformers 4.31.0
  • torch 2.0.0
  • datasets 2.13.1
  • tokenizers 0.13.3

Model size: 47.2M params (Safetensors, tensor type F32)