---
license: other
base_model: nvidia/mit-b0
tags:
- generated_from_trainer
model-index:
- name: image_segmentation_text_v2
  results: []
---
# image_segmentation_text_v2

This model is a fine-tuned version of [nvidia/mit-b0](https://huggingface.co/nvidia/mit-b0) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2273
- Mean Iou: 0.7888
- Mean Accuracy: 0.8815
- Overall Accuracy: 0.9466
- Per Category Iou: [0.9411436688097194, 0.6365339140286779]
- Per Category Accuracy: [0.9666298272627839, 0.7963078307432842]
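The checkpoint is a SegFormer semantic-segmentation model fine-tuned from `nvidia/mit-b0`. Below is a minimal inference sketch; the checkpoint identifier and the input image path are placeholders (assumptions), not values taken from this card.

```python
# Minimal inference sketch for this SegFormer checkpoint.
# The checkpoint id and image path are hypothetical placeholders.
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

checkpoint = "image_segmentation_text_v2"  # hypothetical local path or Hub id
processor = SegformerImageProcessor.from_pretrained(checkpoint)
model = SegformerForSemanticSegmentation.from_pretrained(checkpoint)
model.eval()

image = Image.open("example.png").convert("RGB")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels, H/4, W/4)

# Upsample the logits to the original image size and take the per-pixel argmax.
upsampled = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
pred_mask = upsampled.argmax(dim=1)[0]  # (H, W) tensor of class indices
```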
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list for how they map onto `TrainingArguments`):
- learning_rate: 6e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
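As a rough illustration only, the listed hyperparameters correspond to a `transformers.TrainingArguments` configuration like the one below; the output directory and the rest of the trainer wiring are assumptions and are not part of this card.

```python
# Hedged sketch: mapping the hyperparameters above onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="image_segmentation_text_v2",  # hypothetical output directory
    learning_rate=6e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```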
### Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:----------------:|:---------------------:|
| 0.6854        | 0.4   | 20   | 0.6497          | 0.5097   | 0.8417        | 0.7443           | [0.7115032459977992, 0.3077993102868545]  | [0.7144421527142827, 0.9689304620217858] |
| 0.5904        | 0.8   | 40   | 0.5143          | 0.5981   | 0.8004        | 0.8478           | [0.8333935175312267, 0.3627987944393932]  | [0.862384168080342, 0.7383421777739057]  |
| 0.4687        | 1.2   | 60   | 0.4546          | 0.6050   | 0.8161        | 0.8493           | [0.8342992363445556, 0.37569323794278087] | [0.8595097956337959, 0.7727067629087438] |
| 0.424         | 1.6   | 80   | 0.3844          | 0.6584   | 0.8272        | 0.8881           | [0.8773645633114495, 0.43953238690068724] | [0.9067963946134603, 0.7476733768251818] |
| 0.3485        | 2.0   | 100  | 0.3535          | 0.6942   | 0.8808        | 0.8995           | [0.8882922581800861, 0.5000111990627714]  | [0.9052579907882474, 0.8563377644392374] |
| 0.3402        | 2.4   | 120  | 0.3424          | 0.7021   | 0.9165        | 0.8977           | [0.8850157203428243, 0.5191490567411473]  | [0.891946363487535, 0.9410955056545037]  |
| 0.3037        | 2.8   | 140  | 0.3071          | 0.7357   | 0.9150        | 0.9173           | [0.9073510172037412, 0.5639900768760655]  | [0.9179667207588744, 0.9119966243321158] |
| 0.2669        | 3.2   | 160  | 0.2528          | 0.7659   | 0.8552        | 0.9410           | [0.9353899879693552, 0.5964963563086936]  | [0.9673518346317019, 0.7429811204460667] |
| 0.225         | 3.6   | 180  | 0.2523          | 0.7695   | 0.8894        | 0.9375           | [0.9308386361574114, 0.6082593419593063]  | [0.9523221012048093, 0.826397566814428]  |
| 0.2703        | 4.0   | 200  | 0.2430          | 0.7812   | 0.8943        | 0.9419           | [0.9355827639904635, 0.6268385945303665]  | [0.9564666637969618, 0.8320982324555644] |
| 0.2607        | 4.4   | 220  | 0.2352          | 0.7875   | 0.8918        | 0.9448           | [0.9389026387973957, 0.6361696692483538]  | [0.9610553391463873, 0.822527121722995]  |
| 0.2314        | 4.8   | 240  | 0.2273          | 0.7888   | 0.8815        | 0.9466           | [0.9411436688097194, 0.6365339140286779]  | [0.9666298272627839, 0.7963078307432842] |
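The metric columns above (Mean Iou, Mean Accuracy, Overall Accuracy, and the two per-category entries) can be reproduced with the `evaluate` library's `mean_iou` metric. The sketch below is illustrative only: the two-class label count is inferred from the per-category lists, and the toy masks and `ignore_index` value are assumptions.

```python
# Hedged sketch of computing the reported segmentation metrics with evaluate's
# "mean_iou" metric. The masks here are toy placeholders with two classes.
import numpy as np
import evaluate

metric = evaluate.load("mean_iou")

pred_mask = np.array([[0, 0, 1], [0, 1, 1]])  # toy predicted segmentation map
ref_mask = np.array([[0, 1, 1], [0, 1, 1]])   # toy reference segmentation map

results = metric.compute(
    predictions=[pred_mask],
    references=[ref_mask],
    num_labels=2,       # inferred from the two per-category entries above
    ignore_index=255,   # assumed convention for unlabeled pixels
    reduce_labels=False,
)
print(results["mean_iou"], results["mean_accuracy"], results["overall_accuracy"])
print(results["per_category_iou"], results["per_category_accuracy"])
```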
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0