---
license: other
base_model: nvidia/mit-b0
tags:
- generated_from_trainer
model-index:
- name: image_segmentation_text_v2
  results: []
---

# image_segmentation_text_v2

This model is a fine-tuned version of [nvidia/mit-b0](https://huggingface.co/nvidia/mit-b0) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2273
- Mean Iou: 0.7888
- Mean Accuracy: 0.8815
- Overall Accuracy: 0.9466
- Per Category Iou: [0.9411436688097194, 0.6365339140286779]
- Per Category Accuracy: [0.9666298272627839, 0.7963078307432842]

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:-----------------------------------------:|:----------------------------------------:|
| 0.6854        | 0.4   | 20   | 0.6497          | 0.5097   | 0.8417        | 0.7443           | [0.7115032459977992, 0.3077993102868545]  | [0.7144421527142827, 0.9689304620217858] |
| 0.5904        | 0.8   | 40   | 0.5143          | 0.5981   | 0.8004        | 0.8478           | [0.8333935175312267, 0.3627987944393932]  | [0.862384168080342, 0.7383421777739057]  |
| 0.4687        | 1.2   | 60   | 0.4546          | 0.6050   | 0.8161        | 0.8493           | [0.8342992363445556, 0.37569323794278087] | [0.8595097956337959, 0.7727067629087438] |
| 0.424         | 1.6   | 80   | 0.3844          | 0.6584   | 0.8272        | 0.8881           | [0.8773645633114495, 0.43953238690068724] | [0.9067963946134603, 0.7476733768251818] |
| 0.3485        | 2.0   | 100  | 0.3535          | 0.6942   | 0.8808        | 0.8995           | [0.8882922581800861, 0.5000111990627714]  | [0.9052579907882474, 0.8563377644392374] |
| 0.3402        | 2.4   | 120  | 0.3424          | 0.7021   | 0.9165        | 0.8977           | [0.8850157203428243, 0.5191490567411473]  | [0.891946363487535, 0.9410955056545037]  |
| 0.3037        | 2.8   | 140  | 0.3071          | 0.7357   | 0.9150        | 0.9173           | [0.9073510172037412, 0.5639900768760655]  | [0.9179667207588744, 0.9119966243321158] |
| 0.2669        | 3.2   | 160  | 0.2528          | 0.7659   | 0.8552        | 0.9410           | [0.9353899879693552, 0.5964963563086936]  | [0.9673518346317019, 0.7429811204460667] |
| 0.225         | 3.6   | 180  | 0.2523          | 0.7695   | 0.8894        | 0.9375           | [0.9308386361574114, 0.6082593419593063]  | [0.9523221012048093, 0.826397566814428]  |
| 0.2703        | 4.0   | 200  | 0.2430          | 0.7812   | 0.8943        | 0.9419           | [0.9355827639904635, 0.6268385945303665]  | [0.9564666637969618, 0.8320982324555644] |
| 0.2607        | 4.4   | 220  | 0.2352          | 0.7875   | 0.8918        | 0.9448           | [0.9389026387973957, 0.6361696692483538]  | [0.9610553391463873, 0.822527121722995]  |
| 0.2314        | 4.8   | 240  | 0.2273          | 0.7888   | 0.8815        | 0.9466           | [0.9411436688097194, 0.6365339140286779]  | [0.9666298272627839, 0.7963078307432842] |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
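
The hyperparameters listed under "Training hyperparameters" above map onto the `transformers` `Trainer` API roughly as sketched below. This is a hedged reconstruction, not the exact training script: the dataset, image processor, and metric wiring are omitted, `num_labels=2` is inferred from the two per-category scores, and `output_dir` is a placeholder.

```python
# Hedged sketch of the reported hyperparameters expressed as TrainingArguments.
# Dataset, collator, and compute_metrics wiring are omitted and must be supplied.
from transformers import (
    SegformerForSemanticSegmentation,
    Trainer,
    TrainingArguments,
)

model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0",
    num_labels=2,  # assumption: two classes, matching the two per-category scores
)

training_args = TrainingArguments(
    output_dir="image_segmentation_text_v2",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    num_train_epochs=5,
    evaluation_strategy="steps",
    eval_steps=20,  # matches the 20-step evaluation cadence in the results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    # train_dataset=..., eval_dataset=..., compute_metrics=... (not shown)
)
```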
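
Since no usage snippet is provided, the following is a minimal inference sketch. It assumes the checkpoint is published on the Hub under a repo id such as `your-username/image_segmentation_text_v2` (a placeholder) and uses the standard SegFormer semantic-segmentation head inherited from the `nvidia/mit-b0` base.

```python
# Minimal inference sketch; the repo id below is a hypothetical placeholder.
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

repo_id = "your-username/image_segmentation_text_v2"  # placeholder repo id
processor = SegformerImageProcessor.from_pretrained(repo_id)
model = SegformerForSemanticSegmentation.from_pretrained(repo_id)
model.eval()

image = Image.open("example.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

# Upsample logits to the original image size and take the per-pixel argmax.
upsampled = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
segmentation = upsampled.argmax(dim=1)[0]  # (H, W) tensor of class indices
```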