metadata

license: apache-2.0
base_model: google/vit-base-patch16-224-in21k
tags:
  - image-classification
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: finetuned-clothes
    results: []

finetuned-clothes

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the clothes_simplifiedv2 dataset. It achieves the following results on the evaluation set:

Loss: 0.2225
Accuracy: 0.9417

Model description

This model predicts the clothing category of the item shown in an input image.

Intended uses

You can run the model in a Jupyter notebook:

import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Download the image to classify
url = 'insert image url here'
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the fine-tuned model from the Hub
repo_name = "samokosik/finetuned-clothes"
image_processor = AutoImageProcessor.from_pretrained(repo_name)
model = AutoModelForImageClassification.from_pretrained(repo_name)

# Convert to RGB and preprocess the image into a batch of pixel values
encoding = image_processor(image.convert("RGB"), return_tensors="pt")
print(encoding.pixel_values.shape)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**encoding)
    logits = outputs.logits

# Map the highest-scoring logit to its class label
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
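
The snippet above prints only the single most likely class. If you also want to see how confident the model is, the logits can be turned into probabilities with a softmax. The following is a minimal sketch, not part of the original card: the helper name `top_k_predictions` is hypothetical, and the example logits and label order are made up for illustration (the real mapping lives in `model.config.id2label`).

```python
import math


def top_k_predictions(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs, highest first.

    `logits` is a plain list of floats for ONE image and `id2label`
    maps class index -> label name.
    """
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sort class indices by probability, highest first, and keep the top k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits and label order, for illustration only.
example_logits = [2.0, 0.5, -1.0]
example_labels = {0: "hat", 1: "pants", 2: "shoes"}
for label, prob in top_k_predictions(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, a call along the lines of `top_k_predictions(logits[0].tolist(), model.config.id2label)` should work, since `logits` has one row per input image.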

Limitations

Due to a lack of available data, the model supports only the following categories: hat, longsleeve, outswear, pants, shoes, shorts, shortsleve.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 4
mixed_precision_training: Native AMP
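
To make the `linear` scheduler setting concrete, here is a rough sketch of how the learning rate would decay over the run. It rests on two assumptions that the card does not state: zero warmup steps (none are listed), and about 486 optimizer steps per epoch, which is inferred from the training results (step 100 is logged at epoch 0.2058). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 486  # assumption: inferred from the results table, not stated
NUM_EPOCHS = 4         # from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-05, is halved at the midpoint of training, and reaches zero at the final step.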

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.7725	0.2058	100	0.7008	0.8178
0.5535	0.4115	200	0.4494	0.8994
0.4334	0.6173	300	0.3649	0.9169
0.3921	0.8230	400	0.3085	0.9184
0.3695	1.0288	500	0.3091	0.9184
0.2634	1.2346	600	0.3339	0.9082
0.4788	1.4403	700	0.2827	0.9257
0.3337	1.6461	800	0.2499	0.9344
0.34	1.8519	900	0.2586	0.9315
0.2424	2.0576	1000	0.2248	0.9402
0.1559	2.2634	1100	0.2333	0.9344
0.351	2.4691	1200	0.2495	0.9359
0.2206	2.6749	1300	0.2622	0.9242
0.3814	2.8807	1400	0.3138	0.9155
0.2141	3.0864	1500	0.2613	0.9315
0.112	3.2922	1600	0.2266	0.9402
0.0631	3.4979	1700	0.2255	0.9402
0.1986	3.7037	1800	0.2225	0.9417
0.2345	3.9095	1900	0.2235	0.9373
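
The evaluation results reported at the top of this card (loss 0.2225, accuracy 0.9417) correspond to the row with the lowest validation loss in the table above. A small sketch of how that row can be picked out programmatically is shown below. The tuple layout and the variable names are illustrative, and the values are copied from the table.

```python
# (step, validation_loss, accuracy), copied from the training results table.
eval_rows = [
    (100, 0.7008, 0.8178), (200, 0.4494, 0.8994), (300, 0.3649, 0.9169),
    (400, 0.3085, 0.9184), (500, 0.3091, 0.9184), (600, 0.3339, 0.9082),
    (700, 0.2827, 0.9257), (800, 0.2499, 0.9344), (900, 0.2586, 0.9315),
    (1000, 0.2248, 0.9402), (1100, 0.2333, 0.9344), (1200, 0.2495, 0.9359),
    (1300, 0.2622, 0.9242), (1400, 0.3138, 0.9155), (1500, 0.2613, 0.9315),
    (1600, 0.2266, 0.9402), (1700, 0.2255, 0.9402), (1800, 0.2225, 0.9417),
    (1900, 0.2235, 0.9373),
]

# Pick the evaluation with the lowest validation loss.
best_by_loss = min(eval_rows, key=lambda row: row[1])
# Pick the evaluation with the highest accuracy.
best_by_accuracy = max(eval_rows, key=lambda row: row[2])

print("Lowest validation loss:", best_by_loss)    # (1800, 0.2225, 0.9417)
print("Highest accuracy:", best_by_accuracy)      # (1800, 0.2225, 0.9417)
```

Both criteria select step 1800, one evaluation before the last logged step.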

Framework versions

Transformers 4.40.1
PyTorch 2.2.1+cu121
Datasets 2.19.0
Tokenizers 0.19.1

Training dataset

This model was trained on the following dataset: https://huggingface.co/datasets/samokosik/clothes_simplifiedv2