vit-l-16-food-classifier

This model is a fine-tuned version of google/vit-large-patch16-384, trained on the 30VNFOODS dataset and a custom Vietnamese food dataset crawled from Google and ShopeeFood. It achieves the following results on the evaluation set:

Loss: 0.3214
Accuracy: 0.9155

Model description

ViT-L/16 (google/vit-large-patch16-384) fine-tuned for 3 epochs to classify food images.

Intended uses & limitations

This model is used for our project at UET-VNU Campathon 2024.
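For readers who want to try the classifier, the following is a minimal inference sketch rather than the authors' own code. It assumes the checkpoint is published on the Hugging Face Hub under the repo id VietTung04/vit-l-16-food-classifier and that the `transformers` library and an image backend such as Pillow are installed. The helper names `classify` and `top_label` and the example file name are hypothetical.

```python
# Assumed Hub repo id for this checkpoint; adjust if the model lives elsewhere.
MODEL_ID = "VietTung04/vit-l-16-food-classifier"


def top_label(predictions):
    """Return (label, score) for the highest-scoring prediction.

    `predictions` is a list of {"label": str, "score": float} dicts,
    which is the format the image-classification pipeline returns.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(image_path, top_k=5):
    """Run the fine-tuned ViT on one image (local path or URL)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image_path, top_k=top_k)


if __name__ == "__main__":
    # "pho.jpg" is a placeholder file name; replace it with a real image.
    predictions = classify("pho.jpg")
    print(top_label(predictions))
```

The pipeline loads the image processor stored with the checkpoint, so resizing to the 384x384 input resolution should not need to be done by hand.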

Training and evaluation data

The 30VNFOODS dataset and a self-crawled dataset of Vietnamese food images from Google and ShopeeFood.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3
mixed_precision_training: Native AMP
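The hyperparameters above imply an effective batch size and a learning-rate curve that can be worked out by hand. The sketch below reproduces that arithmetic in plain Python. It assumes a single training device (the card does not state the device count), takes the 879 total optimizer steps from the training results table, and assumes the warmup step count is the warmup ratio times the total steps, rounded up. The function name `linear_lr` is illustrative, not part of any library.

```python
import math

PER_DEVICE_BATCH = 16   # train_batch_size
GRAD_ACCUM_STEPS = 4    # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: device count is not stated in the card
PEAK_LR = 2e-5          # learning_rate
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio
TOTAL_STEPS = 879       # last step listed in the training results table

# Samples consumed per optimizer update: 16 * 4 * 1 = 64,
# which matches total_train_batch_size.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Warmup length, rounded up: ceil(879 * 0.1) = 88 steps.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    print("warmup steps:", warmup_steps)
    for step in (0, 44, 88, 293, 586, 879):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate climbs from 0 to 2e-05 over the first 88 steps, so most of the first epoch (293 steps) already runs on the decaying part of the schedule.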

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	293	0.4138	0.8999
0.8179	2.0	586	0.3482	0.9151
0.8179	3.0	879	0.3214	0.9155

Framework versions

Transformers 4.42.3
PyTorch 2.1.2
Datasets 2.20.0
Tokenizers 0.19.1
