microsoft-swinv2-small-patch4-window16-256-finetuned-xblockm

This model is a fine-tuned version of microsoft/swinv2-small-patch4-window16-256 on the howdyaendra/xblock-social-screenshots dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1252
  • Roc Auc: 0.9535

Model description

This model is trained on several thousand screenshots reported to XBlock, a third-party Bluesky labeller service. It is intended to label Bluesky posts that embed screenshots from social media sites. See also aendra-rininsland/xblock.

Intended uses & limitations

Screenshot moderation
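
A minimal sketch of how a screenshot might be scored with this checkpoint using the standard Transformers image-classification classes. It assumes the model is multi-label (suggested by the ROC AUC metric, but not stated in this card), so each logit is passed through its own sigmoid. The 0.5 threshold and the helper names `select_labels` and `classify_screenshot` are illustrative, not part of the original project.

```python
import math

MODEL_ID = "howdyaendra/microsoft-swinv2-small-patch4-window16-256-finetuned-xblockm"


def _sigmoid(x):
    # Numerically stable logistic function for a single float.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_labels(logits, id2label, threshold=0.5):
    """Score each label independently and keep those at or above the threshold.

    Assumption: multi-label head, so a per-label sigmoid is used instead of softmax.
    """
    scores = {id2label[i]: _sigmoid(x) for i, x in enumerate(logits)}
    return {label: score for label, score in scores.items() if score >= threshold}


def classify_screenshot(path, threshold=0.5):
    """Load the checkpoint and return {label: score} for one image file."""
    # Heavy dependencies are imported lazily so the helper above stays standalone.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageClassification.from_pretrained(MODEL_ID)
    model.eval()

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return select_labels(logits, model.config.id2label, threshold)
```

Calling `classify_screenshot("screenshot.png")` (a hypothetical file name) would download the weights on first use. The threshold is worth tuning per label against the evaluation split rather than leaving at 0.5.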

Training and evaluation data

A 20% split of the 1618 images is held out for evaluation; the remaining 80% is used for training.

Training procedure

See notebook.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 8
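
The list above maps onto Transformers `TrainingArguments` roughly as sketched below. This assumes a single device, so that `train_batch_size` corresponds to `per_device_train_batch_size` (consistent with 16 x 4 accumulation steps = 64). The `output_dir` value and the `build_training_arguments` helper are hypothetical, and any settings not listed in this card are left at their defaults.

```python
# Hyperparameters from this card, expressed as TrainingArguments keyword names.
HYPERPARAMS = dict(
    learning_rate=5e-5,
    per_device_train_batch_size=16,   # assumption: one device
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
)

# Effective batch size per optimizer step.
TOTAL_TRAIN_BATCH_SIZE = (
    HYPERPARAMS["per_device_train_batch_size"]
    * HYPERPARAMS["gradient_accumulation_steps"]
)


def build_training_arguments(output_dir="swinv2-xblockm"):
    """Create TrainingArguments from the values above (output_dir is a placeholder)."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```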

Training results

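| Training Loss | Epoch  | Step | Validation Loss | Roc Auc |
|---------------|--------|------|-----------------|---------|
| 0.4357        | 0.9877 | 20   | 0.2544          | 0.7784  |
| 0.2027        | 1.9753 | 40   | 0.2016          | 0.8431  |
| 0.1743        | 2.9630 | 60   | 0.1701          | 0.8912  |
| 0.1625        | 4.0    | 81   | 0.1677          | 0.9083  |
| 0.1321        | 4.9877 | 101  | 0.1447          | 0.9246  |
| 0.1155        | 5.9753 | 121  | 0.1418          | 0.9311  |
| 0.0959        | 6.9630 | 141  | 0.1381          | 0.9460  |
| 0.0788        | 7.9012 | 160  | 0.1252          | 0.9535  |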
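
The fractional epoch values in these results can be reproduced from the dataset size and the hyperparameters. The sketch below is a consistency check, not the original training code. It assumes the 20% evaluation split is rounded up to a whole image and that partial gradient-accumulation groups at the end of an epoch are floored when counting optimizer steps.

```python
import math

total_images = 1618
eval_fraction = 0.20
train_batch_size = 16
grad_accum_steps = 4
num_epochs = 8

# Assumption: the evaluation split size is rounded up to a whole image.
eval_images = math.ceil(total_images * eval_fraction)             # 324
train_images = total_images - eval_images                         # 1294

# Mini-batches per pass over the training data (last batch may be partial).
batches_per_epoch = math.ceil(train_images / train_batch_size)    # 81

# Assumption: optimizer steps per epoch are floored, not rounded up.
steps_per_epoch = batches_per_epoch // grad_accum_steps           # 20
total_steps = steps_per_epoch * num_epochs                        # 160


def epoch_at_step(step):
    """Fraction of epochs completed after a given number of optimizer steps."""
    return step * grad_accum_steps / batches_per_epoch


for step in (20, 40, 60, 81, 101, 121, 141, 160):
    print(step, round(epoch_at_step(step), 4))
```

Under these assumptions the run ends at step 160, slightly short of 8 full epochs, because 81 mini-batches per epoch is not a multiple of 4.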

Framework versions

  • Transformers 4.44.1
  • Pytorch 2.2.2
  • Datasets 3.0.1
  • Tokenizers 0.19.1

Model size: 49M params (Safetensors, tensor type F32)