---
license: apache-2.0
tags:
- self-supervised learning
- vision
- SiT
inference: false
---
# Model description
SiT (Self-supervised vIsion Transformer) is a self-supervised learning model that combines masked image modeling and contrastive learning. The model is pretrained on ImageNet-1K.
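The two objectives can be thought of as a patch-level reconstruction loss on corrupted patches plus an image-level contrastive loss between embeddings of augmented views. The sketch below only illustrates that combination with placeholder tensors; the tensor names, shapes, loss choices, and temperature are illustrative assumptions, not the repository's implementation.
```python
# Minimal sketch of the two training signals SiT combines (placeholder data, not the repo's code).
import torch
import torch.nn.functional as F

batch, num_patches, dim = 8, 196, 768
encoder_out = torch.randn(batch, num_patches, dim)      # hypothetical encoder output
target_patches = torch.randn(batch, num_patches, dim)   # targets for the corrupted patches
mask = torch.rand(batch, num_patches) < 0.5             # which patches were masked/corrupted

# 1) Masked image modeling: reconstruct the corrupted patches.
reconstruction = encoder_out                            # stand-in for a reconstruction head
recon_loss = F.l1_loss(reconstruction[mask], target_patches[mask])

# 2) Contrastive learning: pull global embeddings of two views of the same image together.
view_a = F.normalize(encoder_out.mean(dim=1), dim=-1)   # global embedding, view 1
view_b = F.normalize(torch.randn(batch, dim), dim=-1)   # global embedding, view 2 (placeholder)
logits = view_a @ view_b.t() / 0.1                      # temperature-scaled similarities
contrastive_loss = F.cross_entropy(logits, torch.arange(batch))

# Joint self-supervised objective.
loss = recon_loss + contrastive_loss
```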
# Model Sources
- Repository: https://github.com/Sara-Ahmed/SiT
- Paper: https://arxiv.org/abs/2104.03602
# Model Card Authors
Sara Atito, Muhammad Awais, Josef Kittler
# How to use
```python
# modeling_sit.py is provided in the model repository and defines ViTSiTForPreTraining.
from modeling_sit import ViTSiTForPreTraining

# Load the pretrained SiT weights from the Hugging Face Hub.
model = ViTSiTForPreTraining.from_pretrained("erow/SiT")
```
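After loading, a forward pass presumably follows the usual ViT convention of taking preprocessed pixel values; the exact input resolution and output fields are defined in the repository's modeling_sit.py. A minimal sketch under that assumption:
```python
import torch

# Placeholder for a preprocessed image batch (assumes 224x224 RGB input, as in standard ViT).
pixel_values = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    outputs = model(pixel_values)
```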
# BibTeX entry and citation info
```bibtex
@inproceedings{atito2023sit,
  title={SiT is all you need},
  author={Atito, Sara and Awais, Muhammed and Nandam, Srinivasa and Kittler, Josef},
  booktitle={2023 IEEE International Conference on Image Processing (ICIP)},
  pages={2125--2129},
  year={2023},
  organization={IEEE}
}
```