Ramos-Ramos
/

dino-resnet-50

Image Feature Extraction

Inference Endpoints

Model card Files Files and versions Community

dino-resnet-50 / README.md

patrickramos's picture

Update README.md

df82fcd about 2 years ago

|

history blame contribute delete

2.65 kB

	---
	tags:
	- dino
	- vision

	datasets:
	- imagenet-1k
	---

	# DINO ResNet-50

	ResNet-50 pretrained with DINO. DINO was introduced in [Emerging Properties in Self-Supervised Vision Transformers](https://arxiv.org/abs/2104.14294), while ResNet was introduced in [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385). The official implementation of a DINO Resnet-50 can be found [here](https://github.com/facebookresearch/dino).

	Weights converted from the official [DINO ResNet](https://github.com/facebookresearch/dino#pretrained-models-on-pytorch-hub) using [this script](https://colab.research.google.com/drive/1Ax3IDoFPOgRv4l7u6uS8vrPf4TX827BK?usp=sharing).

	For up-to-date model card information, please see the [original repo](https://github.com/facebookresearch/dino).

	### How to use

	Warning: The feature extractor in this repo is a copy of the one from [`microsoft/resnet-50`](https://huggingface.co/microsoft/resnet-50). We never verified if this image prerprocessing is the one used with DINO ResNet-50.

	```python
	from transformers import AutoFeatureExtractor, ResNetModel
	from PIL import Image
	import requests

	url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
	image = Image.open(requests.get(url, stream=True).raw)

	feature_extractor = AutoFeatureExtractor.from_pretrained('Ramos-Ramos/dino-resnet-50')
	model = ResNetModel.from_pretrained('Ramos-Ramos/dino-resnet-50')
	inputs = feature_extractor(images=image, return_tensors="pt")
	outputs = model(**inputs)
	last_hidden_states = outputs.last_hidden_state
	```

	### BibTeX entry and citation info

	```bibtex
	@article{DBLP:journals/corr/abs-2104-14294,
	author = {Mathilde Caron and
	Hugo Touvron and
	Ishan Misra and
	Herv{\'{e}} J{\'{e}}gou and
	Julien Mairal and
	Piotr Bojanowski and
	Armand Joulin},
	title = {Emerging Properties in Self-Supervised Vision Transformers},
	journal = {CoRR},
	volume = {abs/2104.14294},
	year = {2021},
	url = {https://arxiv.org/abs/2104.14294},
	archivePrefix = {arXiv},
	eprint = {2104.14294},
	timestamp = {Tue, 04 May 2021 15:12:43 +0200},
	biburl = {https://dblp.org/rec/journals/corr/abs-2104-14294.bib},
	bibsource = {dblp computer science bibliography, https://dblp.org}
	}
	```

	```bibtex
	@inproceedings{he2016deep,
	title={Deep residual learning for image recognition},
	author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
	booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
	pages={770--778},
	year={2016}
	}
	```