---
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
---

# **GenView Pretrained Models**

## Model Name

**GenView: Enhancing View Quality with Pretrained Generative Model**

### Summary

This repository hosts pretrained models from the GenView framework, introduced in the ECCV 2024 paper *GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning*. The models target visual representation learning and can be used for image classification, multimodal learning, and feature extraction. GenView uses pretrained generative models to improve the quality and diversity of the views used in self-supervised learning.

---

## Table of Contents

1. [Model Details](#model-details)
2. [Evaluation](#evaluation)
3. [Citation](#citation)
4. [How to Download the Model](#how-to-download-the-model)

---

## Model Details

### **Model Description**

The GenView pretrained models include both convolutional backbones (e.g., ResNet-50) and transformer backbones (e.g., ViT-S and ViT-B). They are pretrained with self-supervised learning methods such as MoCo, SwAV, SimSiam, and BYOL. By using generative models for adaptive view generation, the framework aims to learn stronger feature representations.

- **Developed by:** Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang
- **Funded by:** Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory; KAUST; NTU
- **Shared by:** Xiaojie Li
- **Model type:** Self-supervised learning for vision tasks
- **Language:** Vision-focused (not language-specific)
- **License:** Apache 2.0

### **Model Sources**

- **Hugging Face Repository:** [GenView Pretrained Models](https://huggingface.co/Xiaojie0903/genview_pretrained_models)
- **GitHub Repository:** [GenView Official Code](https://github.com/xiaojieli0903/genview/)
- **Paper:** [GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning (ECCV 2024)](https://arxiv.org/abs/2403.12003)

---

## Evaluation

### **Testing Data**

Linear-probe evaluation was conducted on the ImageNet-1K dataset.

### **Metrics**

The models were evaluated by Top-1 accuracy.
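
For readers unfamiliar with the metric, the snippet below is a minimal sketch of how Top-1 accuracy is usually computed: a sample counts as correct when the class with the highest score equals the ground-truth label. The function name `top1_accuracy` and the toy scores are illustrative assumptions and are not taken from the GenView code base.

```python
def top1_accuracy(scores, labels):
    """Return Top-1 accuracy in percent.

    scores: one list of per-class scores for each sample.
    labels: the ground-truth class index for each sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for sample_scores, label in zip(scores, labels):
        # The predicted class is the index of the highest score.
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy example with three samples and three classes (made-up numbers).
# Two of the three predictions match their labels.
toy_scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
toy_labels = [1, 0, 1]
print(f"Top-1 accuracy: {top1_accuracy(toy_scores, toy_labels):.1f}%")
```

In a linear-probe setting, the scores would come from a linear classifier trained on top of the frozen pretrained backbone.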

### **Results**

| Method            | Backbone  | Pretraining Epochs | Linear Probe Accuracy (%) |
|-------------------|-----------|--------------------|---------------------------|
| MoCo v2 + GenView | ResNet-50 | 200                | 70.0                      |
| SwAV + GenView    | ResNet-50 | 200                | 71.7                      |
| SimSiam + GenView | ResNet-50 | 200                | 72.2                      |
| BYOL + GenView    | ResNet-50 | 200                | 73.2                      |
| MoCo v3 + GenView | ResNet-50 | 100                | 72.7                      |
| MoCo v3 + GenView | ResNet-50 | 300                | 74.8                      |
| MoCo v3 + GenView | ViT-S     | 300                | 74.5                      |
| MoCo v3 + GenView | ViT-B     | 300                | 77.8                      |

---

## Citation

If you use these models, please cite the GenView paper:

```bibtex
@inproceedings{li2023genview,
  author={Li, Xiaojie and Yang, Yibo and Li, Xiangtai and Wu, Jianlong and Yu, Yue and Ghanem, Bernard and Zhang, Min},
  title={GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning},
  year={2024},
  booktitle={Proceedings of the European Conference on Computer Vision},
  pages={306--325},
  publisher={Springer}
}
```

---

## How to Download the Model

### **Downloading Models**

Download a model file with either of the options below.

#### Option 1: `wget`

```bash
# Replace {MODEL_FILE} with the specific model file name
wget https://huggingface.co/Xiaojie0903/genview_pretrained_models/resolve/main/{MODEL_FILE}
```

Example:

```bash
wget https://huggingface.co/Xiaojie0903/genview_pretrained_models/resolve/main/mocov3_resnet50_8xb512-amp-coslr-100e_in1k_genview.pth
```
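
Where `wget` is not available, the same URL pattern can be assembled and fetched with the Python standard library. This is a minimal sketch: the helper names `build_resolve_url` and `download_checkpoint` are hypothetical, and the `revision="main"` default simply mirrors the `resolve/main` path in the command above.

```python
from urllib.request import urlretrieve

REPO_ID = "Xiaojie0903/genview_pretrained_models"


def build_resolve_url(filename, repo_id=REPO_ID, revision="main"):
    """Build the direct-download URL for one file in the repository."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_checkpoint(filename, destination=None):
    """Download `filename` and return the local path it was saved to."""
    destination = destination or filename
    urlretrieve(build_resolve_url(filename), destination)
    return destination


if __name__ == "__main__":
    # Only prints the URL; call download_checkpoint(...) to fetch the file.
    print(build_resolve_url("mocov3_resnet50_8xb512-amp-coslr-100e_in1k_genview.pth"))
```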

#### Option 2: Hugging Face Python API

```python
from huggingface_hub import hf_hub_download

# Replace with your desired model file
file_path = hf_hub_download(
    repo_id="Xiaojie0903/genview_pretrained_models",
    filename="mocov3_resnet50_8xb512-amp-coslr-100e_in1k_genview.pth"
)
print(f"Model downloaded to {file_path}")
```
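
After downloading, the backbone weights typically need to be separated from the rest of the checkpoint before they can be loaded into a plain backbone. The sketch below assumes that the `.pth` file is a standard PyTorch checkpoint, that the weights sit under a `state_dict` key, and that backbone parameters carry a `backbone.` prefix. None of this is confirmed by this card, so inspect the keys of your checkpoint first. The helper names `extract_backbone_weights` and `load_backbone_weights` are hypothetical.

```python
from collections import OrderedDict


def extract_backbone_weights(checkpoint, prefix="backbone."):
    """Keep only entries whose key starts with `prefix`, with the prefix stripped.

    Assumption: weights live under checkpoint["state_dict"]; if that key is
    missing, the checkpoint itself is treated as the state dict.
    """
    state_dict = checkpoint.get("state_dict", checkpoint)
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


def load_backbone_weights(file_path, prefix="backbone."):
    """Read a checkpoint from disk and return its backbone weights."""
    import torch  # imported lazily so the helper above works without PyTorch

    checkpoint = torch.load(file_path, map_location="cpu")
    return extract_backbone_weights(checkpoint, prefix)
```

The returned dictionary can then be passed to `load_state_dict` on a matching backbone, for example a ResNet-50, provided the remaining key names agree with that implementation.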