---
license: cc-by-4.0
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This model card aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** CC BY 4.0 (`cc-by-4.0`)
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/zhu-xlab/DOFA
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

---

Table 1: Linear probing results on six classification tasks. All models are trained for 50 epochs. Reported numbers are top-1 overall accuracy (OA). A dash (-) marks a missing value, where the model could not be adapted to that domain.

| Method | Backbone | m-bigearthnet | m-forestnet | m-brick-kiln | m-pv4ger | m-so2sat | m-eurosat |
|--------|----------|---------------|-------------|--------------|----------|----------|-----------|
| **Fully Trained** | ViT-S | 66.0 | 53.8 | 98.1 | 97.6 | 57.5 | 97.3 |
| **Fully Trained** | SwinV2-T | 70.0 | 58.0 | 98.7 | 98.0 | 56.1 | 97.4 |
| **Fully Trained** | ConvNeXt-B | 69.1 | 56.8 | 98.9 | 98.0 | 58.1 | 97.7 |
| **rand. init.** | ViT-B | 52.9 | 41.5 | 84.5 | 91.3 | 38.3 | 85.7 |
| **MAE_Single [44]** | ViT-B | 63.6 | - | 88.9 | 92.2 | 50.0 | 88.9 |
| **OFA-Net [43]** | ViT-B | 65.0 | - | 94.7 | 93.2 | 49.4 | 91.9 |
| **SatMAE [25]** | ViT-B | 62.1 | - | 93.9 | - | 46.9 | 86.4 |
| **Scale-MAE [22]** | ViT-L | - | - | - | 96.9 | - | - |
| **GFM [21]** | Swin-B | - | - | - | 96.8 | - | - |
| **Cross-Scale MAE [23]** | ViT-B | - | - | - | 93.1 | - | - |
| **FG-MAE [24]** | ViT-B | 63.0 | - | 94.7 | - | 51.4 | 87.0 |
| **CROMA [27]** | ViT-B | 67.4 | - | 91.0 | - | 49.2 | 90.1 |
| **DOFA** | ViT-B | 65.7 | 50.9 | 95.8 | 96.9 | 55.1 | 93.9 |
| **DOFA** | ViT-L | **67.5** | **54.6** | **96.9** | **97.3** | **60.1** | **97.1** |
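
The top-1 overall accuracy reported in Table 1 can be illustrated with a short sketch. This is not the DOFA evaluation code: the function name `overall_accuracy` and the plain list-of-scores input format are assumptions made for this example, chosen so the snippet runs with the Python standard library alone.

```python
def overall_accuracy(scores, labels):
    """Return top-1 overall accuracy (OA) in percent.

    scores: one list of per-class scores per sample (hypothetical format).
    labels: one integer class index per sample.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")

    correct = 0
    for sample_scores, label in zip(scores, labels):
        # Top-1 prediction: index of the highest-scoring class.
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        correct += int(predicted == label)

    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    example_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    example_labels = [1, 0, 0, 0]
    print(overall_accuracy(example_scores, example_labels))
```

In a linear-probing setup, the scores would come from a linear classifier trained on top of frozen backbone features, and the accuracy would be computed over the whole test split.
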

Table 2: Partial fine-tuning results on six segmentation tasks. All models are trained with a frozen backbone for 20 epochs. Reported numbers are mean intersection over union (mIoU). A dash (-) marks a missing value, where the model could not be adapted to that domain.

| Method | Backbone | m-pv4ger-seg | m-nz-cattle | m-NeonTree | m-cashew-plant | m-SA-crop | m-chesapeake |
|--------|----------|--------------|-------------|------------|----------------|-----------|--------------|
| **DeepLabv3** | ResNet101 | 93.4 | 67.6 | 53.9 | 48.6 | 30.4 | 62.1 |
| **U-Net** | ResNet101 | 94.1 | 80.5 | 56.6 | 46.6 | 29.9 | 70.8 |
| **rand. init.** | ViT-B | 81.7 | 74.1 | 51.7 | 32.4 | 29.0 | 47.1 |
| **MAE_Single [44]** | ViT-B | 88.4 | 76.4 | 53.0 | 40.7 | 30.7 | 51.9 |
| **OFA-Net [43]** | ViT-B | 89.4 | 77.6 | 53.3 | 47.9 | 31.9 | 54.5 |
| **Scale-MAE [22]** | ViT-L | 83.5 | 76.5 | 51.0 | - | - | 61.0 |
| **GFM [21]** | Swin-B | 92.0 | 75.0 | 51.1 | - | - | 63.8 |
| **Cross-Scale MAE [23]** | ViT-B | 83.2 | 77.9 | 52.1 | - | - | 52.3 |
| **CROMA [27]** | ViT-B | - | - | - | 30.1 | 31.4 | - |
| **FG-MAE [24]** | ViT-B | - | - | - | 40.8 | 30.6 | - |
| **DOFA** | ViT-B | 94.5 | 81.4 | 58.8 | 51.5 | **33.0** | 65.3 |
| **DOFA** | ViT-L | **95.0** | **81.8** | **59.4** | **56.9** | 32.1 | **66.3** |
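
The mIoU metric reported in Table 2 can be illustrated in the same spirit. This is a minimal sketch, not the evaluation code behind the table: the function name `mean_iou`, the flattened per-pixel label lists, and the choice to skip classes that appear in neither prediction nor ground truth are all assumptions of this example.

```python
def mean_iou(pred, target, num_classes):
    """Return mean intersection over union (mIoU) in percent.

    pred, target: flattened per-pixel class indices (hypothetical format).
    Classes absent from both pred and target are skipped (an assumption).
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union of its class.
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of the predicted and the true class.
            union[p] += 1
            union[t] += 1

    ious = [100.0 * i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

For a full test set, the intersection and union counts would normally be accumulated over all images before dividing, rather than averaging per-image scores.
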

---

## Uses

See the [DOFA GitHub repository](https://github.com/zhu-xlab/DOFA) for more details.