Update README.md

---
license: mit
---

# AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

[Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/)

For more details and results, please check out our [github](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html).

<p align="center">
  <img src=".media/image.png" alt="AnySat Architecture" width="500">
</p>

# Abstract

**AnySat** is a versatile Earth Observation model designed to handle diverse data across resolutions, scales, and modalities. Using a **scale-adaptive joint embedding predictive architecture** (JEPA), AnySat can train in a self-supervised manner on highly heterogeneous datasets.

We train a single AnySat model on **GeoPlex**, a collection of 5 multimodal datasets spanning 11 sensors with varying characteristics. In fine-tuning or linear probing, AnySat achieves SOTA or near-SOTA performance for land cover segmentation, crop type classification, change detection, tree species identification, and flood mapping.

<p align="center">
  <img src=".media/teaser.png" alt="AnySat Teaser" width="500">
</p>

# Key Features

- 🌍 **Versatile Model**: Handles diverse datasets with **3–11 channels** per modality, tiles ranging from **0.3 to 2600 hectares**, and any combination of **11 sensors**.
- 🚀 **Simple to Use**: Install and download AnySat with a single line of code, select your desired modalities and patch size, and immediately generate rich features.
- 🦋 **Flexible Task Adaptation**: Supports fine-tuning and linear probing for tasks like **tile-wise classification** and **semantic segmentation**.
- 🧑‍🎓 **Multi-dataset Training**: Trains a single model across multiple datasets with varying characteristics.

# 🚀 Quickstart

Check out our [demo notebook](demo.ipynb) or [huggingface page](https://huggingface.co/gastruc/anysat) for more details.

## Install and load AnySat

```python
import torch

AnySat = torch.hub.load('gastruc/anysat', 'anysat', pretrained=True, flash_attn=False)
```

Set `flash_attn=True` if you have the [flash-attn](https://pypi.org/project/flash-attn/) module installed. It is not required and only impacts memory usage and speed.
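
As a quick sanity check that the weights were downloaded (assuming, as with most torch.hub entry points, that the returned object is a standard `torch.nn.Module`):

```python
# Count the model parameters to confirm the checkpoint loaded
n_params = sum(p.numel() for p in AnySat.parameters())
print(f"AnySat loaded with {n_params / 1e6:.1f}M parameters")
```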

## Format your data

Arrange your data in a dictionary with any of the following keys:

| Modality      | Description        | Tensor Size | Channels                                     | Resolution |
|---------------|--------------------|-------------|----------------------------------------------|------------|
| aerial        | Single date tensor | Bx4xHxW     | RGB, NiR                                     | 0.2m       |
| aerial-flair  | Single date tensor | Bx5xHxW     | RGB, NiR, Elevation                          | 0.2m       |
| spot          | Single date tensor | Bx3xHxW     | RGB                                          | 1m         |
| naip          | Single date tensor | Bx4xHxW     | RGB, NiR                                     | 1.25m      |
| s2            | Time series tensor | BxTx10xHxW  | B2, B3, B4, B5, B6, B7, B8, B8a, B11, B12    | 10m        |
| s1-asc        | Time series tensor | BxTx2xHxW   | VV, VH                                       | 10m        |
| s1            | Time series tensor | BxTx3xHxW   | VV, VH, Ratio                                | 10m        |
| alos          | Time series tensor | BxTx3xHxW   | HH, HV, Ratio                                | 30m        |
| l7            | Time series tensor | BxTx6xHxW   | B1, B2, B3, B4, B5, B7                       | 30m        |
| l8            | Time series tensor | BxTx11xHxW  | B8, B1, B2, B3, B4, B5, B6, B7, B9, B10, B11 | 10m        |
| modis         | Time series tensor | BxTx7xHxW   | B1, B2, B3, B4, B5, B6, B7                   | 250m       |

Note that each time series requires a `_dates` companion tensor giving the day of the year of each observation: 01/01 = 0, 31/12 = 364.
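
For instance, a minimal sketch (plain PyTorch and the standard library, nothing AnySat-specific) for building such a companion tensor from acquisition dates:

```python
import torch
from datetime import date

def day_of_year(d: date) -> int:
    # Zero-indexed day of the year: 01/01 -> 0, 31/12 -> 364 (365 on leap years)
    return d.timetuple().tm_yday - 1

acquisitions = [date(2024, 1, 15), date(2024, 4, 2), date(2024, 7, 19)]
s2_dates = torch.tensor([[day_of_year(d) for d in acquisitions]])  # shape [B=1, T=3]
```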

**Example Input** for a tile of 60x60 m and a batch size of B:

```python
data = {
    "aerial": ...,    # Tensor of size [B, 4, 300, 300]: 4 channels, 300x300 pixels at 0.2 m resolution
    "spot": ...,      # Tensor of size [B, 3, 60, 60]: 3 channels, 60x60 pixels at 1 m resolution
    "s2": ...,        # Tensor of size [B, 12, 10, 6, 6]: 12 dates, 10 channels, 6x6 pixels at 10 m resolution
    "s2_dates": ...,  # Tensor of size [B, 12]: day of year of each of the 12 dates
}
```

Ensure that the spatial extents are consistent across modalities: for each one, the number of pixels multiplied by the resolution must give the same tile size.
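
For a quick runnable test, the same dictionary can be filled with random tensors (a sketch with dummy data, following the shapes above):

```python
import torch

B, T = 2, 12  # batch size, number of dates
data = {
    "aerial": torch.randn(B, 4, 300, 300),      # 300 px x 0.2 m = 60 m tile
    "spot": torch.randn(B, 3, 60, 60),          # 60 px x 1 m = 60 m tile
    "s2": torch.randn(B, T, 10, 6, 6),          # 6 px x 10 m = 60 m tile
    "s2_dates": torch.randint(0, 365, (B, T)),  # day of year of each date
}
```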

## Extract Features

Decide on:
- **Patch size** (in meters, must be a multiple of 10): adjust it according to the scale of your tiles and your GPU memory. In general, avoid having more than 1024 patches per tile (e.g., a 640x640 m tile with a 20 m patch size already gives (640/20)² = 1024 patches).
- **Output type**: Choose between:
  - `'tile'`: a single vector per tile
  - `'patch'`: a vector per patch
  - `'dense'`: a vector per sub-patch
  - `'all'`: a tuple with all three outputs

The sub-patches are `1x1` pixels for time series and `10x10` pixels for VHR images. If using `output='dense'`, specify the `output_modality`.

Example use:

```python
features = AnySat(data, scale=10, output='tile')   # tensor of size [B, D]
features = AnySat(data, scale=10, output='patch')  # tensor of size [B, D, 6, 6]
features = AnySat(data, scale=20, output='patch')  # tensor of size [B, D, 3, 3]
features = AnySat(data, scale=20, output='dense', output_modality='aerial')  # tensor of size [B, D, 30, 30]
```

**Explanation for the size of the dense map:** 'aerial' has a 0.2 m resolution, so its `10x10`-pixel sub-patches are 2x2 m; a 60x60 m tile therefore yields a 30x30 grid of feature vectors.
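
These features can then be fed to a lightweight task head. For example, a hypothetical linear probe for a patch-wise prediction task (the head and `num_classes` are illustrative, not part of AnySat, and assume the channel-first layout shown above):

```python
import torch.nn as nn

num_classes = 18  # e.g., crop types; illustrative only
features = AnySat(data, scale=10, output='patch')  # [B, D, 6, 6]
probe = nn.Conv2d(features.shape[1], num_classes, kernel_size=1)  # 1x1 conv = per-patch linear layer
logits = probe(features)  # [B, num_classes, 6, 6]
```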

# Advanced Installation

## Install from source

```bash
# Clone the project
git clone https://github.com/gastruc/anysat
cd anysat

# [OPTIONAL] create a conda environment
conda create -n anysat python=3.9
conda activate anysat

# Install requirements
pip install -r requirements.txt

# Create a data folder where you can put your datasets
mkdir data
# Create a logs folder
mkdir logs
```

## Run Locally

To load the model locally, you can use the following code:

```python
from hubconf import AnySat

AnySat = AnySat.from_pretrained('base', flash_attn=False)  # Set flash_attn=True if you have flash-attn installed
# For now, only 'base' is available.
# The model runs on the CPU by default.
```
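
To run on a GPU instead (assuming the returned object is a standard `torch.nn.Module`, as with the torch.hub version):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
AnySat = AnySat.to(device)  # remember to move your input tensors to the same device
```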

Every experiment in the paper has its own config file. Feel free to explore the `configs/exp` folder.

```bash
# Run AnySat pretraining on GeoPlex
python src/train.py exp=GeoPlex_AnySAT

# Run AnySat finetuning on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_FT

# Run AnySat linear probing on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_LP
```

# Supported Datasets

Our implementation already supports 9 datasets:

<p align="center">
  <img src=".media/datasets.png" alt="AnySat Datasets" width="500">
</p>

## GeoPlex Datasets

1. **TreeSatAI-TS**
   - **Description**: Multimodal dataset for tree species identification.
   - **Extent**: 50,381 tiles covering 180 km² with multi-label annotations across 20 classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Tree species classification.

2. **PASTIS-HD**
   - **Description**: Crop mapping dataset with delineated agricultural parcels.
   - **Extent**: 2,433 tiles covering 3,986 km² with annotations across 18 crop types.
   - **Modalities**: SPOT 6/7 VHR imagery (1.5 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Classification, semantic segmentation, panoptic segmentation.

3. **FLAIR**
   - **Description**: Land cover dataset combining VHR aerial imagery with Sentinel-2 time series.
   - **Extent**: 77,762 tiles covering 815 km² with annotations across 13 land cover classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series.
   - **Tasks**: Land cover mapping.

4. **PLANTED**
   - **Description**: Global forest dataset for tree species identification.
   - **Extent**: 1,346,662 tiles covering 33,120 km² with annotations across 40 classes.
   - **Modalities**: Sentinel-2, Landsat-7, MODIS, Sentinel-1, ALOS-2.
   - **Tasks**: Tree species classification.

5. **S2NAIP-URBAN**
   - **Description**: Urban dataset with high-resolution imagery and time series data.
   - **Extent**: 515,270 tiles covering 211,063 km² with NAIP, Sentinel-2, Sentinel-1, and Landsat-8/9 data.
   - **Modalities**: NAIP (1.25 m), Sentinel-2 time series, Sentinel-1 time series, Landsat-8/9.
   - **Tasks**: Pretraining only (no official labels).

## External Evaluation Datasets

1. **BraDD-S1TS**
   - **Description**: Change detection dataset for deforestation in the Amazon rainforest.
   - **Extent**: 13,234 tiles with Sentinel-1 time series.
   - **Tasks**: Change detection (deforestation segmentation).

2. **SICKLE**
   - **Description**: Multimodal crop mapping dataset from India.
   - **Extent**: 34,848 tiles with Sentinel-1, Sentinel-2, and Landsat-8 time series.
   - **Tasks**: Crop type classification (paddy / non-paddy).

3. **TimeSen2Crop**
   - **Description**: Crop mapping dataset from Slovenia.
   - **Extent**: 1,212,224 single-pixel Sentinel-2 time series.
   - **Tasks**: Crop type classification.

4. **Sen1Flood11**
   - **Description**: Flood mapping dataset with global scope.
   - **Extent**: 4.8K Sentinel-1/2 time series.
   - **Tasks**: Flood classification (flooded / not flooded).

# Reference

Please use the following bibtex:

```bibtex
@article{astruc2024anysat,
  title={{AnySat}: An Earth Observation Model for Any Resolutions, Scales, and Modalities},
  author={Astruc, Guillaume and Gonthier, Nicolas and Mallet, Clement and Landrieu, Loic},
  journal={arXiv preprint arXiv:2412.XXXX},
  year={2024}
}
```

# Acknowledgements

- The code builds on the same codebase as [OmniSat](https://github.com/gastruc/OmniSat)
- The JEPA implementation comes from [I-JEPA](https://github.com/facebookresearch/ijepa)
- The code for the Pangaea datasets comes from [Pangaea](https://github.com/VMarsocci/pangaea-bench)