g-astruc committed on
Commit 48b72b5
1 Parent(s): f56149c

Update README.md

Files changed (1)
  1. README.md +176 -31
README.md CHANGED
@@ -2,80 +2,225 @@
  license: mit
  ---

- # AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities (arXiv 2024)

  [Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/)

  <p align="center">
- <img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/Jh9eOnMePFiL84TOzhe86.png" alt="image/png" width="600" height="300">
  </p>

- ## Abstract

- We introduce AnySat: a JEPA-based multimodal Earth Observation model that trains simultaneously on diverse datasets with different scales, resolutions (spatial, spectral, temporal), and modality combinations.

- For more details and results, please check out our [github](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html).

  <p align="center">
- <img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/2tc0cFdOF2V0_KgptA-qV.png" alt="image/png" width="400" height="200">
  </p>

- ### Inference 🔥

- To load our pretrained models, you can run:

  ```python
- from models.huggingface import AnySat

- # Code to use the pretrained weights
- model = AnySat(size="base", pretrained=True)  # "small" and "tiny" variants also exist
  ```

- To get features from an observation or a batch of observations, provide the model with a dictionary whose keys come from the following list:
  | Dataset | Description | Tensor Size | Channels | Resolution |
  |---------------|-------------------|-------------|-------------------------------------------|------------|
  | aerial | Single date tensor | Bx4xHxW | RGB, NiR | 0.2m |
  | aerial-flair | Single date tensor | Bx5xHxW | RGB, NiR, Elevation | 0.2m |
  | spot | Single date tensor | Bx3xHxW | RGB | 1m |
- | naip | Single date tensor | Bx4xHxW | RGB, NiR | 1.25m |
- | s2 | Time series tensor | BxTx10xHxW | B2, B3, B4, B5, B6, B7, B8, B8a, B11, B12 | 10m |
- | s1-asc | Time series tensor | BxTx2xHxW | VV, VH | 10m |
  | s1 | Time series tensor | BxTx3xHxW | VV, VH, Ratio | 10m |
  | alos | Time series tensor | BxTx3xHxW | HH, HV, Ratio | 30m |
  | l7 | Time series tensor | BxTx6xHxW | B1, B2, B3, B4, B5, B7 | 30m |
- | l8 | Time series tensor | BxTx11xHxW | B8, B1, B2, B3, B4, B5, B6, B7, B9, B10, B11 | 10m |
  | modis | Time series tensor | BxTx7xHxW | B1, B2, B3, B4, B5, B6, B7 | 250m |

- Each time series key requires a companion "{key}_dates" tensor (for example "s2_dates") of size BxT, whose integer values give the day of the year of each acquisition.
- You must then choose the scale at which to produce features. The scale argument is in meters and represents the desired patch size.
- The output is the concatenation of a class token and a flattened feature map in which each feature encodes a scale x scale zone.
- The scale should divide the spatial extent of all modalities and be a multiple of 10.
- Then, you can run:

  ```python
- features = AnySat(data, scale=scale)  # scale is the patch size in meters
  ```
 
- You can then apply these features to the desired downstream task!

- If you want a feature map at the density of a specific modality, you can specify:

  ```python
- features = AnySat(data, scale=scale, keep_subpatch=True, modality_keep=modality)  # modality is the name of the desired modality
  ```

- Note that the features will then have size 2*D. If several modalities share the desired resolution, pick the most informative one (or modify the code to concatenate the other modalities as well).

- Example use of AnySat:
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/_x2ng-3c0jvLIP3R5WEwA.png)

- To reproduce results, add new modalities, or run more experiments, see the full code on [github](https://github.com/gastruc/AnySat).

- ### Citing 💫

- ```bibtex
  ```

  license: mit
  ---

+ # AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

  [Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/)

+ For more details and results, please check out our [github](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html).
+
  <p align="center">
+ <img src=".media/image.png" alt="AnySat Architecture" width="500">
  </p>

+ # Abstract

+ **AnySat** is a versatile Earth Observation model designed to handle diverse data across resolutions, scales, and modalities. Using a **scale-adaptive joint embedding predictive architecture (JEPA)**, AnySat can be trained in a self-supervised manner on highly heterogeneous datasets.
+
+ We train a single AnySat model on **GeoPlex**, a collection of 5 multimodal datasets spanning 11 sensors with varying characteristics. With fine-tuning or linear probing, AnySat achieves SOTA or near-SOTA performance on land cover segmentation, crop type classification, change detection, tree species identification, and flood mapping.

  <p align="center">
+ <img src=".media/teaser.png" alt="AnySat Teaser" width="500">
  </p>

+ # Key Features
+ - 🌍 **Versatile Model**: Handles diverse datasets spanning **3–11 channels**, tiles ranging from **0.3 to 2600 hectares**, and any combination of **11 sensors**.
+ - 🚀 **Simple to Use**: Install and download AnySat with a single line of code, select your desired modalities and patch size, and immediately generate rich features.
+ - 🦋 **Flexible Task Adaptation**: Supports fine-tuning and linear probing for tasks like **tile-wise classification** and **semantic segmentation**.
+ - 🧑‍🎓 **Multi-dataset Training**: Trains a single model across multiple datasets with varying characteristics.

+ # 🚀 Quickstart
+
+ Check out our [demo notebook](demo.ipynb) or [Hugging Face page](https://huggingface.co/gastruc/anysat) for more details.
+
+ ## Install and load AnySat

  ```python
+ import torch
+
+ AnySat = torch.hub.load('gastruc/anysat', 'anysat', pretrained=True, flash_attn=False)
  ```
+ Set `flash_attn=True` if you have the [flash-attn](https://pypi.org/project/flash-attn/) module installed. It is not required and only affects memory usage and speed.
+
+ ## Format your data
+
+ Arrange your data in a dictionary with any of the following keys:

  | Dataset | Description | Tensor Size | Channels | Resolution |
  |---------------|-------------------|-------------|-------------------------------------------|------------|
  | aerial | Single date tensor | Bx4xHxW | RGB, NiR | 0.2m |
  | aerial-flair | Single date tensor | Bx5xHxW | RGB, NiR, Elevation | 0.2m |
  | spot | Single date tensor | Bx3xHxW | RGB | 1m |
+ | naip | Single date tensor | Bx4xHxW | RGB, NiR | 1.25m |
+ | s2 | Time series tensor | BxTx10xHxW | B2, B3, B4, B5, B6, B7, B8, B8a, B11, B12 | 10m |
+ | s1-asc | Time series tensor | BxTx2xHxW | VV, VH | 10m |
  | s1 | Time series tensor | BxTx3xHxW | VV, VH, Ratio | 10m |
  | alos | Time series tensor | BxTx3xHxW | HH, HV, Ratio | 30m |
  | l7 | Time series tensor | BxTx6xHxW | B1, B2, B3, B4, B5, B7 | 30m |
+ | l8 | Time series tensor | BxTx11xHxW | B8, B1, B2, B3, B4, B5, B6, B7, B9, B10, B11 | 10m |
  | modis | Time series tensor | BxTx7xHxW | B1, B2, B3, B4, B5, B6, B7 | 250m |
+
+ Note that each time series requires a `_dates` companion tensor containing the day of the year as an integer: January 1 = 0, December 31 = 364.
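+
+ If your acquisitions come as calendar dates, here is a minimal sketch for building this companion tensor (it uses only the standard `datetime` module and `torch`; the `day_of_year` helper is our own illustration, not part of the AnySat API):
+
+ ```python
+ import torch
+ from datetime import date
+
+ def day_of_year(d: date) -> int:
+     # January 1 maps to 0, December 31 to 364 (365 in leap years)
+     return d.timetuple().tm_yday - 1
+
+ acquisitions = [date(2024, 1, 15), date(2024, 4, 2), date(2024, 7, 19)]
+ s2_dates = torch.tensor([day_of_year(d) for d in acquisitions]).unsqueeze(0)  # size [1, T]
+ ```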
+
+ **Example input** for a 60x60 m tile and a batch size of B:
+
+ ```python
+ data = {
+     "aerial": ...,    # tensor of size [B, 4, 300, 300]: 4 channels, 300x300 pixels at 0.2 m resolution
+     "spot": ...,      # tensor of size [B, 3, 60, 60]: 3 channels, 60x60 pixels at 1 m resolution
+     "s2": ...,        # tensor of size [B, 12, 10, 6, 6]: 12 dates, 10 channels, 6x6 pixels at 10 m resolution
+     "s2_dates": ...,  # tensor of size [B, 12]: one day-of-year value per date
+ }
+ ```
+ Ensure that the spatial extents are consistent across modalities: the pixel count of each modality multiplied by its resolution must give the same tile size.
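+
+ As a self-contained sanity check, the same example can be instantiated with random tensors (a sketch only: random values are enough to verify shapes, while real data should be preprocessed as the model expects):
+
+ ```python
+ import torch
+
+ B, T = 2, 12  # batch size and number of Sentinel-2 dates
+ data = {
+     "aerial": torch.rand(B, 4, 300, 300),       # 60x60 m tile at 0.2 m resolution
+     "spot": torch.rand(B, 3, 60, 60),           # 60x60 m tile at 1 m resolution
+     "s2": torch.rand(B, T, 10, 6, 6),           # 60x60 m tile at 10 m resolution
+     "s2_dates": torch.randint(0, 365, (B, T)),  # one day-of-year value per date
+ }
+ ```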

+ ## Extract Features

+ Decide on:
+ - **Patch size** (in meters, must be a multiple of 10): adjust it to the scale of your tiles and your GPU memory. In general, avoid having more than 1024 patches per tile.
+ - **Output type**: choose between:
+   - `'tile'`: a single vector per tile
+   - `'patch'`: a vector per patch
+   - `'dense'`: a vector per sub-patch
+   - `'all'`: a tuple with all three outputs
+
+ Sub-patches are `1x1` pixel for time series and `10x10` pixels for VHR images. If using `output='dense'`, also specify the `output_modality`.

+ Example use:
  ```python
+ features = AnySat(data, scale=10, output='tile')   # tensor of size [D,]
+ features = AnySat(data, scale=10, output='patch')  # tensor of size [D,6,6]
+ features = AnySat(data, scale=20, output='patch')  # tensor of size [D,3,3]
+ features = AnySat(data, scale=20, output='dense', output_modality='aerial')  # tensor of size [D,30,30]
  ```
+ **Explanation of the dense map size:** sub-patches of VHR images are 10x10 pixels; since 'aerial' has a 0.2 m resolution, each sub-patch covers 2x2 m, so the 60x60 m tile yields a 30x30 feature map.
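+
+ To illustrate the linear-probing workflow mentioned in the key features, here is a minimal sketch of a probe on tile-level features (our illustration, not repository code; the feature dimension `D`, the class count, and the labels are placeholders, and we assume batched tile features of shape `[B, D]`):
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ D, num_classes = 768, 10  # hypothetical feature dimension and label count
+ probe = nn.Linear(D, num_classes)
+
+ with torch.no_grad():  # keep the AnySat backbone frozen for linear probing
+     features = AnySat(data, scale=10, output='tile')  # assumed shape [B, D]
+
+ labels = torch.randint(0, num_classes, (features.shape[0],))  # placeholder labels
+ logits = probe(features)
+ loss = nn.functional.cross_entropy(logits, labels)
+ ```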

+ # Advanced Installation

+ ## Install from source

+ ```bash
+ # Clone the project
+ git clone https://github.com/gastruc/anysat
+ cd anysat
+
+ # [OPTIONAL] Create a conda environment
+ conda create -n anysat python=3.9
+ conda activate anysat
+
+ # Install requirements
+ pip install -r requirements.txt
+
+ # Create a data folder where you can put your datasets
+ mkdir data
+ # Create a logs folder
+ mkdir logs
+ ```

+ ## Run Locally

+ To load the model locally, you can use the following code:
  ```python
+ from hubconf import AnySat
+
+ AnySat = AnySat.from_pretrained('base', flash_attn=False)
+ # Set flash_attn=True if you have the flash-attn module installed.
+ # For now, only the 'base' model is available.
+ # Pass device="cuda" if you want to run on GPU; the default is CPU.
  ```

+ Every experiment in the paper has its own config file; feel free to explore the `configs/exp` folder.

+ ```bash
+ # Run AnySat pretraining on GeoPlex
+ python src/train.py exp=GeoPlex_AnySAT
+
+ # Run AnySat finetuning on BraDD-S1TS
+ python src/train.py exp=BraDD_AnySAT_FT
+
+ # Run AnySat linear probing on BraDD-S1TS
+ python src/train.py exp=BraDD_AnySAT_LP
+ ```

+ # Supported Datasets

+ Our implementation already supports 9 datasets:

+ <p align="center">
+ <img src=".media/datasets.png" alt="AnySat Datasets" width="500">
+ </p>

+ ## GeoPlex Datasets
+
+ 1. **TreeSatAI-TS**
+    - **Description**: Multimodal dataset for tree species identification.
+    - **Extent**: 50,381 tiles covering 180 km² with multi-label annotations across 20 classes.
+    - **Modalities**: VHR images (0.2 m), Sentinel-2 time series, Sentinel-1 time series.
+    - **Tasks**: Tree species classification.
+
+ 2. **PASTIS-HD**
+    - **Description**: Crop mapping dataset with delineated agricultural parcels.
+    - **Extent**: 2,433 tiles covering 3,986 km² with annotations across 18 crop types.
+    - **Modalities**: SPOT6/7 VHR imagery (1.5 m), Sentinel-2 time series, Sentinel-1 time series.
+    - **Tasks**: Classification, semantic segmentation, panoptic segmentation.
+
+ 3. **FLAIR**
+    - **Description**: Land cover dataset combining VHR aerial imagery with Sentinel-2 time series.
+    - **Extent**: 77,762 tiles covering 815 km² with annotations across 13 land cover classes.
+    - **Modalities**: VHR images (0.2 m), Sentinel-2 time series.
+    - **Tasks**: Land cover mapping.
+
+ 4. **PLANTED**
+    - **Description**: Global forest dataset for tree species identification.
+    - **Extent**: 1,346,662 tiles covering 33,120 km² with annotations across 40 classes.
+    - **Modalities**: Sentinel-2, Landsat-7, MODIS, Sentinel-1, ALOS-2.
+    - **Tasks**: Tree species classification.
+
+ 5. **S2NAIP-URBAN**
+    - **Description**: Urban dataset with high-resolution imagery and time series data.
+    - **Extent**: 515,270 tiles covering 211,063 km² with NAIP, Sentinel-2, Sentinel-1, and Landsat-8/9 data.
+    - **Modalities**: NAIP (1.25 m), Sentinel-2 time series, Sentinel-1 time series, Landsat-8/9.
+    - **Tasks**: Pretraining only (no official labels).
+
+ ## External Evaluation Datasets
+
+ 1. **BraDD-S1TS**
+    - **Description**: Change detection dataset for deforestation in the Amazon rainforest.
+    - **Extent**: 13,234 tiles with Sentinel-1 time series.
+    - **Tasks**: Change detection (deforestation segmentation).
+
+ 2. **SICKLE**
+    - **Description**: Multimodal crop mapping dataset from India.
+    - **Extent**: 34,848 tiles with Sentinel-1, Sentinel-2, and Landsat-8 time series.
+    - **Tasks**: Crop type classification (paddy/non-paddy).
+
+ 3. **TimeSen2Crop**
+    - **Description**: Crop mapping dataset from Slovenia.
+    - **Extent**: 1,212,224 single-pixel Sentinel-2 time series.
+    - **Tasks**: Crop type classification.
+
+ 4. **Sen1Flood11**
+    - **Description**: Flood mapping dataset with global scope.
+    - **Extent**: 4.8K Sentinel-1/2 time series.
+    - **Tasks**: Flood classification (flooded / not flooded).
+
+ # Reference
+
+ Please use the following bibtex:
+ ```bibtex
+ @article{astruc2024anysat,
+   title={{AnySat}: An Earth Observation Model for Any Resolutions, Scales, and Modalities},
+   author={Astruc, Guillaume and Gonthier, Nicolas and Mallet, Clement and Landrieu, Loic},
+   journal={arXiv preprint arXiv:2412.XXXX},
+   year={2024}
+ }
  ```

+ # Acknowledgements
+ - The code builds on the same base as [OmniSat](https://github.com/gastruc/OmniSat).
+ - The JEPA implementation comes from [I-JEPA](https://github.com/facebookresearch/ijepa).
+ - The code for the Pangaea datasets comes from [Pangaea](https://github.com/VMarsocci/pangaea-bench).