---
license: mit
---
# AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

[Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/)

For more details and results, please check out our [GitHub repository](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html).

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/m1IY9HfCD8NAeykZxqWSb.png" alt="AnySat Architecture" width="500">
</p>
# Abstract

**AnySat** is a versatile Earth Observation model designed to handle diverse data across resolutions, scales, and modalities. Using a **scale-adaptive joint embedding predictive architecture (JEPA)**, AnySat can be trained in a self-supervised manner on highly heterogeneous datasets.

We train a single AnySat model on **GeoPlex**, a collection of 5 multimodal datasets spanning 11 sensors with varying characteristics. With fine-tuning or linear probing, AnySat achieves SOTA or near-SOTA performance on land cover segmentation, crop type classification, change detection, tree species identification, and flood mapping.

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/mENiGjg5gfKH27vqr8xuB.png" alt="AnySat Teaser" width="500">
</p>
# Key Features

- 🌍 **Versatile Model**: Handles diverse datasets with resolutions from **0.2 m to 250 m**, **3–11 channels**, tiles ranging from **0.3 to 2600 hectares**, and any combination of **11 sensors**.
- 🚀 **Simple to Use**: Install and download AnySat with a single line of code, select your desired modalities and patch size, and immediately generate rich features.
- 🦋 **Flexible Task Adaptation**: Supports fine-tuning and linear probing for tasks like **tile-wise classification** and **semantic segmentation**.
- 🧑‍🎓 **Multi-dataset Training**: Trains a single model across multiple datasets with varying characteristics.
# 🚀 Quickstart

Check out our [demo notebook](https://github.com/gastruc/AnySat/blob/main/demo.ipynb) or [Hugging Face page](https://huggingface.co/gastruc/anysat) for more details.

## Install and load AnySat

```python
import torch

AnySat = torch.hub.load('gastruc/anysat', 'anysat', pretrained=True, flash_attn=False)
```

Set `flash_attn=True` if you have the [flash-attn](https://pypi.org/project/flash-attn/) package installed. It is not required and only affects memory usage and speed.
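
As a quick sanity check that the checkpoint loaded, you can count the model's parameters (generic PyTorch, not an AnySat-specific API):

```python
# Count the parameters of the loaded model
n_params = sum(p.numel() for p in AnySat.parameters())
print(f"AnySat loaded with {n_params / 1e6:.1f}M parameters")
```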

## Format your data

Arrange your data in a dictionary with any of the following keys:

| Key          | Description        | Tensor Size | Channels                                     | Resolution |
|--------------|--------------------|-------------|----------------------------------------------|------------|
| aerial       | Single-date tensor | Bx4xHxW     | RGB, NiR                                     | 0.2 m      |
| aerial-flair | Single-date tensor | Bx5xHxW     | RGB, NiR, Elevation                          | 0.2 m      |
| spot         | Single-date tensor | Bx3xHxW     | RGB                                          | 1 m        |
| naip         | Single-date tensor | Bx4xHxW     | RGB, NiR                                     | 1.25 m     |
| s2           | Time series tensor | BxTx10xHxW  | B2, B3, B4, B5, B6, B7, B8, B8a, B11, B12    | 10 m       |
| s1-asc       | Time series tensor | BxTx2xHxW   | VV, VH                                       | 10 m       |
| s1           | Time series tensor | BxTx3xHxW   | VV, VH, Ratio                                | 10 m       |
| alos         | Time series tensor | BxTx3xHxW   | HH, HV, Ratio                                | 30 m       |
| l7           | Time series tensor | BxTx6xHxW   | B1, B2, B3, B4, B5, B7                       | 30 m       |
| l8           | Time series tensor | BxTx11xHxW  | B8, B1, B2, B3, B4, B5, B6, B7, B9, B10, B11 | 10 m       |
| modis        | Time series tensor | BxTx7xHxW   | B1, B2, B3, B4, B5, B6, B7                   | 250 m      |

Note that each time series requires a companion `_dates` tensor containing the day of the year of each acquisition: 01/01 = 0, 31/12 = 364.
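
For instance, here is a minimal sketch of building such a tensor with the standard library (the variable names are illustrative, not part of the AnySat API):

```python
import torch
from datetime import date

# Acquisition dates of a 3-date Sentinel-2 series for one tile (illustrative values)
acquisitions = [date(2024, 1, 15), date(2024, 6, 1), date(2024, 11, 20)]

# Day of year, shifted so that 01/01 = 0
days = [d.timetuple().tm_yday - 1 for d in acquisitions]

s2_dates = torch.tensor([days])  # shape [1, 3]: batch size 1, T = 3 dates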

**Example Input** for a 60x60 m tile and a batch size of B:

```python
data = {
    "aerial": ...,  # tensor of size [B, 4, 300, 300]: 4 channels, 300x300 pixels at 0.2 m resolution
    "spot": ...,  # tensor of size [B, 3, 60, 60]: 3 channels, 60x60 pixels at 1 m resolution
    "s2": ...,  # tensor of size [B, 12, 10, 6, 6]: 12 dates, 10 channels, 6x6 pixels at 10 m resolution
    "s2_dates": ...,  # tensor of size [B, 12]: one day-of-year value per date
}
```

Ensure that the spatial extent is consistent across modalities: the pixel dimensions of each modality multiplied by its resolution must yield the same ground footprint (here, 300 x 0.2 m = 60 x 1 m = 6 x 10 m = 60 m).
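
To smoke-test the pipeline end to end, you can fill this dictionary with random tensors of the right shapes (a sketch; the values are meaningless and only the shapes matter):

```python
import torch

B = 2  # batch size

data = {
    "aerial": torch.randn(B, 4, 300, 300),       # 300 px x 0.2 m = 60 m extent
    "spot": torch.randn(B, 3, 60, 60),           # 60 px x 1 m = 60 m extent
    "s2": torch.randn(B, 12, 10, 6, 6),          # 6 px x 10 m = 60 m extent
    "s2_dates": torch.randint(0, 365, (B, 12)),  # day of year for each of the 12 dates
}
```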

## Extract Features

Decide on:

- **Patch size** (in meters, passed as `scale`; must be a multiple of 10): adjust it according to the scale of your tiles and your GPU memory. In general, avoid having more than 1024 patches per tile.
- **Output type**: choose between:
  - `'tile'`: a single vector per tile
  - `'patch'`: a vector per patch
  - `'dense'`: a vector per sub-patch
  - `'all'`: a tuple with all three outputs

The sub-patches are `1x1` pixels for time series and `10x10` pixels for VHR images. If using `output='dense'`, specify the `output_modality`.

Example use:

```python
features = AnySat(data, scale=10, output='tile')  # tensor of size [B, D]
features = AnySat(data, scale=10, output='patch')  # tensor of size [B, D, 6, 6]
features = AnySat(data, scale=20, output='patch')  # tensor of size [B, D, 3, 3]
features = AnySat(data, scale=20, output='dense', output_modality='aerial')  # tensor of size [B, D, 30, 30]
```

**Explanation of the dense map size:** the sub-patches of `'aerial'` are 10x10 pixels; at its 0.2 m resolution, each sub-patch covers 2x2 m, so the 60x60 m tile yields a 30x30 dense feature map.
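
These features can feed a lightweight downstream head for linear probing. Below is a minimal sketch (the 1x1-conv head, the 18-class setup, and the dummy labels are illustrative assumptions, not part of AnySat):

```python
import torch
import torch.nn as nn

# Patch-level features from the frozen encoder (shape assumed [B, D, 6, 6])
with torch.no_grad():
    features = AnySat(data, scale=10, output='patch')

B, D = features.shape[0], features.shape[1]

# A hypothetical linear head for 18-class patch-wise classification
head = nn.Conv2d(D, 18, kernel_size=1)
logits = head(features)  # [B, 18, 6, 6]

labels = torch.randint(0, 18, (B, 6, 6))  # dummy patch-level labels
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()  # gradients flow only into the head, since features were computed without grad
```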

# Advanced Installation

## Install from source

```bash
# Clone the project
git clone https://github.com/gastruc/anysat
cd anysat

# [OPTIONAL] Create a conda environment
conda create -n anysat python=3.9
conda activate anysat

# Install the requirements
pip install -r requirements.txt

# Create a data folder where you can put your datasets
mkdir data

# Create a logs folder
mkdir logs
```

## Run Locally

To load the model locally, you can use the following code:

```python
from hubconf import AnySat

AnySat = AnySat.from_pretrained('base', flash_attn=False)  # Set flash_attn=True if you have the flash-attn module installed
# For now, only the 'base' model is available.
# device = "cuda"  # if you want to run on GPU; the default is CPU
```
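
If you do run on GPU, move both the model and every tensor of the data dictionary to the same device (a generic PyTorch sketch, reusing the `data` dictionary from the Quickstart):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
AnySat = AnySat.to(device)
data = {k: v.to(device) for k, v in data.items()}
```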

Every experiment in the paper has its own config file. Feel free to explore the `configs/exp` folder.

```bash
# Run AnySat pretraining on GeoPlex
python src/train.py exp=GeoPlex_AnySAT

# Run AnySat finetuning on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_FT

# Run AnySat linear probing on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_LP
```

# Supported Datasets

Our implementation already supports 9 datasets:

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/nGGz8kiDdeTJqIPSdrlSz.png" alt="AnySat Datasets" width="500">
</p>

## GeoPlex Datasets
1. **TreeSatAI-TS**
   - **Description**: Multimodal dataset for tree species identification.
   - **Extent**: 50,381 tiles covering 180 km² with multi-label annotations across 20 classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Tree species classification.

2. **PASTIS-HD**
   - **Description**: Crop mapping dataset with delineated agricultural parcels.
   - **Extent**: 2,433 tiles covering 3,986 km² with annotations across 18 crop types.
   - **Modalities**: SPOT6/7 VHR imagery (1.5 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Classification, semantic segmentation, panoptic segmentation.

3. **FLAIR**
   - **Description**: Land cover dataset combining VHR aerial imagery with Sentinel-2 time series.
   - **Extent**: 77,762 tiles covering 815 km² with annotations across 13 land cover classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series.
   - **Tasks**: Land cover mapping.

4. **PLANTED**
   - **Description**: Global forest dataset for tree species identification.
   - **Extent**: 1,346,662 tiles covering 33,120 km² with annotations across 40 classes.
   - **Modalities**: Sentinel-2, Landsat-7, MODIS, Sentinel-1, ALOS-2.
   - **Tasks**: Tree species classification.

5. **S2NAIP-URBAN**
   - **Description**: Urban dataset with high-resolution imagery and time series data.
   - **Extent**: 515,270 tiles covering 211,063 km² with NAIP, Sentinel-2, Sentinel-1, and Landsat-8/9 data.
   - **Modalities**: NAIP (1.25 m), Sentinel-2 time series, Sentinel-1 time series, Landsat-8/9.
   - **Tasks**: Pretraining only (no official labels).

## External Evaluation Datasets

1. **BraDD-S1TS**
   - **Description**: Change detection dataset for deforestation in the Amazon rainforest.
   - **Extent**: 13,234 tiles with Sentinel-1 time series.
   - **Tasks**: Change detection (deforestation segmentation).

2. **SICKLE**
   - **Description**: Multimodal crop mapping dataset from India.
   - **Extent**: 34,848 tiles with Sentinel-1, Sentinel-2, and Landsat-8 time series.
   - **Tasks**: Crop type classification (paddy / non-paddy).

3. **TimeSen2Crop**
   - **Description**: Crop mapping dataset from Slovenia.
   - **Extent**: 1,212,224 single-pixel Sentinel-2 time series.
   - **Tasks**: Crop type classification.

4. **Sen1Floods11**
   - **Description**: Flood mapping dataset with global scope.
   - **Extent**: 4.8K Sentinel-1/2 time series.
   - **Tasks**: Flood classification (flooded / not flooded).

# Reference

Please use the following BibTeX entry:

```bibtex
@article{astruc2024anysat,
  title={AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities},
  author={Astruc, Guillaume and Gonthier, Nicolas and Mallet, Clement and Landrieu, Loic},
  journal={arXiv preprint arXiv:2412.14123},
  year={2024}
}
```