# AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

This repository contains the official implementation of the following paper:
> **AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation**
> [Zhen Li](https://paper99.github.io/)\*, [Zuo-Liang Zhu](https://nk-cs-zzl.github.io/)\*, [Ling-Hao Han](https://scholar.google.com/citations?user=0ooNdgUAAAAJ&hl=en), [Qibin Hou](https://scholar.google.com/citations?hl=en&user=fF8OFV8AAAAJ&view_op=list_works), [Chun-Le Guo](https://scholar.google.com/citations?hl=en&user=RZLYwR0AAAAJ), [Ming-Ming Cheng](https://mmcheng.net/cmm)
> (\* denotes equal contribution)
> Nankai University
> In CVPR 2023
[[Paper](https://arxiv.org/abs/2304.09790)] [[Project Page](https://nk-cs-zzl.github.io/projects/amt/index.html)] [[Web demos](#web-demos)] [Video]

AMT is a **lightweight, fast, and accurate** algorithm for frame interpolation. It aims to provide practical solutions for **video generation** from **a few given frames (at least two)**.

![Demo gif](assets/amt_demo.gif)

* More examples can be found on our [project page](https://nk-cs-zzl.github.io/projects/amt/index.html).

## Web demos

Integrated into [Hugging Face Spaces 🤗](https://huggingface.co/spaces) using [Gradio](https://github.com/gradio-app/gradio). Try out the web demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/NKU-AMT/AMT)

Try AMT to interpolate between two or more images at [![PyTTI-Tools:FILM](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1IeVO5BmLouhRh6fL2z_y18kgubotoaBq?usp=sharing)

## Change Log

- **Apr 20, 2023**: Our code is publicly available.

## Method Overview

![pipeline](https://user-images.githubusercontent.com/21050959/229420451-65951bd0-732c-4f09-9121-f291a3862d6e.png)

For technical details, please refer to [method.md](docs/method.md) or read the full report on [arXiv](https://arxiv.org/abs/2304.09790).

## Dependencies and Installation

1. Clone the repo

   ```bash
   git clone https://github.com/MCG-NKU/AMT.git
   ```

2. Create the conda environment and install dependencies (an optional sanity check follows this list)

   ```bash
   conda env create -f environment.yaml
   conda activate amt
   ```

3. Download the pretrained models for the demos from [Pretrained Models](#pretrained-models) and place them in the `pretrained` folder.
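After installation, you can verify the environment with a short script. This is a minimal sketch of ours, not part of the repo; it only assumes the `amt` environment provides PyTorch, which the codebase depends on.

```python
# Minimal environment sanity check (our sketch, not part of the repo;
# assumes the `amt` conda environment provides PyTorch).
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # the demos run far faster on a GPU
```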
## Quick Demo

**Note that the selected pretrained model (`[CKPT_PATH]`) needs to match the config file (`[CFG]`).**

> When creating a video demo, increasing $n$ slows down the motion in the video. (With $m$ input frames, `[N_ITER]` $=n$ corresponds to $2^n\times (m-1)+1$ output frames; see the sketch after this section.)

```bash
python demos/demo_2x.py -c [CFG] -p [CKPT] -n [N_ITER] -i [INPUT] -o [OUT_PATH] -r [FRAME_RATE]

# e.g. [INPUT]
# -i could be a video / a regular expression / a folder containing multiple images
# -i demo.mp4 (video) / img_*.png (regular expression) / img0.png img1.png (images) / demo_input (folder)

# e.g. a simple usage
python demos/demo_2x.py -c cfgs/AMT-S.yaml -p pretrained/amt-s.pth -n 6 -i assets/quick_demo/img0.png assets/quick_demo/img1.png
```

+ Note: please enable `--save_images` to save the output images (saving slows down when there are many output images).
+ Supported input types: `a video` / `a regular expression` / `multiple images` / `a folder containing input frames`.
+ Results are written to the `[OUT_PATH]` folder (default: `results/2x`).
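To make the frame-count formula concrete: each iteration of the demo inserts one new frame between every adjacent pair, doubling the $m-1$ segments, so $n$ iterations yield $2^n\times (m-1)+1$ frames. A minimal sketch (the helper name `num_output_frames` is ours, purely illustrative):

```python
def num_output_frames(m: int, n: int) -> int:
    """Frames produced from m input frames after n rounds of recursive 2x
    interpolation: each round inserts one frame between every adjacent
    pair, doubling the (m - 1) segments, hence 2**n * (m - 1) + 1 frames."""
    return 2 ** n * (m - 1) + 1

# The "simple usage" above (2 input frames, -n 6) yields 65 output frames.
assert num_output_frames(2, 6) == 65
assert num_output_frames(2, 1) == 3  # one in-between frame per input pair
```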
## Pretrained Models

| Model | :link: Download Links | Config file | Trained on | Arbitrary/Fixed |
| :---- | :-------------------- | :---------- | :--------- | :-------------- |
| AMT-S | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-S] | Vimeo90k | Fixed |
| AMT-L | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-L] | Vimeo90k | Fixed |
| AMT-G | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-G] | Vimeo90k | Fixed |
| AMT-S | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-S_gopro] | GoPro | Arbitrary |
## Training and Evaluation

Please refer to [develop.md](docs/develop.md) to learn how to benchmark AMT and how to train a new AMT model from scratch.

## Citation

If you find our repo useful for your research, please consider citing our paper:

```bibtex
@inproceedings{licvpr23amt,
   title={AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation},
   author={Li, Zhen and Zhu, Zuo-Liang and Han, Ling-Hao and Hou, Qibin and Guo, Chun-Le and Cheng, Ming-Ming},
   booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
   year={2023}
}
```

## License

This code is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license for non-commercial use only. Please note that any commercial use of this code requires formal permission prior to use.

## Contact

For technical questions, please contact `zhenli1031[AT]gmail.com` and `nkuzhuzl[AT]gmail.com`.

For commercial licensing, please contact `cmm[AT]nankai.edu.cn`.

## Acknowledgement

We thank Jia-Wen Xiao, Zheng-Peng Duan, Rui-Qi Wu, and Xin Jin for proofreading. We thank [Zhewei Huang](https://github.com/hzwer) for his suggestions.

Here are some great resources we benefit from:

- [IFRNet](https://github.com/ltkong218/IFRNet) and [RIFE](https://github.com/megvii-research/ECCV2022-RIFE) for data processing, benchmarking, and loss designs.
- [RAFT](https://github.com/princeton-vl/RAFT), [M2M-VFI](https://github.com/feinanshan/M2M_VFI), and [GMFlow](https://github.com/haofeixu/gmflow) for inspiration.
- [FILM](https://github.com/google-research/frame-interpolation) as a web demo reference.

**If you develop/use AMT in your projects, please let us know and we will list your project in this repository.**

We also thank all of our contributors.