feifeiobama committed
Commit cb3e3c1 · 1 Parent(s): f67483e

Add paper link

README.md CHANGED
@@ -6,7 +6,7 @@ base_model:
 
 # ⚡️Pyramid Flow⚡️
 
-[[Paper]](https://arxiv.org) [[Project Page ✨]](https://pyramid-flow.github.io) [[Code
+[[Paper]](https://arxiv.org/abs/2410.05954) [[Project Page ✨]](https://pyramid-flow.github.io) [[Code 🚀]](https://github.com/jy0205/Pyramid-Flow)
 
 This is the official repository for Pyramid Flow, a training-efficient **Autoregressive Video Generation** method based on **Flow Matching**. By training only on open-source datasets, it generates high-quality 10-second videos at 768p resolution and 24 FPS, and naturally supports image-to-video generation.
 
@@ -26,12 +26,19 @@ This is the official repository for Pyramid Flow, a training-efficient **Autoreg
 ## News
 
 * `COMING SOON` ⚡️⚡️⚡️ Training code and new model checkpoints trained from scratch.
-* `2024.10.10` 🚀🚀🚀 We release the [technical report](https://arxiv.org), [project page](https://pyramid-flow.github.io) and [model checkpoint](https://huggingface.co/rain1011/pyramid-flow-sd3) of Pyramid Flow.
+* `2024.10.10` 🚀🚀🚀 We release the [technical report](https://arxiv.org/abs/2410.05954), [project page](https://pyramid-flow.github.io) and [model checkpoint](https://huggingface.co/rain1011/pyramid-flow-sd3) of Pyramid Flow.
 
 ## Usage
 
 You can directly download the model from [Huggingface](https://huggingface.co/rain1011/pyramid-flow-sd3). We provide both model checkpoints for 768p and 384p video generation. The 384p checkpoint supports 5-second video generation at 24FPS, while the 768p checkpoint supports up to 10-second video generation at 24FPS.
 
+```python
+from huggingface_hub import snapshot_download
+
+model_path = 'PATH'   # The local directory to save downloaded checkpoint
+snapshot_download("rain1011/pyramid-flow-sd3", local_dir=model_path, local_dir_use_symlinks=False, repo_type='model')
+```
+
 To use our model, please follow the inference code in `video_generation_demo.ipynb` at [this link](https://github.com/jy0205/Pyramid-Flow/blob/main/video_generation_demo.ipynb). We further simplify it into the following two-step procedure. First, load the downloaded model:
 
 ```python
@@ -44,7 +51,7 @@ torch.cuda.set_device(0)
 model_dtype, torch_dtype = 'bf16', torch.bfloat16   # Use bf16, fp16 or fp32
 
 model = PyramidDiTForVideoGeneration(
-    '
+    'PATH',   # The downloaded checkpoint dir
     model_dtype,
     model_variant='diffusion_transformer_768p',   # 'diffusion_transformer_384p'
 )
@@ -133,7 +140,7 @@ Consider giving this repository a star and cite Pyramid Flow in your publication
 @article{jin2024pyramidal,
   title={Pyramidal Flow Matching for Efficient Video Generative Modeling},
   author={Jin, Yang and Sun, Zhicheng and Li, Ningyuan and Xu, Kun and Xu, Kun and Jiang, Hao and Zhuang, Nan and Huang, Quzhe and Song, Yang and Mu, Yadong and Lin, Zhouchen},
-  journal={arXiv preprint arXiv:2410.
+  journal={arXiv preprint arXiv:2410.05954},
   year={2024}
 }
 ```
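
The README text in this diff says the 384p checkpoint supports 5-second clips and the 768p checkpoint supports up to 10-second clips, both at 24 FPS. As a rough illustration of how a caller might pick a `model_variant` from a target duration, here is a minimal sketch. The helper names `pick_variant` and `frame_budget` are hypothetical and not part of the Pyramid Flow code base; only the two variant strings and the duration limits come from the README, and the frame count is plain `seconds * 24` arithmetic rather than the model's internal temporal units.

```python
FPS = 24  # frame rate stated in the README

# Variant names are taken from the README; the limits restate its prose.
MAX_SECONDS = {
    "diffusion_transformer_384p": 5,
    "diffusion_transformer_768p": 10,
}


def pick_variant(seconds: float) -> str:
    """Hypothetical helper: smallest variant whose stated limit covers `seconds`."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    # Try the variants from the shortest limit to the longest.
    for variant, limit in sorted(MAX_SECONDS.items(), key=lambda kv: kv[1]):
        if seconds <= limit:
            return variant
    raise ValueError(f"no checkpoint in the README covers {seconds} seconds")


def frame_budget(seconds: float) -> int:
    """Hypothetical helper: number of output frames at 24 FPS."""
    return round(seconds * FPS)
```

The returned string could then be passed as `model_variant=` when constructing `PyramidDiTForVideoGeneration`, assuming the matching checkpoint has been downloaded to the local directory.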