ychenhq committed on
Commit 4b970f5
• 1 Parent(s): e7356e6

Update README.md

Files changed (1)
  1. README.md +9 -82
README.md CHANGED
@@ -1,83 +1,10 @@
- ## VideoCraftXtend: AI-Enhanced Text-to-Video Generation with Extended Length and Enhanced Motion Smoothness
-
- <a href='https://huggingface.co/spaces/ychenhq/VideoCrafterXtend'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a>
-
- ------
-
- ## Introduction
- VideoCraftXtend is an open-source video generation and editing toolbox for crafting video content.
- This project tackles challenges in T2V generation, focusing on producing long videos, improving motion smoothness, and increasing content diversity. We propose a comprehensive framework that integrates a T2V diffusion model, the OpenAI GPT API, a Video Quality Assessment (VQA) model, and a refined interpolation model.
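The multi-stage flow described above can be sketched as a chain of stages: prompt refinement (GPT), clip generation (T2V diffusion), quality scoring (VQA), and frame interpolation (RIFE). The sketch below is purely illustrative; every function name and return type is a placeholder standing in for the real component, not the project's actual API.

```python
# Illustrative pipeline sketch -- each function is a stand-in for a
# component named in the introduction, not the project's real interface.

def refine_prompt(prompt: str) -> str:
    # stand-in for the OpenAI GPT API call that enriches the user prompt
    return prompt if prompt.endswith(".") else prompt + "."

def generate_clip(prompt: str, frames: int = 16) -> list:
    # stand-in for the T2V diffusion model; returns dummy "frames"
    return [f"frame-{i}" for i in range(frames)]

def quality_score(clip: list) -> float:
    # stand-in for the VQA model scoring the generated clip
    return 1.0 if clip else 0.0

def interpolate(clip: list, factor: int = 2) -> list:
    # stand-in for the RIFE interpolation model; multiplies frame count
    out: list = []
    for frame in clip:
        out.extend([frame] * factor)
    return out

def generate_long_video(prompt: str) -> list:
    # chain the stages: refine -> generate -> assess -> interpolate
    prompt = refine_prompt(prompt)
    clip = generate_clip(prompt)
    if quality_score(clip) > 0.5:
        clip = interpolate(clip)
    return clip
```

The real system presumably loops and selects among candidate clips; this sketch only shows the order in which the four components are invoked.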
- ### 1. Generic Text-to-Video Generation
- Click the GIF to access the high-resolution video.
-
- <table class="center">
-   <tr>
-     <td>
-       <video width="320" controls>
-         <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0001.mp4" type="video/mp4">
-         Your browser does not support the video tag.
-       </video>
-     </td>
-     <td>
-       <video width="320" controls>
-         <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0002.mp4" type="video/mp4">
-         Your browser does not support the video tag.
-       </video>
-     </td>
-     <td>
-       <video width="320" controls>
-         <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0003.mp4" type="video/mp4">
-         Your browser does not support the video tag.
-       </video>
-     </td>
-   </tr>
-   <tr>
-     <td style="text-align:center;" width="320">"There is a cat dancing on the sand."</td>
-     <td style="text-align:center;" width="320">"Behold the mesmerizing sight of a cat elegantly dancing amidst the soft grains of sand."</td>
-     <td style="text-align:center;" width="320">"The fluffy cat is joyfully prancing and twirling on the soft golden sand, its elegant movements mirroring the peaceful seaside setting."</td>
-   </tr>
- </table>
-
-
- ## ⚙️ Setup
-
- ### 1. Install Environment
- 1) Via Anaconda
- ```bash
- conda create -n videocraftxtend python=3.8.5
- conda activate videocraftxtend
- pip install -r requirements.txt
- ```
- 2) Via Google Colab Pro
-
- ### 2. Download the model checkpoints
- 1) Download the pretrained T2V model via [Hugging Face](https://huggingface.co/VideoCrafter/VideoCrafter2/blob/main/model.ckpt), and put `model.ckpt` at `VideoCrafter/checkpoints/base_512_v2/model.ckpt`.
- 2) Download the pretrained interpolation model via [Google Drive](https://drive.google.com/drive/folders/1TBEwF2PmSGyDngP1anjNswlIfwGh2NzU?usp=sharing), and put `flownet.pkl` at `VideoCrafter/ECCV2022-RIFE/train_log/flownet.pkl`.
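After downloading, it can help to confirm that both weight files landed where the notebook expects them. A minimal sketch (the relative paths are copied from the two steps above; the helper name is ours, not part of the repository):

```python
from pathlib import Path

# Expected locations, as listed in the download steps above.
CHECKPOINTS = {
    "model.ckpt": Path("VideoCrafter/checkpoints/base_512_v2/model.ckpt"),
    "flownet.pkl": Path("VideoCrafter/ECCV2022-RIFE/train_log/flownet.pkl"),
}

def checkpoint_status(root: Path = Path(".")) -> dict:
    """Map each required weight file to True/False depending on whether
    it exists under `root` at the expected relative path."""
    return {name: (root / rel).is_file() for name, rel in CHECKPOINTS.items()}

# Report anything missing before launching the notebook.
missing = [name for name, ok in checkpoint_status().items() if not ok]
if missing:
    print("Missing weights:", ", ".join(missing))
```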
-
- ## 💫 Inference
- ### 1. Text-to-Video local Gradio demo
- 1) Open the `VideoCraftXtend.ipynb` notebook and run the cells until the Gradio interface is generated.
- 2) Once the Gradio interface is up, input a prompt and customize the parameters to your requirements; the resulting video is generated within an estimated 15-20 minutes.
- 3) The last section of `VideoCraftXtend.ipynb` contains the evaluation results that were included in our report.
-
-
  ---
- ## 📋 Technical Report
- 😉 VideoCrafter2 tech report: [VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models](https://arxiv.org/abs/2401.09047)
-
-
- ## 🤗 Acknowledgements
- Our codebase builds on
- 1) [Stable Diffusion](https://github.com/Stability-AI/stablediffusion)
- 2) [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter)
- 3) [UVQ](https://github.com/google/uvq)
- 4) [VBench](https://github.com/Vchitect/VBench)
- 5) [RIFE](https://github.com/hzwer/ECCV2022-RIFE)
-
- Thanks to the authors for sharing their codebases!
-
-
- ## 📢 Disclaimer
- We developed this repository for RESEARCH purposes, so it may only be used for personal/research/non-commercial purposes.
  ---
+ title: VideoCrafterXen
+ emoji: 👍
+ colorFrom: gray
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: 4.27.0
+ app_file: app.py
+ pinned: false
+ ---