ychenhq committed e7356e6 (1 parent: 2d73919): Update README.md
Files changed (1): README.md (+81 -10)
## VideoCraftXtend: AI-Enhanced Text-to-Video Generation with Extended Length and Enhanced Motion Smoothness

<a href='https://huggingface.co/spaces/ychenhq/VideoCrafterXtend'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a>

------

## Introduction
VideoCraftXtend is an open-source video generation and editing toolbox for crafting video content.
This project tackles key challenges in T2V generation: producing long videos, improving motion smoothness, and increasing content diversity. We propose a comprehensive framework that integrates a T2V diffusion model, the OpenAI GPT API, a Video Quality Assessment (VQA) model, and a refined interpolation model.
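The components above fit together as a single pipeline: prompt variants from the GPT API feed the T2V diffusion model, the VQA model scores the candidate clips, and the interpolation model smooths the best one. The sketch below illustrates that flow only; every function body is a hypothetical stub, not the project's actual API.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    prompt: str
    frames: list        # placeholder for decoded video frames
    score: float = 0.0

def enhance_prompt(prompt: str) -> list:
    """Stub for the OpenAI GPT API call: expand one prompt into richer variants."""
    return [prompt, f"{prompt} (cinematic, detailed motion)"]

def generate_video(prompt: str) -> list:
    """Stub for the T2V diffusion model: return a short frame sequence."""
    return [f"frame-{i}-{abs(hash(prompt)) % 97}" for i in range(16)]

def vqa_score(frames: list) -> float:
    """Stub for the VQA model: score perceptual quality in [0, 1]."""
    return len(set(frames)) / max(len(frames), 1)

def interpolate(frames: list) -> list:
    """Stub for the interpolation model: insert one frame between each pair."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out += [a, f"mid({a},{b})"]
    out.append(frames[-1])
    return out

def craft_video(prompt: str) -> Candidate:
    candidates = [Candidate(p, generate_video(p)) for p in enhance_prompt(prompt)]
    for c in candidates:
        c.score = vqa_score(c.frames)
    best = max(candidates, key=lambda c: c.score)   # keep the highest-VQA candidate
    best.frames = interpolate(best.frames)          # smooth motion with extra frames
    return best

best = craft_video("There is a cat dancing on the sand.")
print(len(best.frames))  # 16 frames -> 31 after one interpolation pass
```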
### 1. Generic Text-to-Video Generation
Click a video to access the high-resolution version.
<table class="center">
  <tr>
    <td>
      <video width="320" controls>
        <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0001.mp4" type="video/mp4">
        Your browser does not support the video tag.
      </video>
    </td>
    <td>
      <video width="320" controls>
        <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0002.mp4" type="video/mp4">
        Your browser does not support the video tag.
      </video>
    </td>
    <td>
      <video width="320" controls>
        <source src="https://github.com/chloeleehn/VideoCraftXtend/blob/main/VideoCrafter/results/cat/0003.mp4" type="video/mp4">
        Your browser does not support the video tag.
      </video>
    </td>
  </tr>
  <tr>
    <td style="text-align:center;" width="320">"There is a cat dancing on the sand."</td>
    <td style="text-align:center;" width="320">"Behold the mesmerizing sight of a cat elegantly dancing amidst the soft grains of sand."</td>
    <td style="text-align:center;" width="320">"The fluffy cat is joyfully prancing and twirling on the soft golden sand, its elegant movements mirroring the peaceful seaside setting."</td>
  </tr>
</table>

## ⚙️ Setup

### 1. Install Environment
1) Via Anaconda
```bash
conda create -n videocraftxtend python=3.8.5
conda activate videocraftxtend
pip install -r requirements.txt
```
2) Via Google Colab Pro

### 2. Download the model checkpoints
1) Download the pretrained T2V model via [Hugging Face](https://huggingface.co/VideoCrafter/VideoCrafter2/blob/main/model.ckpt), and put `model.ckpt` at `VideoCrafter/checkpoints/base_512_v2/model.ckpt`.
2) Download the pretrained interpolation model via [Google Drive](https://drive.google.com/drive/folders/1TBEwF2PmSGyDngP1anjNswlIfwGh2NzU?usp=sharing), and put `flownet.pkl` at `VideoCrafter/ECCV2022-RIFE/train_log/flownet.pkl`.
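After downloading, a quick sanity check can confirm both files landed where the steps above expect them. `verify_checkpoints` is a hypothetical helper written for this README, not part of the repository; the two paths are taken verbatim from the steps above.

```python
from pathlib import Path

# Checkpoint locations named in the download steps (relative to the repo root).
EXPECTED_CHECKPOINTS = [
    "VideoCrafter/checkpoints/base_512_v2/model.ckpt",
    "VideoCrafter/ECCV2022-RIFE/train_log/flownet.pkl",
]

def verify_checkpoints(repo_root: str) -> list:
    """Return the expected checkpoint files that are missing under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_CHECKPOINTS if not (root / p).is_file()]

missing = verify_checkpoints(".")
if missing:
    print("Missing checkpoints:", *missing, sep="\n  ")
else:
    print("All checkpoints in place.")
```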
## 💫 Inference
### 1. Text-to-Video Local Gradio Demo
1) Open the `VideoCraftXtend.ipynb` notebook and run the cells until the Gradio interface is generated.
2) Once the Gradio interface is generated, input a prompt and customize the parameters according to your requirements. The resulting video should be generated within an estimated timeframe of 15-20 minutes.
3) The last section of `VideoCraftXtend.ipynb` contains the evaluation results that were included in our report.

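For a rough sense of what the interpolation stage adds to a generated clip: a RIFE-style pass inserts one frame between every neighboring pair, so k passes turn n frames into (n - 1) * 2^k + 1. The helper below is an illustrative formula only, not code from this repository, and assumes simple midpoint interpolation between consecutive frames.

```python
def interpolated_frame_count(n_frames: int, passes: int) -> int:
    """Frames after `passes` rounds of midpoint interpolation: (n - 1) * 2**passes + 1."""
    if n_frames < 2:
        return n_frames
    return (n_frames - 1) * 2 ** passes + 1

# A 16-frame clip becomes 61 frames after two passes: the same
# duration played back at roughly four times the frame rate.
print(interpolated_frame_count(16, 2))  # -> 61
```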
---
## 📋 Technical Report
😉 VideoCrafter2 tech report: [VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models](https://arxiv.org/abs/2401.09047)

## 🤗 Acknowledgements
Our codebase builds on
1) [Stable Diffusion](https://github.com/Stability-AI/stablediffusion)
2) [VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter)
3) [UVQ](https://github.com/google/uvq)
4) [VBench](https://github.com/Vchitect/VBench)
5) [RIFE](https://github.com/hzwer/ECCV2022-RIFE)

Thanks to the authors for sharing their codebases!

## 📢 Disclaimer
We developed this repository for RESEARCH purposes, so it may only be used for personal/research/non-commercial purposes.