Fudan-FUXI
/

LiFT-Critic-13b-lora-v1.5

Video-Text-to-Text

Fudan-FUXI committed 10 days ago

Commit 4ca879f • 1 Parent(s): 2fa8c47

Update README.md

Files changed (1)

README.md +47 -3

README.md CHANGED

@@ -1,3 +1,47 @@
----
-license: mit
----

+---
+license: mit
+language:
+- en
+base_model:
+- Efficient-Large-Model/VILA1.5-13b
+pipeline_tag: video-text-to-text
+---
+# LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment
+LiFT-Critic is a novel video-text-to-text reward model for evaluating synthesized videos.
+## 🔧 Installation
+1. Clone the GitHub repository and navigate to the LiFT folder:
+```bash
+git clone https://github.com/CodeGoat24/LiFT.git
+cd LiFT
+```
+2. Install the required packages:
+```bash
+bash ./environment_setup.sh lift
+```
+## 🚀 Inference
+### Run
+Please download the public [LiFT-Critic-13b-lora](https://huggingface.co/Fudan-FUXI/LiFT-Critic-13b-lora-v1.5) checkpoint.
+We provide some synthesized videos in the `./demo` directory for quick inference.
+```bash
+python LiFT-Critic/test/run_critic_13b.py --model-path ./LiFT-Critic-13b-lora-v1.5
+```
+## 🖊️ Citation
+If you find our work helpful, please cite our paper.
+```bibtex
+@article{LiFT,
+  title={LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment},
+  author={Wang, Yibin and Tan, Zhiyu and Wang, Junyan and Yang, Xiaomeng and Jin, Cheng and Li, Hao},
+  journal={arXiv preprint arXiv:2412.04814},
+  year={2024}
+}
+```
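
The README added in this commit asks the reader to download the checkpoint but does not show a command for it. The following is a minimal sketch of one way to script the download and the inference call together. It assumes the `huggingface-cli` tool from the `huggingface_hub` package is installed, which the README does not mention, and that the script is run from the root of the cloned LiFT repository. The variable names and the `fetch_and_run` function are hypothetical helpers introduced only for illustration.

```bash
#!/usr/bin/env bash

# Repository ID taken from the checkpoint link in the README.
REPO_ID="Fudan-FUXI/LiFT-Critic-13b-lora-v1.5"

# Local directory named after the last path component of the repo ID,
# matching the --model-path value used in the README.
MODEL_DIR="./${REPO_ID##*/}"

# Hypothetical helper: download the checkpoint, then run the critic script.
# Assumes huggingface-cli is available (pip install -U "huggingface_hub[cli]").
fetch_and_run() {
  huggingface-cli download "$REPO_ID" --local-dir "$MODEL_DIR"
  python LiFT-Critic/test/run_critic_13b.py --model-path "$MODEL_DIR"
}

# Nothing is downloaded until fetch_and_run is called explicitly.
```

Sourcing the script and then calling `fetch_and_run` performs both steps. Keeping the download inside a function means the paths can be checked before any large files are fetched.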