koukyo1994 committed · verified
Commit 3580aa5 · Parent(s): 2f55211

update README

Files changed (1): README.md (+15 −3)

README.md CHANGED
@@ -17,21 +17,33 @@ A key feature of Terra is its **high adherence to trajectory instructions**, ena
 ## Related Links
 
 For more technical details and discussions, please refer to:
-- **Paper:** To Be Updated
+- **Paper:** https://arxiv.org/abs/2412.05337
 - **Code:** https://github.com/turingmotors/ACT-Bench
-- **Blog Post:** [運転版の"Sora"を作る:動画生成の世界モデルTerraの開発背景](https://zenn.dev/turing_motors/articles/6c0ddc10aae542) (ja) / [Create a driving version of "Sora"](https://medium.com/@hide1996/create-a-driving-version-of-sora-33cf4040937a) (en)
 
 ## How to use
 
 We have verified the execution on a machine equipped with a single NVIDIA H100 80GB GPU. However, we believe it should be possible to run the model on any machine equipped with an NVIDIA GPU with 16GB or more of VRAM.
 
+Terra consists of an Image Tokenizer, an Autoregressive Transformer, and a Video Refiner. Due to the complexity of setting up the Video Refiner, please refer to the [ACT-Bench repository](https://github.com/turingmotors/ACT-Bench) for detailed instructions. Here, we provide an example of generating video continuations using the Image Tokenizer and the Autoregressive Transformer, conditioned on image frames and a template trajectory. The resulting video quality might seem suboptimal because each frame is decoded individually. To improve the visual quality, you can use the Video Refiner.
+
 ### Install Packages
 
+We use [uv](https://docs.astral.sh/uv/) to manage Python packages. If you don't have uv installed in your environment, please see its documentation.
 
+```shell
+$ git clone https://huggingface.co/turing-motors/Terra
+$ uv sync
+```
 
 ### Action-Conditioned Video Generation without Video Refiner
 
-### Action-Conditioned Video Generation with Video Refiner
+```shell
+$ python inference.py
+```
+
+This command generates a video using three image frames located in [`assets/conditioning_frames`](./assets/conditioning_frames/) and the `curving_to_left/curving_to_left_moderate` trajectory defined in the trajectory template file [`assets/template_trajectory.json`](./assets/template_trajectory.json).
+
+You can find more details by referring to the [`inference.py`](./inference.py) script.
 
 
 ## Citation
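The README names its template trajectory as `curving_to_left/curving_to_left_moderate`, a `category/variant` pair resolved against `assets/template_trajectory.json`. As a minimal sketch of how such a name could be looked up, assuming (this is not confirmed by the repository; the real schema may differ) a two-level JSON nesting of category → variant → waypoint list:

```python
import json

# Hypothetical, miniature stand-in for assets/template_trajectory.json.
# The real file ships with the Terra repository and its schema may differ.
TEMPLATE_JSON = """
{
  "curving_to_left": {
    "curving_to_left_moderate": [[0.0, 0.0], [1.0, 0.2], [2.0, 0.6]]
  }
}
"""

def lookup_template(templates: dict, name: str) -> list:
    """Resolve a 'category/variant' template name to its trajectory."""
    category, variant = name.split("/")
    return templates[category][variant]

templates = json.loads(TEMPLATE_JSON)
trajectory = lookup_template(templates, "curving_to_left/curving_to_left_moderate")
print(trajectory)  # [[0.0, 0.0], [1.0, 0.2], [2.0, 0.6]]
```

The two-segment name maps naturally onto nested JSON keys, which is why the README can refer to a single trajectory with one slash-separated string.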