metadata

license: cc-by-nc-sa-4.0
pipeline_tag: Image-to-Video
tags:
  - turing
  - autonomous driving
  - video generation
  - world model

Terra

Terra is a world model designed for autonomous driving and serves as a baseline model in the ACT-Bench framework. Given a short video clip of approximately three frames and trajectory instructions, Terra generates a video continuation. A key feature of Terra is its high adherence to trajectory instructions, enabling accurate and reliable action-conditioned video generation.

How to use

We have verified execution on a machine equipped with a single NVIDIA H100 80GB GPU. We have not tested other configurations, but we believe the model should run on any machine equipped with an NVIDIA GPU with 16GB or more of VRAM.
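Before installing anything, it can help to check whether a local GPU meets the 16GB guideline above. The following is a minimal sketch, not part of the Terra codebase: the helper names (`parse_vram_mib`, `gpus_meeting_requirement`, `query_vram_mib`) and the `MIN_VRAM_MIB` threshold are illustrative assumptions. It relies only on the `nvidia-smi` command-line tool, which reports total memory in MiB when queried with `--query-gpu=memory.total`.

```python
import shutil
import subprocess

# Assumption: the "16GB" guideline is interpreted as 16 GiB (16 * 1024 MiB).
# Some nominally 16GB cards report slightly less, so adjust if needed.
MIN_VRAM_MIB = 16 * 1024


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one total-memory value (in MiB) per GPU from nvidia-smi CSV output."""
    values = []
    for line in nvidia_smi_output.splitlines():
        text = line.strip()
        # Skip blank lines and non-numeric entries such as "[N/A]".
        if text.isdigit():
            values.append(int(text))
    return values


def gpus_meeting_requirement(vram_mib: list[int], min_mib: int = MIN_VRAM_MIB) -> list[int]:
    """Return the indices of the listed GPUs whose total memory is at least min_mib."""
    return [index for index, mib in enumerate(vram_mib) if mib >= min_mib]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)


if __name__ == "__main__":
    vram = query_vram_mib()
    suitable = gpus_meeting_requirement(vram)
    if suitable:
        print(f"GPUs meeting the {MIN_VRAM_MIB} MiB guideline: {suitable}")
    else:
        print("No GPU meeting the VRAM guideline was detected.")
```

Passing this check only confirms total memory capacity; it does not guarantee that generation will fit, since other processes may already be using part of the VRAM.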

Install Packages

Action-Conditioned Video Generation without Video Refiner

Action-Conditioned Video Generation with Video Refiner

Citation

@misc{arai2024actbench,
      title={ACT-Bench: Towards Action Controllable World Models for Autonomous Driving}, 
      author={Hidehisa Arai and Keishi Ishihara and Tsubasa Takahashi and Yu Yamaguchi},
      year={2024},
      eprint={2412.05337},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.05337}, 
}
