latent-action-pretraining committed on
Commit • b81d616 • Parent(s): e6292db
Update README.md

README.md CHANGED
@@ -19,7 +19,7 @@ base_model:
 
 <h1 align="center"> LAPA: Latent Action Pretraining from Videos</h1>
 <p align="center">
-<a href="">
+<a href="https://latentactionpretraining.github.io/">Website</a>  |  <a href="https://arxiv.org/abs/2410.11758">Paper</a>  |  <a href="https://github.com/LatentActionPretraining/LAPA">Github</a>
 <br>
 
 LAPA is the **first unsupervised approach** for pretraining Vision-Language-Action (VLA) models without ground-truth robot action labels.
@@ -38,9 +38,9 @@ base_model:
 + **Vision Backbone**: VQGAN
 + **Language Model**: Llama-2
 - **Pretraining Dataset:** [Open X-Embodiment](https://robotics-transformer-x.github.io/)
-- **
-- **Paper:**
-- **
+- **Website:** https://latentactionpretraining.github.io/
+- **Paper:** https://arxiv.org/abs/2410.11758
+- **Code:** https://github.com/LatentActionPretraining/LAPA
 
 ### Primary Use Cases
 
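For context, a minimal sketch of how one might fetch the pretrained checkpoint this model card describes, using the Hugging Face Hub client. The repo id `latent-action-pretraining/LAPA-7B-openx` and the local directory name are assumptions for illustration (only the organization name appears in this commit); check the model card or the GitHub README for the exact identifier.

```python
# Minimal sketch: download the LAPA checkpoint files from the Hugging Face Hub.
# NOTE: the repo_id below is an assumption -- only the org name
# "latent-action-pretraining" appears in this commit.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="latent-action-pretraining/LAPA-7B-openx",  # assumed repo id
    local_dir="lapa_checkpoint",                        # assumed target folder
)
print(f"Checkpoint downloaded to: {local_path}")
```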