Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents Paper • 2302.01560 • Published Feb 3, 2023 • 1
GROOT: Learning to Follow Instructions by Watching Gameplay Videos Paper • 2310.08235 • Published Oct 12, 2023 • 1
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting Paper • 2410.17856 • Published Oct 23, 2024 • 49
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published Jun 27, 2024 • 12
MCU: A Task-centric Framework for Open-ended Agent Evaluation in Minecraft Paper • 2310.08367 • Published Oct 12, 2023 • 1
Selecting Large Language Model to Fine-tune via Rectified Scaling Law Paper • 2402.02314 • Published Feb 4, 2024 • 2
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models Paper • 2311.05997 • Published Nov 10, 2023 • 36