GROOT: Learning to Follow Instructions by Watching Gameplay Videos Paper • 2310.08235 • Published Oct 12, 2023 • 1
MCU: A Task-centric Framework for Open-ended Agent Evaluation in Minecraft Paper • 2310.08367 • Published Oct 12, 2023 • 1
Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners Paper • 2305.14825 • Published May 24, 2023 • 1
Selecting Large Language Model to Fine-tune via Rectified Scaling Law Paper • 2402.02314 • Published Feb 4, 2024 • 2
ProAgent: Building Proactive Cooperative AI with Large Language Models Paper • 2308.11339 • Published Aug 22, 2023
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Paper • 2403.05313 • Published Mar 8, 2024 • 9
Neural-Symbolic Recursive Machine for Systematic Generalization Paper • 2210.01603 • Published Oct 4, 2022
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents Paper • 2302.01560 • Published Feb 3, 2023 • 1
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction Paper • 2301.10034 • Published Jan 21, 2023
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published Jun 27, 2024 • 13
Understanding the Distillation Process from Deep Generative Models to Tractable Probabilistic Circuits Paper • 2302.08086 • Published Feb 16, 2023
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting Paper • 2410.17856 • Published Oct 23, 2024 • 52
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping Paper • 2502.20900 • Published Feb 28, 2025 • 9
Generative Evaluation of Complex Reasoning in Large Language Models Paper • 2504.02810 • Published 17 days ago • 12
ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment Paper • 2503.02505 • Published Mar 4, 2025 • 6
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20, 2025 • 39