Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees Paper • 2311.08384 • Published Nov 14, 2023
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Paper • 2402.19446 • Published Feb 29, 2024
$BT^2$: Backward-compatible Training with Basis Transformation Paper • 2211.03989 • Published Nov 8, 2022
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning Paper • 2406.11896 • Published Jun 14, 2024 • 18
Aligning Large Language Models with Representation Editing: A Control Perspective Paper • 2406.05954 • Published Jun 10, 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Paper • 2405.10292 • Published May 16, 2024 • 1
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 68
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? Paper • 2311.00047 • Published Oct 31, 2023 • 8
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping Paper • 2304.08025 • Published Apr 17, 2023 • 2
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models Paper • 2305.13655 • Published May 23, 2023 • 7
The ArtBench Dataset: Benchmarking Generative Models with Artworks Paper • 2206.11404 • Published Jun 22, 2022 • 2