Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published Oct 7, 2025 • 33
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics Paper • 2510.17797 • Published Oct 20, 2025 • 11
xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning Paper • 2510.08439 • Published Oct 9, 2025 • 1
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering Paper • 2511.13998 • Published Nov 17, 2025 • 3
CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows? Paper • 2605.16679 • Published 8 days ago • 49
CoDA Collection CoDA is Salesforce AI Research's open, lightweight and diffusion-based language model. • 2 items • Updated Oct 31, 2025 • 5
UserBench: An Interactive Gym Environment for User-Centric Agents Paper • 2507.22034 • Published Jul 29, 2025 • 30