Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 106
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13
HUME: Measuring the Human-Model Performance Gap in Text Embedding Task Paper • 2510.10062 • Published Oct 11, 2025 • 10
FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents Paper • 2510.03204 • Published Oct 3, 2025 • 7
LineRetriever: Planning-Aware Observation Reduction for Web Agents Paper • 2507.00210 • Published Jun 30, 2025 • 6
Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks Paper • 2506.21182 • Published Jun 26, 2025 • 2