PromptBridge: Cross-Model Prompt Transfer for Large Language Models Paper • 2512.01420 • Published 25 days ago • 9
M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark Paper • 2511.17729 • Published Nov 21 • 16
R-WoM: Retrieval-augmented World Model For Computer-use Agents Paper • 2510.11892 • Published Oct 13 • 21
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29 • 140
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning Paper • 2509.22576 • Published Sep 26 • 134
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28 • 63
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_nocl_global_step_100 8B • Updated May 26 • 6
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_cl_global_step_100 8B • Updated May 25 • 7
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_nocl_global_step_50 8B • Updated May 24 • 8
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_cl_global_step_50 8B • Updated May 24 • 15