Chmielewski

Eryk-Chmielewski

AI & ML interests

None yet

Recent Activity

liked a model about 6 hours ago

meta-llama/Llama-4-Maverick-17B-128E-Original

liked a model about 6 hours ago

meta-llama/Llama-4-Maverick-17B-128E-Instruct-Original

liked a model about 6 hours ago

meta-llama/Llama-4-Scout-17B-16E-Original

Eryk-Chmielewski's activity

upvoted a collection about 6 hours ago

Llama 4

Llama 4 release • 10 items • Updated about 6 hours ago • 186

upvoted 8 papers 1 day ago

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Paper • 2503.22165 • Published 9 days ago • 23

Interpreting Emergent Planning in Model-Free Reinforcement Learning

Paper • 2504.01871 • Published 3 days ago • 10

Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Paper • 2503.23157 • Published 7 days ago • 5

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published 4 days ago • 22

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published 10 days ago • 30

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published 3 days ago • 23

Efficient Model Selection for Time Series Forecasting via LLMs

Paper • 2504.02119 • Published 3 days ago • 10

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 5 days ago • 121

upvoted a paper 3 days ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published 4 days ago • 18

upvoted a paper 5 days ago

Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published 6 days ago • 16

upvoted an article 8 days ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

Article • Feb 12 • 55

upvoted 8 papers 9 days ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3 • 59

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published 20 days ago • 27

Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Paper • 2503.12303 • Published 21 days ago • 7

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Paper • 2503.15478 • Published 17 days ago • 9

STEVE: A Step Verification Pipeline for Computer-use Agent Training

Paper • 2503.12532 • Published 20 days ago • 14

φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Paper • 2503.13288 • Published 19 days ago • 48

Why Do Multi-Agent LLM Systems Fail?

Paper • 2503.13657 • Published 19 days ago • 41

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published 16 days ago • 46