Simon DL

SimonDL

Simon-dl

AI & ML interests

Reinforcement Learning

Recent Activity

upvoted a paper 29 days ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

upvoted a paper about 1 month ago

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

upvoted a paper about 1 month ago

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

View all activity

Organizations

None yet

SimonDL's activity

upvoted a paper 29 days ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published about 1 month ago • 76

upvoted 5 papers about 1 month ago

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published Nov 25 • 40

liked a dataset about 1 month ago

fka/awesome-chatgpt-prompts

Viewer • Updated Sep 3 • 170 • 6.64k • 6.63k

updated a collection about 2 months ago

Robotics

Collection

5 items • Updated Nov 8

upvoted a paper about 2 months ago

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Paper • 2411.03823 • Published Nov 6 • 43

updated a collection about 2 months ago

Robotics

Collection

5 items • Updated Nov 8

upvoted 5 papers about 2 months ago

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Paper • 2410.21845 • Published Oct 29 • 12

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset

Paper • 2410.22325 • Published Oct 29 • 10

SMITE: Segment Me In TimE

Paper • 2410.18538 • Published Oct 24 • 15

MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark

Paper • 2410.19168 • Published Oct 24 • 19

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published Oct 23 • 49

upvoted 3 papers 2 months ago

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22 • 45

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22 • 14

Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos

Paper • 2410.16259 • Published Oct 21 • 5