Yuseung "Phillip" Lee

phillipinseoul

https://phillipinseoul.github.io/

AI & ML interests

Computer Vision

phillipinseoul's activity

upvoted a paper about 16 hours ago

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published 1 day ago • 22

liked a model 3 days ago

Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • Updated 4 days ago • 451k • 335

liked a Space 3 days ago

Unofficial SDXL Turbo Img2Img Txt2Img (diffusers/unofficial-SDXL-Turbo-i2i-t2i)

Space • Generate images from text or modify images with prompts • 521

liked a model 3 days ago

OpenGVLab/InternVL3-8B

Image-Text-to-Text • Updated 1 day ago • 11.5k • 32

upvoted 3 papers 3 days ago

upvoted 4 papers 4 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 4 days ago • 219

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published 8 days ago • 39

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Paper • 2504.07615 • Published 8 days ago • 24

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Paper • 2504.08727 • Published 7 days ago • 8

upvoted 3 papers 5 days ago

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Paper • 2504.07934 • Published 8 days ago • 14

Compass Control: Multi Object Orientation Control for Text-to-Image Generation

Paper • 2504.06752 • Published 9 days ago • 7

Kimi-VL Technical Report

Paper • 2504.07491 • Published 9 days ago • 113

upvoted 2 papers 8 days ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 17 days ago • 79

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Paper • 2504.07956 • Published 8 days ago • 43

liked 2 models 8 days ago

Qwen/Qwen2.5-7B-Instruct

Text Generation • Updated Jan 12 • 2.93M • 640

HiDream-ai/HiDream-I1-Full

Text-to-Image • Updated 2 days ago • 21.9k • 583

upvoted 2 papers 10 days ago

An Empirical Study of GPT-4o Image Generation Capabilities

Paper • 2504.05979 • Published 10 days ago • 59

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published 10 days ago • 143