Collections
Discover the best community collections!
Collections including paper arxiv:2512.08765
-
SAM 3: Segment Anything with Concepts
Paper • 2511.16719 • Published • 125 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128 -
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Paper • 2512.04677 • Published • 167 -
LongCat-Image Technical Report
Paper • 2512.07584 • Published • 18
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 9 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 16 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 16 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 24
-
yandex/stable-diffusion-3.5-medium-alchemist
Text-to-Image • Updated • 2 • 6 -
Ovis-U1 Technical Report
Paper • 2506.23044 • Published • 61 -
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Paper • 2507.01953 • Published • 18 -
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
Paper • 2507.01945 • Published • 76
-
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Paper • 2512.05076 • Published • 7 -
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation
Paper • 2511.23127 • Published • 43 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128
-
BroRL: Scaling Reinforcement Learning via Broadened Exploration
Paper • 2510.01180 • Published • 18 -
MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
Paper • 2510.03632 • Published • 41 -
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
Paper • 2509.25849 • Published • 47 -
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR
Paper • 2509.23808 • Published • 47
-
EgoX: Egocentric Video Generation from a Single Exocentric Video
Paper • 2512.08269 • Published • 116 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128 -
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Paper • 2512.09363 • Published • 71 -
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Paper • 2512.08478 • Published • 76
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
SAM 3: Segment Anything with Concepts
Paper • 2511.16719 • Published • 125 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128 -
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Paper • 2512.04677 • Published • 167 -
LongCat-Image Technical Report
Paper • 2512.07584 • Published • 18
-
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Paper • 2512.05076 • Published • 7 -
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation
Paper • 2511.23127 • Published • 43 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 9 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 16 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 16 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 24
-
BroRL: Scaling Reinforcement Learning via Broadened Exploration
Paper • 2510.01180 • Published • 18 -
MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
Paper • 2510.03632 • Published • 41 -
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
Paper • 2509.25849 • Published • 47 -
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR
Paper • 2509.23808 • Published • 47
-
yandex/stable-diffusion-3.5-medium-alchemist
Text-to-Image • Updated • 2 • 6 -
Ovis-U1 Technical Report
Paper • 2506.23044 • Published • 61 -
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Paper • 2507.01953 • Published • 18 -
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
Paper • 2507.01945 • Published • 76
-
EgoX: Egocentric Video Generation from a Single Exocentric Video
Paper • 2512.08269 • Published • 116 -
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 128 -
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Paper • 2512.09363 • Published • 71 -
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Paper • 2512.08478 • Published • 76
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23