T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published 3 days ago • 33
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Paper • 2504.02542 • Published 7 days ago • 34
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 11 days ago • 107
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models Paper • 2503.18352 • Published 17 days ago • 6
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models Paper • 2503.18886 • Published 16 days ago • 20
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models Paper • 2503.09669 • Published 28 days ago • 35
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates Paper • 2503.07216 • Published about 1 month ago • 31
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published Mar 7 • 44
One Shot, One Talk: Whole-body Talking Avatar from a Single Image Paper • 2412.01106 • Published Dec 2, 2024 • 20
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation Paper • 2412.00100 • Published Nov 27, 2024 • 16
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 30