T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published 8 days ago • 38
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Paper • 2504.02542 • Published 12 days ago • 40
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 16 days ago • 120
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models Paper • 2503.18352 • Published 22 days ago • 6
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models Paper • 2503.18886 • Published 21 days ago • 20
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models Paper • 2503.09669 • Published Mar 12 • 35
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates Paper • 2503.07216 • Published Mar 10 • 31
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published Mar 7 • 44
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 30 • 7
One Shot, One Talk: Whole-body Talking Avatar from a Single Image Paper • 2412.01106 • Published Dec 2, 2024 • 20
StyleLipSync: Style-based Personalized Lip-sync Video Generation Paper • 2305.00521 • Published Apr 30, 2023
Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation Paper • 2404.00636 • Published Mar 31, 2024
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation Paper • 2412.00100 • Published Nov 27, 2024 • 16