Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization Paper • 2601.23174 • Published Jan 30 • 3
Autoregressive Speech Enhancement via Acoustic Tokens Paper • 2507.12825 • Published Jul 17, 2025 • 1
CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion Paper • 2601.09512 • Published Jan 14 • 4
FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation Paper • 2509.16195 • Published Sep 19, 2025 • 1
Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL Paper • 2505.17952 • Published May 23, 2025 • 20
VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition Paper • 2403.14594 • Published Mar 21, 2024
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published Feb 26, 2025 • 63
Focal Modulation Networks for Interpretable Sound Classification Paper • 2402.02754 • Published Feb 5, 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models? Paper • 2406.10735 • Published Jun 15, 2024