Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 1 day ago • 30
NousResearch/DeepHermes-3-Llama-3-3B-Preview Text Generation • Updated about 9 hours ago • 658 • 10
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 2 days ago • 201
Q-Filters Collection Pre-computed Q-Filters for efficient KV cache compression. • 15 items • Updated 10 days ago • 6
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated about 22 hours ago • 441k • 1.12k