Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets 19 days ago • 33
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control Feb 4 • 133
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 24 days ago • 67
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 23 days ago • 300
ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the QA format. • 10 items • Updated Jan 23 • 15
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints May 1, 2024 • 74