Article Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e Oct 3, 2023 • 8
Article Honesty, Open Source, and the Future of AI in Art: An Open Question By Duskfallcrew • 5 days ago • 3
Article Is Attention Interpretable in Transformer-Based Large Language Models? Let’s Unpack the Hype By royswastik • 6 days ago • 4
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 12 days ago • 282
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 8 days ago • 96
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 547
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published 11 days ago • 32
Article The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... By srinivasbilla • 13 days ago • 52
Article Train 400x faster Static Embedding Models with Sentence Transformers 19 days ago • 131
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 20 days ago • 52
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 25 days ago • 86
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 26 days ago • 251
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching Paper • 2412.17153 • Published Dec 22, 2024 • 34
Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published Dec 18, 2024 • 14