Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning Paper • 2503.04973 • Published Mar 2025
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries Paper • 2502.20475 • Published Feb 2025
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13, 2024
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published Dec 20, 2024
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction Paper • 2412.09573 • Published Dec 12, 2024
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models Paper • 2412.08629 • Published Dec 11, 2024
HARP: Hesitation-Aware Reframing in Transformer Inference Pass Paper • 2412.07282 • Published Dec 10, 2024