The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters • 2.29k
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression • Paper • 2407.12077 • Published Jul 16, 2024 • 56