An Investigation of FP8 Across Accelerators for LLM Inference • Paper • 2502.01070 • Published Feb 3
Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • Article • Jan 16
Organizing a Privacy-preserving Hackathon • Article • By binoua and 1 other • Oct 17, 2024