SCBench: A KV Cache-Centric Analysis of Long-Context Methods Paper • 2412.10319 • Published 12 days ago • 8
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 31
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29 • 40
Article MInference 1.0: 10x Faster Million Context Inference with a Single GPU By liyucheng • Jul 11 • 12
Article How to Optimize TTFT of 8B LLMs with 1M Tokens to 20s By iofu728 • Jul 21 • 2
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2 • 23
microsoft/llmlingua-2-xlm-roberta-large-meetingbank Token Classification • Updated Apr 3 • 34.8k • 17
microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank Token Classification • Updated Apr 3 • 30.6k • 24