Update README.md
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ This is a research-purpose pretrained model described in paper "[Layer-Condensed
 
 ## About
 
-Layer-Condensed KV Cache (LCKV) is a variant of transformer decoders in which queries of all layers are paired with keys and values of just the top layer. It reduces the memory and computation cost, reduces the number of parameters, significantly improves the inference throughput with comparable or better task performance. See more details
+Layer-Condensed KV Cache (LCKV) is a variant of transformer decoders in which queries of all layers are paired with keys and values of just the top layer. It reduces the memory and computation cost, reduces the number of parameters, significantly improves the inference throughput with comparable or better task performance. See more details in our github repo: https://github.com/whyNLP/LCKV
 
 ## Quick Start
 
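The changed line summarizes the architecture in one sentence: every layer's queries attend to keys and values computed only from the top layer, so the KV cache stores a single layer's worth of keys/values instead of one per layer. The sketch below is a rough single-head illustration of that pairing during incremental decoding, not the implementation in the linked repo; the class and method names (`LCKVDecoderSketch`, `decode_step`) and the single `k_proj`/`v_proj` pair are illustrative assumptions, and layer norm, multi-head attention, and feed-forward blocks are omitted.

```python
# Illustrative sketch of the Layer-Condensed KV Cache idea; see
# https://github.com/whyNLP/LCKV for the actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LCKVDecoderSketch(nn.Module):
    """Toy decoder where every layer's queries attend to K/V projected
    from the TOP layer's hidden states of past tokens."""

    def __init__(self, num_layers: int, d_model: int):
        super().__init__()
        self.q_projs = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_layers))
        self.out_projs = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_layers))
        # One K/V projection pair, applied only to top-layer states: the
        # cache holds a single layer of keys/values instead of num_layers.
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    @torch.no_grad()
    def decode_step(self, x, k_cache, v_cache):
        # x: (batch, 1, d_model) embedding of the newest token.
        # k_cache, v_cache: (batch, past_len, d_model), built from top-layer
        # states of previous tokens; assumed non-empty (e.g. seeded from the
        # prompt), sidestepping the first-token bootstrap the real method handles.
        h = x
        for q_proj, out_proj in zip(self.q_projs, self.out_projs):
            q = q_proj(h)
            # Every layer reads the SAME top-layer K/V cache. The new token
            # attends only to past tokens: its own top-layer K/V do not exist yet.
            attn = F.scaled_dot_product_attention(q, k_cache, v_cache)
            h = h + out_proj(attn)
        # Only the top layer's output is projected into the shared cache.
        k_cache = torch.cat([k_cache, self.k_proj(h)], dim=1)
        v_cache = torch.cat([v_cache, self.v_proj(h)], dim=1)
        return h, k_cache, v_cache
```

Note the cyclic dependency this sketch sidesteps: a token's keys and values exist only after the top layer has processed it, which is why decoding here attends only to past tokens and why training requires the iterative scheme described in the paper.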