Update README.md
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ This is a research-purpose pretrained model described in paper "[Layer-Condensed
 
 ## About
 
-Layer-Condensed KV Cache (LCKV) is a variant of transformer decoders in which queries of all layers are paired with keys and values of just the top layer. It reduces the memory and computation cost, reduces the number of parameters, significantly improves the inference throughput with comparable or better task performance. See more details
+Layer-Condensed KV Cache (LCKV) is a variant of transformer decoders in which queries of all layers are paired with keys and values of just the top layer. It reduces the memory and computation cost, reduces the number of parameters, significantly improves the inference throughput with comparable or better task performance. See more details in our github repo: https://github.com/whyNLP/LCKV
 
 ## Quick Start
 
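The changed line summarizes the architecture in one sentence: every layer's queries attend to keys and values computed only from the top layer, so the KV cache stores a single layer's worth of keys/values instead of one per layer. The sketch below is a rough single-head illustration of that pairing during incremental decoding, not the implementation in the linked repo; the class and method names (`LCKVDecoderSketch`, `decode_step`) and the single `k_proj`/`v_proj` pair are illustrative assumptions, and layer norm, multi-head attention, and feed-forward blocks are omitted.

```python
# Illustrative sketch of the Layer-Condensed KV Cache idea; see
# https://github.com/whyNLP/LCKV for the actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LCKVDecoderSketch(nn.Module):
    """Toy decoder where every layer's queries attend to K/V projected
    from the TOP layer's hidden states of past tokens."""

    def __init__(self, num_layers: int, d_model: int):
        super().__init__()
        self.q_projs = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_layers))
        self.out_projs = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_layers))
        # One K/V projection pair, applied only to top-layer states: the
        # cache holds a single layer of keys/values instead of num_layers.
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    @torch.no_grad()
    def decode_step(self, x, k_cache, v_cache):
        # x: (batch, 1, d_model) embedding of the newest token.
        # k_cache, v_cache: (batch, past_len, d_model), built from top-layer
        # states of previous tokens; assumed non-empty (e.g. seeded from the
        # prompt), sidestepping the first-token bootstrap the real method handles.
        h = x
        for q_proj, out_proj in zip(self.q_projs, self.out_projs):
            q = q_proj(h)
            # Every layer reads the SAME top-layer K/V cache. The new token
            # attends only to past tokens: its own top-layer K/V do not exist yet.
            attn = F.scaled_dot_product_attention(q, k_cache, v_cache)
            h = h + out_proj(attn)
        # Only the top layer's output is projected into the shared cache.
        k_cache = torch.cat([k_cache, self.k_proj(h)], dim=1)
        v_cache = torch.cat([v_cache, self.v_proj(h)], dim=1)
        return h, k_cache, v_cache
```

Note the cyclic dependency this sketch sidesteps: a token's keys and values exist only after the top layer has processed it, which is why decoding here attends only to past tokens and why training requires the iterative scheme described in the paper.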