Commit 982f7f2 by mayank-mishra
1 parent: 669c01f

add mmodel
README.md
CHANGED
@@ -1,3 +1,21 @@
 ---
 license: bigcode-openrail-m
 ---
+
+# GPTQ-for-StarCoder
+Visit [GPTQ-for-SantaCoder](https://github.com/mayank31398/GPTQ-for-SantaCoder) for instructions on how to use the model weights here.
+If you want 8-bit weights, visit [starcoder-GPTQ-8bit-128g](https://huggingface.co/mayank31398/starcoder-GPTQ-8bit-128g).
+
+## Results
+| StarCoder | Bits | group-size | memory(MiB) | wikitext2 | ptb | c4 | stack | checkpoint size(MB) |
+| -------------------------------------------------- | ---- | ---------- | ----------- | --------- | ---------- | ---------- | ---------- | ------------------- |
+| FP32 | 32 | - | | 10.801 | 16.425 | 13.402 | 1.738 | 59195 |
+| BF16 | 16 | - | | 10.807 | 16.424 | 13.408 | 1.739 | 29597 |
+| [GPTQ](https://arxiv.org/abs/2210.17323) | 8 | 128 | | 10.805 | 15.453 | 13.408 | 1.739 | 16163 |
+| [GPTQ](https://arxiv.org/abs/2210.17323) | 4 | 128 | | 10.989 | 16.839 | 13.676 | 1.757 | 8877 |
+
+# License
+The model is licensed under the CodeML Open RAIL-M v0.1 license. You can find the full license [here](https://huggingface.co/spaces/bigcode/license).
+
+# Acknowledgements
+Thanks to everyone in BigCode who worked so hard to create these code models.
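Reviewer note: the Results table added to README.md can be read as a size versus quality trade-off. The sketch below copies the wikitext2 perplexity and checkpoint size columns from that table and derives two summary figures per row. The dictionary keys (`GPTQ-8bit`, `GPTQ-4bit`) and both helper function names are labels chosen here for illustration; they do not appear in the repository.

```python
# Values copied from the README Results table (wikitext2 perplexity, checkpoint size).
# Row labels and helper names are illustrative assumptions, not part of the repo.
RESULTS = {
    "FP32": {"bits": 32, "wikitext2": 10.801, "size": 59195},
    "BF16": {"bits": 16, "wikitext2": 10.807, "size": 29597},
    "GPTQ-8bit": {"bits": 8, "wikitext2": 10.805, "size": 16163},
    "GPTQ-4bit": {"bits": 4, "wikitext2": 10.989, "size": 8877},
}


def compression_ratio(baseline: str, variant: str) -> float:
    """How many times smaller the variant checkpoint is than the baseline."""
    return RESULTS[baseline]["size"] / RESULTS[variant]["size"]


def relative_ppl_increase(baseline: str, variant: str, dataset: str = "wikitext2") -> float:
    """Fractional perplexity change of the variant relative to the baseline."""
    base = RESULTS[baseline][dataset]
    return (RESULTS[variant][dataset] - base) / base


if __name__ == "__main__":
    for name in RESULTS:
        ratio = compression_ratio("FP32", name)
        delta = relative_ppl_increase("FP32", name)
        print(f"{name:10s} {ratio:5.2f}x smaller, wikitext2 perplexity {delta:+.2%}")
```

On these numbers the 4-bit checkpoint is roughly 6.7 times smaller than FP32 while wikitext2 perplexity rises by under 2%.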
model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e131b8262263e2b28f603508c9ba6a2bff621e5205981a19d02cc7e50c3450f0
+size 9308143245
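Reviewer note: `model.pt` is committed as a Git LFS pointer, so the diff shows three `key value` lines rather than the weights themselves. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of Git LFS or of this repository.

```python
# Minimal sketch: parse the three "key value" lines of a Git LFS pointer file.
# parse_lfs_pointer is a hypothetical helper, not an API of git-lfs.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e131b8262263e2b28f603508c9ba6a2bff621e5205981a19d02cc7e50c3450f0
size 9308143245
"""


def parse_lfs_pointer(text: str) -> dict:
    """Return the spec version, hash algorithm, digest, and size in bytes."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / 2**20  # bytes to MiB

if __name__ == "__main__":
    print(info["hash_algo"], info["digest"][:12], f"{size_mib:.1f} MiB")
```

The pointer's size comes to about 8877 MiB, which matches the 4-bit row of the Results table. That suggests, though the README does not say so, that the "checkpoint size(MB)" column is measured in MiB.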