Update README.md
README.md CHANGED
@@ -695,17 +695,17 @@ I did this using the simple *nix command `split`.
To join the files on any *nix system, run:

```
-cat gptq_model-4bit--1g.split
```

To join the files on Windows, open a Command Prompt and run:

```
-COPY /B gptq_model-4bit--1g.
```

The SHA256SUM of the joined file will be:

-Once you have the joined file, you can safely delete `gptq_model-4bit--1g.split

## Repositories available
@@ -714,11 +714,11 @@ Once you have the joined file, you can safely delete `gptq_model-4bit--1g.split*
## Two files provided - separate branches

-- Main branch:
  - Group Size = None
  - Desc Act (act-order) = True
  - This version will use the least possible VRAM, and should have higher inference performance in CUDA mode.
-- Branch `group_size_128g`:
  - Group Size = 128g
  - Desc Act (act-order) = True
  - This version will use more VRAM, which shouldn't be a problem as it shouldn't exceed 2 x 80GB or 3 x 48GB cards.
After the change, the section reads:

To join the files on any *nix system, run:

```
+cat gptq_model-4bit--1g.JOINBEFOREUSE.split-*.safetensors > gptq_model-4bit--1g.safetensors
```
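The hunk context above notes that the pieces were made with the plain *nix `split` command. For reference, a sketch of how pieces named this way could be produced; the 40G chunk size and the GNU coreutils options are illustrative assumptions, not details stated in this README:

```
# Illustrative only: cut the joined model into single-letter-suffixed pieces
# (split-a, split-b, ...) each ending in .safetensors, using GNU coreutils split.
# The 40G piece size is an assumption; the repo does not state the size used.
split -b 40G -a 1 --additional-suffix=.safetensors \
  gptq_model-4bit--1g.safetensors \
  gptq_model-4bit--1g.JOINBEFOREUSE.split-
```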
To join the files on Windows, open a Command Prompt and run:

```
+COPY /B gptq_model-4bit--1g.JOINBEFOREUSE.split-a.safetensors + gptq_model-4bit--1g.JOINBEFOREUSE.split-b.safetensors + gptq_model-4bit--1g.JOINBEFOREUSE.split-c.safetensors gptq_model-4bit--1g.safetensors
```
The SHA256SUM of the joined file will be:

+Once you have the joined file, you can safely delete `gptq_model-4bit--1g.JOINBEFOREUSE.split-*.safetensors`.
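Before deleting the pieces, you can verify the join by hashing the result yourself and comparing it against the SHA256SUM given in the README; on any *nix system:

```
# Compute the SHA256 of the joined file and compare it to the value listed in the README
sha256sum gptq_model-4bit--1g.safetensors
```

On Windows, `certutil -hashfile gptq_model-4bit--1g.safetensors SHA256` computes the same digest.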
## Repositories available
## Two files provided - separate branches

+- Main branch: `gptq_model-4bit--1g.safetensors`
  - Group Size = None
  - Desc Act (act-order) = True
  - This version will use the least possible VRAM, and should have higher inference performance in CUDA mode.
+- Branch `group_size_128g`: `gptq_model-4bit-128g.safetensors`
  - Group Size = 128g
  - Desc Act (act-order) = True
  - This version will use more VRAM, which shouldn't be a problem as it shouldn't exceed 2 x 80GB or 3 x 48GB cards.
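To download one of these branches, the standard Hugging Face git workflow applies; a minimal sketch, with `<user>/<repo>` as a placeholder since this excerpt does not show the repository id:

```
# Placeholder repo id: substitute the actual Hugging Face repository
git lfs install
# Fetch only the group_size_128g branch (use "main" for the default branch)
git clone --single-branch --branch group_size_128g https://huggingface.co/<user>/<repo>
```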