What's the difference between the "padded" and the original version?

by Nexesenex - opened Nov 26, 2024

Discussion

Nexesenex

Nov 26, 2024

cf. EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2

CalamitousFelicitousness

Owner Dec 5, 2024

The padded version has its weights padded with zeros so that it can use tensor parallelism for multi-GPU inference after GPTQ quantization. Without the padding, splitting the quantized weights across GPUs would result in an error. You can read more at the bottom of this page: https://qwen.readthedocs.io/en/latest/quantization/gptq.html
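To make the divisibility issue concrete, here is a rough sketch of the arithmetic behind the padding. The numbers are assumptions on my part and are not stated in this thread: an MLP intermediate size of 29568 for the 72B model, a GPTQ group size of 128, and 8-way tensor parallelism. The helper names are hypothetical, and this only illustrates the size calculation, not the actual padding script.

```python
# Sketch only: shows why the intermediate dimension may need padding
# before GPTQ weights can be split evenly across GPUs.
# All three constants below are assumed values, not taken from this thread.
INTERMEDIATE_SIZE = 29568  # assumed MLP intermediate size of the 72B model
GROUP_SIZE = 128           # assumed GPTQ quantization group size
TP_DEGREE = 8              # assumed number of GPUs for tensor parallelism


def splits_evenly(size: int, group_size: int, tp_degree: int) -> bool:
    """True if whole quantization groups divide evenly across tp_degree GPUs."""
    if size % group_size != 0:
        return False
    return (size // group_size) % tp_degree == 0


def padded_size(size: int, group_size: int, tp_degree: int) -> int:
    """Smallest value >= size that is a multiple of group_size * tp_degree."""
    unit = group_size * tp_degree
    return -(-size // unit) * unit  # integer ceiling division


if __name__ == "__main__":
    target = padded_size(INTERMEDIATE_SIZE, GROUP_SIZE, TP_DEGREE)
    print("original splits evenly:", splits_evenly(INTERMEDIATE_SIZE, GROUP_SIZE, TP_DEGREE))
    print("padded size:", target)
    print("zero channels added:", target - INTERMEDIATE_SIZE)
    print("padded splits evenly:", splits_evenly(target, GROUP_SIZE, TP_DEGREE))
```

Under those assumptions the original size gives 231 groups, which cannot be shared equally among 8 GPUs, while the padded size gives 232 groups. Because the extra channels are all zeros, they should contribute nothing to the MLP output, so the padded model is expected to behave the same as the original.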

CalamitousFelicitousness changed discussion status to closed Dec 17, 2024
