neuralmagic/Qwen2-72B-Instruct-quantized.w4a16

Tags: Text Generation, text-generation-inference, Inference Endpoints, 4-bit precision

2 contributors

History: 7 commits

Latest commit: a6350d3 (verified), 7 months ago, by abhinavnmagic: Upload special_tokens_map.json with huggingface_hub

File                     Size           Last commit                                          Updated
.gitattributes           1.52 kB        initial commit                                       7 months ago
added_tokens.json        80 Bytes       Upload added_tokens.json with huggingface_hub        7 months ago
config.json              1.03 kB        Upload config.json with huggingface_hub              7 months ago
merges.txt               1.67 MB        Upload merges.txt with huggingface_hub               7 months ago
model.safetensors        41.5 GB (LFS)  Upload model.safetensors with huggingface_hub        7 months ago
quantize_config.json     269 Bytes      Upload quantize_config.json with huggingface_hub     7 months ago
special_tokens_map.json  367 Bytes      Upload special_tokens_map.json with huggingface_hub  7 months ago