Models: 699

Active filters: compressed-tensors

nm-testing/llama2.c-stories15M-gsm8k-pruned.2of4-BitMask

Updated Dec 19, 2024 • 2

horheynm/Phi-3-mini-4k-instruct-kv_cache

Updated Dec 20, 2024 • 3

noneUsername/phi-4-abliterated-W8A8

Updated Dec 21, 2024 • 5

noneUsername/Mistral-Small-Drummer-22B-W8A8

Updated Dec 21, 2024 • 2

andecy64/Nxcode-CQ-7B-orpo-W8A16

Updated Dec 21, 2024 • 63

andecy64/Impish_LLAMA_3B-W8A16

Updated Dec 21, 2024 • 311

noneUsername/Cydonia-22B-v1.3-W8A8

Updated Dec 22, 2024 • 3

noneUsername/L3-8B-Stheno-v3.2-W8A8

Updated Dec 22, 2024 • 5

anhbn/Phi-3.5-mini-instruct-quantized.w8a8

Updated Dec 22, 2024 • 9

BigHuggyD/TheDrummer_Anubis-70B-v1-FP8-Dynamic

Updated Dec 22, 2024 • 53 • 1

Infermatic/Anubis-70B-v1-FP8-Dynamic

Updated 12 days ago • 494

BigHuggyD/gghfez_Writer-Large-2411-v2.1-FP8-Dynamic

Updated Dec 24, 2024 • 30 • 1

mgoin/Llama-3.2-1B-Instruct-FP8-dynamic-ATTN

Updated Dec 23, 2024 • 5

mgoin/Llama-3.2-1B-Instruct-FP8-ATTN

Updated Dec 23, 2024 • 5

horheynm/Llama-3.2-1B-Instruct-FP8-input_activation_channel

Updated Dec 23, 2024 • 4

stan-hua/Mistral-7B-Instruct-v0.3-LC-RTN-W8A8

Updated Dec 25, 2024 • 4

stan-hua/Ministral-8B-Instruct-2410-LC-RTN-W8A8

Updated Dec 25, 2024 • 4

stan-hua/Mistral-Small-Instruct-2409-LC-RTN-W8A8

Updated Dec 25, 2024 • 2

stan-hua/Qwen2-7B-Instruct-LC-RTN-W8A8

Updated Dec 25, 2024 • 2

stan-hua/Mistral-7B-Instruct-v0.3-LC-SmoothQuant-RTN-W8A8

Updated Dec 25, 2024 • 6

stan-hua/Ministral-8B-Instruct-2410-LC-SmoothQuant-RTN-W8A8

Updated Dec 25, 2024 • 4

stan-hua/Mistral-Small-Instruct-2409-LC-SmoothQuant-RTN-W8A8

Updated Dec 25, 2024 • 2

stan-hua/Qwen2-7B-Instruct-LC-SmoothQuant-RTN-W8A8

Updated Dec 25, 2024 • 1

stan-hua/Mistral-7B-Instruct-v0.3-LC-RTN-W4A16

Updated Dec 25, 2024 • 7

stan-hua/Ministral-8B-Instruct-2410-LC-RTN-W4A16

Updated Dec 25, 2024 • 1

stan-hua/Mistral-Small-Instruct-2409-LC-RTN-W4A16

Updated Dec 25, 2024 • 3

stan-hua/Qwen2-7B-Instruct-LC-RTN-W4A16

Updated Dec 25, 2024 • 1

stan-hua/Qwen2-72B-Instruct-LC-RTN-W4A16

Updated Dec 25, 2024 • 1

nexa-collaboration/output_llama7b_2of4_w4a16_stage_sparsity

Updated Dec 25, 2024 • 6

nexa-collaboration/output_llama3.2_1b_2of4_stage_sparsity

Updated 18 days ago • 17