neuralmagic/Mistral-Nemo-Instruct-2407-quantized.w4a16 • Text Generation • Updated Oct 9, 2024 • 2.32k • 3
neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a8 • Text Generation • Updated Dec 3, 2024 • 393 • 2
neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a16 • Text Generation • Updated Oct 9, 2024 • 194 • 2
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16_channel-e2e • Text Generation • Updated about 22 hours ago • 279
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16_channel-e2e • Text Generation • Updated about 22 hours ago • 336
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8A16_channel-e2e • Text Generation • Updated about 23 hours ago • 55
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8A16_tensor-e2e • Text Generation • Updated about 23 hours ago • 20
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8_DYNAMIC-e2e • Text Generation • Updated about 23 hours ago • 58
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_tensor_weight_static_per_tensor_act-e2e • Text Generation • Updated about 23 hours ago • 380
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_channel_weight_static_per_tensor-e2e • Text Generation • Updated about 23 hours ago • 331