Compiled engines for running Whisper with TensorRT-LLM (TRT-LLM) for much faster inference.
models 673
baseten/Qwen3-1.7B-NVFP4-PTQ
1B • Updated • 13
baseten/Qwen3-4B-NVFP4-PTQ
2B • Updated • 13
baseten/Qwen-Image-2512-Pruned-50blocks
Text-to-Image • Updated • 87
baseten/embedding-smol_llama-101M-GQA
76.6M • Updated • 43
baseten/qwen3-engine-30A3-repro
Updated • 3
baseten/whisper_trt_large_v3_turbo_251013_NVIDIA_H100_80GB_HBM3_0_21_0
baseten/whisper_trt_large_v2_251013_NVIDIA_H100_80GB_HBM3_0_21_0
baseten/whisper_trt_large_v3_251013_NVIDIA_H100_80GB_HBM3_0_21_0
baseten/whisper_trt_large_v3_251013_NVIDIA_L4_0_21_0
baseten/whisper_trt_large_v3_turbo_251013_NVIDIA_L4_0_21_0