Building on HF
AI & ML interests
Finding ways to optimize LLM inference performance in resource-constrained environments (e.g., commodity hardware: desktops, laptops, mobiles, and edge devices)
Recent Activity
- Cohere on Hugging Face Inference Providers 🔥
- Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach