---
title: README
emoji: 📊
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

Multilingual language models are typically large and require significant computational resources. Can we build multilingual models that match the performance of their larger counterparts while reducing model size, latency, and inference cost?

Potential techniques:

- Pruning
  - SparseGPT
  - ShortGPT
- Knowledge Distillation
- Quantization
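
As a rough illustration of one of these techniques, the sketch below shows symmetric per-tensor int8 post-training quantization with NumPy. It is a minimal toy, not the project's actual pipeline: the function names (`quantize_int8`, `dequantize`) and the random weight matrix are illustrative assumptions, and real model quantization (e.g. per-channel scales, activation calibration) is more involved.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127].

    Assumes weights are not all zero (otherwise the scale degenerates to 0).
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)   # toy "weight matrix"
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-to-nearest error is bounded by half a quantization step (scale / 2).
max_err = float(np.abs(w - w_hat).max())
```

Storing `q` instead of `w` cuts the memory for this tensor by 4x (int8 vs float32), at the cost of the small reconstruction error bounded above.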