---
title: README
emoji: 📊
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

Multilingual language models are typically large and require significant computational resources. Can we build multilingual models that match the performance of their larger counterparts while reducing model size, latency, and inference cost?

Potential techniques:

- Pruning
  - SparseGPT
  - ShortGPT
- Knowledge Distillation
- Quantization
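
As a rough illustration of one of these techniques, the sketch below shows symmetric per-tensor int8 post-training quantization with NumPy. It is a minimal toy, not the project's actual pipeline: the function names (`quantize_int8`, `dequantize`) and the random weight matrix are illustrative assumptions, and real model quantization (e.g. per-channel scales, activation calibration) is more involved.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127].

    Assumes weights are not all zero (otherwise the scale degenerates to 0).
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)   # toy "weight matrix"
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-to-nearest error is bounded by half a quantization step (scale / 2).
max_err = float(np.abs(w - w_hat).max())
```

Storing `q` instead of `w` cuts the memory for this tensor by 4x (int8 vs float32), at the cost of the small reconstruction error bounded above.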