---
title: README
emoji: πŸ“Š
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

Multilingual language models are typically large, requiring significant computational resources.

Can we create multilingual models that match the performance of their larger counterparts while reducing model size, latency, and inference cost?

Techniques:
- Pruning
  - SparseGPT | [GitHub](https://github.com/VishnuVardhanSaiLanka/sparsegpt/tree/aya)
  - ShortGPT | [Perplexity Sensitivities](https://github.com/rsk2327/DistAya/tree/main)
- Knowledge Distillation (see the sketch after this list)
  - DistillKit | [GitHub](https://github.com/ShayekhBinIslam/DistillKit)
  - Distil-Whisper based method
  - On policy distillation of language models
  - Minitron: Compact Language models via Pruning & Knowledge Distillation
  - DistiLLM: Towards Streamlined Distillation for Large Language Models
- Quantization
- Fine-Tuning | [GitHub](https://github.com/rsk2327/DistAya/tree/track/fine-tuning)
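
As a rough illustration of the knowledge-distillation direction, the sketch below blends a temperature-scaled soft-target loss from a frozen teacher with the usual cross-entropy on labels. This is illustrative only, not DistillKit's actual API; the temperature and mixing weight are placeholder assumptions.

```python
# Minimal knowledge-distillation loss sketch (illustrative; not DistillKit's API).
# Assumes a frozen teacher and a smaller trainable student sharing a tokenizer.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target distillation loss with hard-label cross-entropy."""
    # Soft targets: KL(teacher || student) on temperature-scaled distributions,
    # rescaled by T^2 so gradients keep a comparable magnitude.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard next-token cross-entropy on the labels.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    return alpha * kd + (1.0 - alpha) * ce
```

In a full training loop, the teacher's logits would be computed under `torch.no_grad()` and only the student's parameters would be updated.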

Dataset:
Seven initial datasets were unified into a single collection of 6.62M rows (see the sketch after this list), comprising the following:
- Bangla_Alpaca_Orca: Bangla
- Urdu_Instruct_News_Article_Generation: Urdu
- Urdu_Instruct_News_Headline_Generation: Urdu
- Urdu_Instruct_News_Category_Classification: Urdu
- cidar: Arabic
- Six_Millions_Instruction_Dataset_For_Arabic_Llm_Ft: Arabic
- instructv3: English
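
A minimal sketch of how such a unification could be done with the Hugging Face `datasets` library; the hub IDs and column names below are placeholders, not the exact sources or schema used in this project.

```python
# Illustrative unification of several instruction datasets into one corpus.
# Hub IDs and column names are placeholders, not the actual sources.
from datasets import load_dataset, concatenate_datasets

SOURCES = [
    ("user/bangla_alpaca_orca", "Bangla"),   # placeholder hub IDs
    ("user/urdu_instruct_news", "Urdu"),
    ("user/cidar", "Arabic"),
    ("user/instructv3", "English"),
]

def normalize(example, language):
    # Map every source onto a shared (instruction, response, language) schema.
    return {
        "instruction": example.get("instruction", ""),
        "response": example.get("output", example.get("response", "")),
        "language": language,
    }

parts = []
for hub_id, language in SOURCES:
    ds = load_dataset(hub_id, split="train")
    ds = ds.map(lambda ex, lang=language: normalize(ex, lang),
                remove_columns=ds.column_names)
    parts.append(ds)

unified = concatenate_datasets(parts)
print(unified)  # one Dataset with a shared schema across languages
```

Mapping each source onto a common schema before merging is what lets `concatenate_datasets` produce a single multilingual instruction corpus.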