cataluna84 committed
Commit 913d1e8
1 Parent(s): 1a757c0

Update README.md

Files changed (1)
  1. README.md +24 -8
README.md CHANGED
@@ -8,21 +8,37 @@ pinned: false
  ---

  Multilingual language models are typically large, requiring significant computational resources.

- Can we create multilingual models that maintain performance comparable to their larger counterparts while reducing size and latency and improving inference speed?

  # Techniques:
  - Pruning
-   - SparseGPT | [GitHub](https://github.com/VishnuVardhanSaiLanka/sparsegpt/tree/aya)
-   - ShortGPT | [KLD-Based Pruning & Perplexity Sensitivities](https://github.com/rsk2327/DistAya/tree/main)
- - Knowledge Distillation
-   - DistillKit | [GitHub](https://github.com/ShayekhBinIslam/DistillKit)
-   - Distil-Whisper-based method
-   - On-policy distillation of language models
  - Minitron: Compact Language Models via Pruning & Knowledge Distillation
  - DistiLLM: Towards Streamlined Distillation for Large Language Models
  - Quantization
-   - KV Cache Compression
  - Fine-Tuning | [GitHub](https://github.com/rsk2327/DistAya/tree/track/fine-tuning)

  # Datasets:
 
  ---

  Multilingual language models are typically large, requiring significant computational resources.
+ ![Deployment Challenges](DeploymentChallenges.png)

+ Can we create multilingual models that maintain performance comparable to their larger counterparts while reducing size and latency and improving inference speed when running in production with large batch sizes?
+ ![Memory variation over time (latency)](MemoryVariations(Latency).png)

  # Techniques:
+
  - Pruning
+   - Unstructured Pruning
+   - Structured Pruning
+   - Semi-Structured Pruning
+
+   - Methods Used
+     - SparseGPT | [GitHub](https://github.com/VishnuVardhanSaiLanka/sparsegpt/tree/aya)
+     - ShortGPT | [KLD-Based Pruning & Perplexity Sensitivities](https://github.com/rsk2327/DistAya/tree/main)
+
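The three pruning granularities listed above differ only in which weights may be zeroed: any individual weight, whole rows/neurons, or N:M groups. A minimal PyTorch sketch on a toy layer with an illustrative 50% sparsity target (not the SparseGPT or ShortGPT implementations linked above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(256, 256, bias=False)  # toy layer standing in for one transformer weight matrix
W = linear.weight.data

# Unstructured pruning: zero the individually smallest-magnitude weights (~50% sparsity).
threshold = W.abs().flatten().kthvalue(W.numel() // 2).values
unstructured_mask = (W.abs() > threshold).float()

# Structured pruning: drop whole output rows (neurons) with the lowest L2 norm.
row_norms = W.norm(dim=1)
structured_mask = (row_norms >= row_norms.median()).unsqueeze(1).expand_as(W).float()

# Semi-structured (2:4) pruning: in every group of four weights, keep the two largest.
groups = W.abs().view(-1, 4)
keep = groups.topk(2, dim=1).indices
semi_structured_mask = torch.zeros_like(groups).scatter_(1, keep, 1.0).view_as(W)

# Apply whichever mask the experiment calls for.
linear.weight.data = W * unstructured_mask
```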
+ - Knowledge Distillation
+   - Hidden State-Based Distillation ~ [DistillKit](https://arcee-ai-distillkit.my.canva.site/) | [GitHub](https://github.com/ShayekhBinIslam/DistillKit)
+   - Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
+   - On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
  - Minitron: Compact Language Models via Pruning & Knowledge Distillation
  - DistiLLM: Towards Streamlined Distillation for Large Language Models
+
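The distillation entries above differ mainly in what the student is trained to match (teacher hidden states, large-scale pseudo-labels, or self-generated on-policy samples), but they all build on soft-target training. A minimal logit-level sketch with random tensors standing in for teacher and student outputs; the shared vocabulary and the `temperature`/`alpha` values are illustrative assumptions, not settings from the works above:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend the soft-target KL term (teacher -> student) with ordinary cross-entropy."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)), labels.view(-1))
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: random tensors in place of real model outputs.
batch, seq, vocab = 2, 8, 32000
student_logits = torch.randn(batch, seq, vocab, requires_grad=True)
teacher_logits = torch.randn(batch, seq, vocab)
labels = torch.randint(0, vocab, (batch, seq))
distillation_loss(student_logits, teacher_logits, labels).backward()
```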
  - Quantization
+   - Quantization-Aware Training (QAT)
+   - Post-Training Quantization (PTQ)
+   - KV Cache Quantization
+   - Weight & Activation Quantization
+
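To make the PTQ entry above concrete, per-channel symmetric int8 weight quantization can be sketched in a few lines; the tensor sizes are arbitrary, and real experiments would go through a dedicated quantization library rather than this hand-rolled version:

```python
import torch

def quantize_per_channel_int8(weight: torch.Tensor):
    """Symmetric post-training quantization of a 2-D weight matrix, one scale per output row."""
    scale = (weight.abs().amax(dim=1, keepdim=True) / 127.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

# Toy example: quantize a random weight matrix and check the reconstruction error.
w = torch.randn(512, 512)
q, scale = quantize_per_channel_int8(w)
print("mean abs error:", (w - dequantize(q, scale)).abs().mean().item())
```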
+ - Low-Rank Factorization
+
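Low-rank factorization replaces one dense weight matrix with the product of two thin ones. A truncated-SVD sketch; the rank, layer sizes, and the helper name `low_rank_factorize` are illustrative choices, not part of any tracked implementation:

```python
import torch
import torch.nn as nn

def low_rank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate a Linear layer with two smaller ones via truncated SVD."""
    W = linear.weight.data                                   # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = S[:rank].sqrt().unsqueeze(1) * Vh[:rank, :]          # (rank, in_features)
    B = U[:, :rank] * S[:rank].sqrt().unsqueeze(0)           # (out_features, rank)
    first = nn.Linear(linear.in_features, rank, bias=False)
    second = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    first.weight.data, second.weight.data = A, B
    if linear.bias is not None:
        second.bias.data = linear.bias.data.clone()
    return nn.Sequential(first, second)

# Toy example: a 1024x1024 layer (~1.05M weights) shrinks to ~131k weights at rank 64.
layer = nn.Linear(1024, 1024)
compressed = low_rank_factorize(layer, rank=64)
x = torch.randn(4, 1024)
print("mean abs output error:", (layer(x) - compressed(x)).abs().mean().item())
```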
  - Fine-Tuning | [GitHub](https://github.com/rsk2327/DistAya/tree/track/fine-tuning)
 
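For the fine-tuning track, one common recipe after compression is parameter-efficient fine-tuning: the compressed weights stay frozen and only small adapters are trained. A LoRA-style sketch in plain PyTorch; the `LoRALinear` module and its hyperparameters are hypothetical and not the code in the linked repository:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base Linear plus a trainable low-rank update: y = base(x) + scale * B(A(x))."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # keep the compressed/base weights frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.normal_(self.lora_a.weight, std=0.01)
        nn.init.zeros_(self.lora_b.weight)           # start as a no-op update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Toy usage: only the adapter parameters receive gradients.
layer = LoRALinear(nn.Linear(512, 512), rank=8)
layer(torch.randn(2, 512)).sum().backward()
print("trainable params:", sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 8192
```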
  # Datasets: