cataluna84 committed · Commit 913d1e8 · 1 parent: 1a757c0 · Update README.md
README.md CHANGED
@@ -8,21 +8,37 @@ pinned: false
---

Multilingual language models are typically large, requiring significant computational resources.
![Deployment Challenges](DeploymentChallenges.png)

Can we create multilingual models that maintain performance comparable to their larger counterparts while reducing size, latency, and inference time when running in production with large batch sizes?
![Memory variations through time (latency)](MemoryVariations(Latency).png)

# Techniques:

- Pruning
  - Unstructured Pruning
  - Structured Pruning
  - Semi-Structured Pruning
  - Methods Used
    - SparseGPT | [GitHub](https://github.com/VishnuVardhanSaiLanka/sparsegpt/tree/aya)
    - ShortGPT | [KLD-Based Pruning & Perplexity Sensitivities](https://github.com/rsk2327/DistAya/tree/main)

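Below is a minimal sketch of the pruning variants listed above, assuming a PyTorch stack; the function names are illustrative and not taken from SparseGPT or ShortGPT. It zeroes weights by magnitude (unstructured) and in 2:4 groups (semi-structured). SparseGPT prunes with a Hessian-based reconstruction objective and ShortGPT removes whole layers ranked by importance, so the toy criteria here are only for intuition.

```python
# Illustrative sketch only: magnitude-based pruning on a single weight matrix.
import torch

def prune_unstructured(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

def prune_2_4(weight: torch.Tensor) -> torch.Tensor:
    """Semi-structured 2:4 sparsity: keep the 2 largest entries in every group of 4."""
    w = weight.reshape(-1, 4)                      # numel must be divisible by 4
    idx = w.abs().topk(2, dim=1).indices           # positions to keep per group
    mask = torch.zeros_like(w).scatter_(1, idx, 1.0)
    return (w * mask).reshape_as(weight)

w = torch.randn(512, 512)
print((prune_unstructured(w, 0.5) == 0).float().mean())  # roughly 0.5
print((prune_2_4(w) == 0).float().mean())                 # exactly 0.5
```
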
- Knowledge Distillation
  - Hidden State-Based Distillation ~ [DistillKit](https://arcee-ai-distillkit.my.canva.site/) | [GitHub](https://github.com/ShayekhBinIslam/DistillKit)
  - Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  - On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
  - Minitron: Compact Language Models via Pruning & Knowledge Distillation
  - DistiLLM: Towards Streamlined Distillation for Large Language Models

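As a reference point for the distillation methods listed above, here is a minimal sketch of the classic logit-matching distillation loss, assuming PyTorch; it is not taken from DistillKit or any linked repo. Hidden-state-based distillation instead aligns intermediate representations, and on-policy distillation scores the student's own generations with the teacher.

```python
# Illustrative sketch only: temperature-scaled KL distillation on logits.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) on temperature-softened distributions."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # batchmean + T^2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy usage: a batch of 4 token positions over a 32000-token vocabulary
student = torch.randn(4, 32000, requires_grad=True)
teacher = torch.randn(4, 32000)
loss = distillation_loss(student, teacher)
loss.backward()
print(loss.item())
```
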
- Quantization
  - Quantization-Aware Training (QAT)
  - Post-Training Quantization (PTQ)
  - KV Cache Quantization
  - Weight & Activation Quantization

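A minimal sketch of the simplest form of the post-training quantization listed above: symmetric per-channel int8 rounding of weights, assuming PyTorch. QAT simulates the same rounding during training so the model adapts to it; KV-cache and activation quantization apply the same idea to tensors produced at inference time. This is not a production pipeline (no calibration data, no outlier handling).

```python
# Illustrative sketch only: symmetric per-channel int8 weight quantization.
import torch

def quantize_int8(weight: torch.Tensor):
    """Return int8 weights plus per-output-channel scales (symmetric)."""
    scale = weight.abs().amax(dim=1, keepdim=True) / 127.0   # one scale per row
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(1024, 1024)
q, scale = quantize_int8(w)
error = (dequantize(q, scale) - w).abs().mean()
print(f"int8 storage, mean abs reconstruction error: {error.item():.5f}")
```
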
- Low-Rank Factorization

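A minimal sketch of low-rank factorization via truncated SVD, assuming PyTorch and not tied to any specific paper or repo above: a d_out x d_in weight matrix is replaced by two thin factors, cutting parameters and matmul cost when the rank is small. Factorized models are usually fine-tuned afterwards to recover accuracy.

```python
# Illustrative sketch only: rank-r approximation of a weight matrix via SVD.
import torch

def low_rank_factorize(weight: torch.Tensor, rank: int):
    """Best rank-`rank` approximation of `weight`, returned as two factors A @ B."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank]        # (d_out, r), singular values folded in
    B = Vh[:rank, :]                  # (r, d_in)
    return A, B

w = torch.randn(1024, 1024)
A, B = low_rank_factorize(w, rank=64)
rel_err = torch.linalg.norm(w - A @ B) / torch.linalg.norm(w)
params_before, params_after = w.numel(), A.numel() + B.numel()
print(f"params: {params_before} -> {params_after}, relative error: {rel_err.item():.3f}")
```
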
- Fine-Tuning | [GitHub](https://github.com/rsk2327/DistAya/tree/track/fine-tuning)

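The linked fine-tuning track holds the actual code; the sketch below is only a generic, assumed setup showing the recovery fine-tuning that typically follows compression, written as a plain next-token-prediction loop over a stand-in model so it stays self-contained.

```python
# Illustrative sketch only: supervised next-token fine-tuning loop on a toy model.
import torch
import torch.nn as nn

vocab, dim = 1000, 64
model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))  # stand-in for a compressed LM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    tokens = torch.randint(0, vocab, (8, 16))      # dummy batch: 8 sequences of 16 tokens
    logits = model(tokens[:, :-1])                 # predict each next token
    loss = loss_fn(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(step, loss.item())
```
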
# Datasets: