Syed-Hasan-8503 committed
Commit 280bf01
1 parent: 8d7551b

Create README.md

Files changed (1): README.md (+73, -0)

README.md ADDED:
---
license: apache-2.0
datasets:
- mlabonne/FineTome-100k
---
# Distilled Google Gemma-2-2b-it

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e09e72e43b9464c835735f/G0Q--v5zaiCKW96xm8Mhr.png)

## Model Description

This model is a distilled version of Google's Gemma-2-2b-it, created through knowledge distillation from the larger Gemma-2-9b-it model. The distillation was performed with arcee-ai's DistilKit, with the goal of preserving the capabilities of the larger model in a more compact form.

### Key Features

- **Base Model**: Google Gemma-2-2b-it
- **Teacher Model**: Google Gemma-2-9b-it
- **Distillation Tool**: arcee-ai DistilKit
- **Training Data**: Subset of the mlabonne/Tome dataset (30,000 rows)
- **Distillation Method**: Logit-based distillation

## Distillation Process

The distillation process transfers knowledge from the larger Gemma-2-9b-it model to the smaller Gemma-2-2b-it model. It was carried out with arcee-ai DistilKit, which provides two properties that matter here:

1. **Logit-based Distillation**: The student model (Gemma-2-2b-it) is trained to mimic the output distribution of the teacher model (Gemma-2-9b-it); see the sketch after this list.

2. **Architectural Consistency**: The teacher and student models share the same architecture and tokenizer, so their output logits line up token for token and can be compared directly.

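To make the logit-based objective concrete, here is a minimal sketch of the loss such a setup typically optimizes. This is plain PyTorch for illustration only, not DistilKit's actual implementation; the temperature, the loss weighting `alpha`, and the `ignore_index` convention are assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft KL term (mimic the teacher) with the usual hard
    cross-entropy term. Temperature and alpha are illustrative choices."""
    # Soften both distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between teacher and student, rescaled by T^2 as in
    # Hinton et al.'s classic distillation formulation.
    kd_loss = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (temperature ** 2)

    # Standard next-token cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )

    return alpha * kd_loss + (1.0 - alpha) * ce_loss
```

Because both models share the Gemma-2 vocabulary, the teacher's and student's logit tensors have the same last dimension, which is what makes this direct comparison possible.
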
## Dataset

The model was trained on a subset of the mlabonne/Tome dataset, utilizing 30,000 rows due to computational constraints. This dataset was chosen for its quality and relevance to the target tasks of the model.

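As a rough illustration of how a subset of that size can be selected, the snippet below uses split slicing in the Hugging Face `datasets` library. Note that the card metadata lists `mlabonne/FineTome-100k` while the text above mentions mlabonne/Tome; the sketch uses the dataset named in the metadata, and the exact selection and preprocessing used for training may differ.

```python
from datasets import load_dataset

# Take the first 30,000 rows via split slicing; a random sample would use
# .shuffle(seed=42).select(range(30_000)) instead.
subset = load_dataset("mlabonne/FineTome-100k", split="train[:30000]")

print(subset)     # row count and column names
print(subset[0])  # inspect a single conversation
```
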
## Model Limitations

While this distilled model retains much of the capability of its larger counterpart, users should be aware of potential limitations:

- Slightly reduced performance compared to the original Gemma-2-9b-it model
- Limited to the scope of tasks covered in the training data
- May not perform as well on highly specialized or domain-specific tasks

## Usage

Below are some code snippets to help you get started quickly with the model. First, install the Transformers library:

```sh
pip install -U transformers
```

Then, copy the snippet from the section that is relevant to your use case.

#### Running with the `pipeline` API

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Syed-Hasan-8503/Gemma-2-2b-it-distilled",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
# Ahoy, matey! I be Gemma, a digital scallywag, a language-slingin' parrot of the digital seas. I be here to help ye with yer wordy woes, answer yer questions, and spin ye yarns of the digital world. So, what be yer pleasure, eh? 🦜
```
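
#### Running directly with `AutoModelForCausalLM`

For lower-level control than the `pipeline` API offers, the model can also be loaded directly and prompted through its chat template. The sketch below follows the standard Transformers pattern for Gemma-2 chat models rather than anything specific to this repository, so treat the dtype, device placement, and `max_new_tokens` as adjustable assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Syed-Hasan-8503/Gemma-2-2b-it-distilled"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires `accelerate`; otherwise drop this and call .to("cuda")
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

# Build the Gemma chat prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```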