File size: 3,421 Bytes
b2baa74
a5fba22
b2baa74
a5fba22
 
b2baa74
a5fba22
 
 
 
 
b2baa74
 
a5fba22
 
 
 
 
 
 
b2baa74
a5fba22
b2baa74
a5fba22
b2baa74
a5fba22
 
b2baa74
a5fba22
b2baa74
a5fba22
b2baa74
a5fba22
 
b2baa74
a5fba22
 
 
 
 
 
 
 
 
 
b2baa74
a5fba22
b2baa74
a5fba22
 
 
 
b2baa74
a5fba22
 
 
 
 
b2baa74
a5fba22
 
b2baa74
a5fba22
 
b2baa74
a5fba22
 
b2baa74
a5fba22
 
b2baa74
a5fba22
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
---
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
license: apache-2.0
language:
- en
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/669777597cb32718c20d97e9/4emWK_PB-RrifIbrCUjE8.png"
     alt="Title card"
     style="width: 500px;
            height: auto;
            object-position: center top;">
</div>

**Website -** [https://www.alphaai.biz](https://www.alphaai.biz)

# TB-Vibe-3B

### Overview
**TB-Vibe-3B** is a fine-tuned variant of [meta-llama/Llama-3.2-3B-Instruct], specifically crafted to capture **TB's (Founder of Alpha AI)** communication style—direct, witty, and sometimes playfully sarcastic. 

Using **GRPO** and a **custom reward model**, this fine-tuning approach ensures that the AI not only answers questions but does so with TB's hallmark brevity, humor, and clarity. If you want a personal assistant that can be friendly and to the point, TB-Vibe-3B might just be your go-to.

This model was trained **2x faster** using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library, enabling quicker iteration on style and tone alignment.

### Why TB-Vibe-3B?
This isn't your standard chatbot. TB-Vibe-3B blends **concise clarity** with a dash of **playful personality** - it's got that Founder's edge. Whether you're looking for quick answers or a supportive friend, it'll respond with a style that feels engaged and genuine.

### Model Details
- **Base Model:** meta-llama/Llama-3.2-3B-Instruct  
- **Fine-tuned By:** Alpha AI  
- **Training Framework:** Unsloth + Hugging Face’s TRL  
- **Format:** GGUF (optimized for local deployment)  
- **Quantization Levels:**  
  - q4_k_m  
  - q5_k_m  
  - q8_0  
  - 16-bit (This, full precision)

GGUF Versions – https://huggingface.co/alphaaico/TB-Vibe-3B-GGUF

### Use Cases
- **Personal Assistant:** For day-to-day tasks, scheduling, or casual conversation.  
- **Local Chatbot Deployments:** Runs efficiently on standard hardware for real-time chat.  
- **Personable Customer Support:** Empathetic, snappy responses that maintain a friendly tone.  

### Model Performance
TB-Vibe-3B aims to:
- Deliver **actionable answers** with minimal fluff.  
- Keep it **short, punchy, and witty**—perfect for quick interactions.  
- Reflect a **distinct personal vibe**, capturing TB's engaging style.  

### Limitations & Biases
No model is perfect. TB-Vibe-3B inherits any biases present in its base data. It's not an exact human replica of TB—just an AI that channels the essence of TB's style. Use responsibly, especially in professional or sensitive contexts.

### How You Can Do It Too
Anyone can replicate this style-based tuning with **GRPO** and a tailored reward model. Fine-tune your own base LLM, define your style parameters (tonality, traits, etc.), and apply a reward mechanism that amplifies the characteristics you want. With the right data and some iterative training, you'll have your own style-specific AI in no time.

### License
Released under **Apache-2.0**. See the license file for full details and conditions.

### Acknowledgments
Thanks to the Unsloth team for their efficient LLaMA training pipeline and to Hugging Face's TRL library for making advanced fine-tuning approachable. 

TB-Vibe-3B: It's swift, direct, and a touch of witty. Give it a try, and see if it matches your vibe!