|
--- |
|
base_model: meta-llama/Llama-3.2-3B-Instruct |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- gguf |
|
license: apache-2.0 |
|
language: |
|
- en |
|
--- |
|
|
|
<div align="center"> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/669777597cb32718c20d97e9/4emWK_PB-RrifIbrCUjE8.png" |
|
alt="Title card" |
|
style="width: 500px; |
|
height: auto; |
|
object-position: center top;"> |
|
</div> |
|
|
|
**Website -** [https://www.alphaai.biz](https://www.alphaai.biz) |
|
|
|
# TB-Vibe-3B |
|
|
|
### Overview |
|
**TB-Vibe-3B** is a fine-tuned variant of [meta-llama/Llama-3.2-3B-Instruct], specifically crafted to capture **TB's (Founder of Alpha AI)** communication style—direct, witty, and sometimes playfully sarcastic. |
|
|
|
Using **GRPO** and a **custom reward model**, this fine-tuning approach ensures that the AI not only answers questions but does so with TB's hallmark brevity, humor, and clarity. If you want a personal assistant that can be friendly and to the point, TB-Vibe-3B might just be your go-to. |
|
|
|
This model was trained **2x faster** using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library, enabling quicker iteration on style and tone alignment. |
|
|
|
### Why TB-Vibe-3B? |
|
This isn't your standard chatbot. TB-Vibe-3B blends **concise clarity** with a dash of **playful personality** - it's got that Founder's edge. Whether you're looking for quick answers or a supportive friend, it'll respond with a style that feels engaged and genuine. |
|
|
|
### Model Details |
|
- **Base Model:** meta-llama/Llama-3.2-3B-Instruct |
|
- **Fine-tuned By:** Alpha AI |
|
- **Training Framework:** Unsloth + Hugging Face’s TRL |
|
- **Format:** GGUF (optimized for local deployment) |
|
- **Quantization Levels:** |
|
- q4_k_m |
|
- q5_k_m |
|
- q8_0 |
|
- 16-bit (This, full precision) |
|
|
|
GGUF Versions – https://huggingface.co/alphaaico/TB-Vibe-3B-GGUF |
|
|
|
### Use Cases |
|
- **Personal Assistant:** For day-to-day tasks, scheduling, or casual conversation. |
|
- **Local Chatbot Deployments:** Runs efficiently on standard hardware for real-time chat. |
|
- **Personable Customer Support:** Empathetic, snappy responses that maintain a friendly tone. |
|
|
|
### Model Performance |
|
TB-Vibe-3B aims to: |
|
- Deliver **actionable answers** with minimal fluff. |
|
- Keep it **short, punchy, and witty**—perfect for quick interactions. |
|
- Reflect a **distinct personal vibe**, capturing TB's engaging style. |
|
|
|
### Limitations & Biases |
|
No model is perfect. TB-Vibe-3B inherits any biases present in its base data. It's not an exact human replica of TB—just an AI that channels the essence of TB's style. Use responsibly, especially in professional or sensitive contexts. |
|
|
|
### How You Can Do It Too |
|
Anyone can replicate this style-based tuning with **GRPO** and a tailored reward model. Fine-tune your own base LLM, define your style parameters (tonality, traits, etc.), and apply a reward mechanism that amplifies the characteristics you want. With the right data and some iterative training, you'll have your own style-specific AI in no time. |
|
|
|
### License |
|
Released under **Apache-2.0**. See the license file for full details and conditions. |
|
|
|
### Acknowledgments |
|
Thanks to the Unsloth team for their efficient LLaMA training pipeline and to Hugging Face's TRL library for making advanced fine-tuning approachable. |
|
|
|
TB-Vibe-3B: It's swift, direct, and a touch of witty. Give it a try, and see if it matches your vibe! |
|
|