---
license: apache-2.0
base_model: nidum/Nidum-Llama-3.2-3B-Uncensored
library_name: adapter-transformers
tags:
- chemistry
- biology
- legal
- code
- medical
- finance
- mlx
pipeline_tag: text-generation
---

### Nidum-Llama-3.2-3B-Uncensored-MLX-4bit

### Welcome to Nidum!

At Nidum, we are committed to delivering cutting-edge AI models with advanced capabilities and unrestricted access. With **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, we bring you a performance-optimized, space-efficient model designed for diverse use cases.

---

[![GitHub Icon](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Font_Awesome_5_brands_github.svg/232px-Font_Awesome_5_brands_github.svg.png)](https://github.com/NidumAI-Inc)

**Explore Nidum's Open-Source Projects on GitHub**: [https://github.com/NidumAI-Inc](https://github.com/NidumAI-Inc)

---

### Key Features

1. **Compact and Efficient**: Built in the **MLX 4-bit format** for optimized performance with minimal memory usage.
2. **Versatility**: Handles a wide range of tasks, including technical problem-solving, educational queries, and casual conversation.
3. **Extended Context Handling**: Maintains coherence in long-context interactions.
4. **Seamless Integration**: Compatible with the **mlx-lm** library for a streamlined development experience.
5. **Uncensored Access**: Provides uninhibited responses across a variety of topics and applications.
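
The memory saving behind the first point can be sketched with simple arithmetic. The estimate below is a back-of-the-envelope illustration only (it assumes roughly 3.2B parameters and ignores quantization metadata such as per-group scales, which add a small overhead):

```python
# Rough weight-memory estimate for a ~3.2B-parameter model.
# Assumption (not from the model card): weights dominate memory;
# quantization metadata is ignored.
PARAMS = 3.2e9

fp16_gib = PARAMS * 2 / 1024**3    # 16 bits = 2 bytes per weight
int4_gib = PARAMS * 0.5 / 1024**3  # 4 bits = 0.5 bytes per weight

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")   # ~6.0 GiB
print(f"4-bit weights: ~{int4_gib:.1f} GiB")  # ~1.5 GiB
```

On machines with limited unified memory, this roughly 4x reduction in weight storage is what makes a local 3B model practical.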

---

### How to Use

To use **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, install the **mlx-lm** library and follow the example below.

#### Installation

```bash
pip install mlx-lm
```

#### Usage

```python
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate and print the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```

---

### About the Model

**nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit** was converted to MLX format from **nidum/Nidum-Llama-3.2-3B-Uncensored** using **mlx-lm version 0.19.2**, which provides:

- **Smaller Memory Footprint**: Ideal for environments with limited hardware resources.
- **High Performance**: Retains the capabilities of the original model while optimizing inference speed and efficiency.
- **Plug-and-Play Compatibility**: Integrates with the **mlx-lm** ecosystem for straightforward deployment.
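
For reference, a conversion of this kind is typically performed with mlx-lm's `mlx_lm.convert` command-line tool. The invocation below is a sketch, not the exact command used for this model; the output path is illustrative, and flag names may vary between mlx-lm versions:

```shell
# Convert the base Hugging Face model to a 4-bit MLX checkpoint.
# Note: this downloads the full-precision weights first.
mlx_lm.convert \
    --hf-path nidum/Nidum-Llama-3.2-3B-Uncensored \
    --mlx-path ./Nidum-Llama-3.2-3B-Uncensored-MLX-4bit \
    -q --q-bits 4
```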

---

### Use Cases

- **Technical Problem Solving**
- **Research and Educational Assistance**
- **Open-Ended Q&A**
- **Creative Writing and Ideation**
- **Long-Context Dialogues**
- **Unrestricted Knowledge Exploration**

---

### Datasets and Fine-Tuning

The model inherits the fine-tuned capabilities of its predecessor, **Nidum-Llama-3.2-3B-Uncensored**, including:

- **Uncensored Data**: Ensures detailed and uninhibited responses.
- **RAG-Based Fine-Tuning**: Optimizes retrieval-augmented generation for information-intensive tasks.
- **Math-Instruct Data**: Tailored for precise mathematical reasoning.
- **Long-Context Fine-Tuning**: Enhances coherence and relevance in extended interactions.

---

### Quantized Model Download

The **MLX 4-bit** version is highly efficient, striking a balance between precision and memory usage.

---

### Benchmark

| **Benchmark** | **Metric** | **LLaMA 3B** | **Nidum 3B** | **Observation** |
|---------------|------------|--------------|--------------|-----------------|
| **GPQA** | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B shows a clear improvement, particularly in **generative tasks**. |
| | Accuracy | 0.4 | 0.5 | Consistent improvement, especially in **zero-shot** scenarios. |
| **HellaSwag** | Accuracy | 0.3 | 0.4 | Better performance on **common-sense reasoning** tasks. |
| | Normalized Accuracy | 0.3 | 0.4 | Improved ability to understand and predict context in sentence completion. |
| | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Standard errors are comparable across the two models. |
| | Accuracy (Stderr) | 0.15275 | 0.1633 | Standard errors are comparable across the two models. |

---

### Insights

1. **Compact Efficiency**: The MLX 4-bit model delivers high performance with reduced resource usage.
2. **Enhanced Usability**: Optimized for seamless integration in lightweight deployment scenarios.

---

### Contributing

We invite contributions to further enhance the **MLX-4bit** model's capabilities. Reach out to us for collaboration opportunities.

---

### Contact

For inquiries, support, or feedback, email us at **[email protected]**.

---

### Explore the Future

Embrace the power of innovation with **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**: the ideal blend of performance and efficiency.

---