pipeline_tag: text-generation
---

### Nidum-Llama-3.2-3B-Uncensored-MLX-4bit

### Welcome to Nidum!

At Nidum, we are committed to delivering cutting-edge AI models that offer advanced capabilities and unrestricted access to innovation. With **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, we bring you a performance-optimized, space-efficient, and feature-rich model designed for diverse use cases.

---

[![GitHub Icon](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Font_Awesome_5_brands_github.svg/232px-Font_Awesome_5_brands_github.svg.png)](https://github.com/NidumAI-Inc)

**Explore Nidum's Open-Source Projects on GitHub**: [https://github.com/NidumAI-Inc](https://github.com/NidumAI-Inc)

---

### Key Features

1. **Compact and Efficient**: Built in the **MLX-4bit format** for optimized performance with minimal memory usage.
2. **Versatility**: Excels in a wide range of tasks, including technical problem-solving, educational queries, and casual conversations.
3. **Extended Context Handling**: Maintains coherence in long-context interactions.
4. **Seamless Integration**: Enhanced compatibility with the **mlx-lm library** for a streamlined development experience.
5. **Uncensored Access**: Provides uninhibited responses across a variety of topics and applications.
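
To make the memory claim concrete, here is a back-of-envelope estimate. The parameter count (~3.2B for a Llama 3.2 3B model) and the grouped-quantization layout (4-bit weights with an fp16 scale and bias per group of 64 weights, a common mlx-lm default) are assumptions for illustration, not figures stated on this card:

```python
# Back-of-envelope memory estimate for a ~3B-parameter model.
# Assumptions (not from this card): ~3.2e9 weights; fp16 baseline;
# 4-bit quantization with an fp16 scale and bias per group of 64 weights.
params = 3.2e9

fp16_gb = params * 2 / 1e9                # 2 bytes per fp16 weight
bits_per_weight = 4 + (16 + 16) / 64      # payload bits + per-group overhead
q4_gb = params * bits_per_weight / 8 / 1e9

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")   # ~6.4 GB
print(f"4-bit weights: ~{q4_gb:.1f} GB")     # ~1.8 GB
```

Under these assumptions the 4-bit weights are roughly 3.5x smaller than fp16, which is what makes the model practical on memory-constrained Apple-silicon machines.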

---

### How to Use

To use **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, install the **mlx-lm** library and run the example below:

#### Installation

```bash
pip install mlx-lm
```

#### Usage

```python
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
```
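
The chat-template guard in the snippet above can be exercised without downloading any weights. The stand-in tokenizer below is purely illustrative (it is not an mlx-lm class, and real chat templates are model-specific Jinja strings); it only shows how the guard rewrites the raw prompt before generation:

```python
# Illustrative stand-in: mimics the chat-template guard from the usage
# example without loading a model. DummyTokenizer is hypothetical and
# NOT part of mlx-lm; its template rendering is a toy approximation.
class DummyTokenizer:
    chat_template = "<placeholder Jinja template>"

    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=True):
        text = "".join(f"<|{m['role']}|>{m['content']}" for m in messages)
        # Append a generation cue, as real templates do for the assistant turn.
        return text + "<|assistant|>" if add_generation_prompt else text

tokenizer = DummyTokenizer()
prompt = "hello"
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
print(prompt)  # <|user|>hello<|assistant|>
```

With the real tokenizer, the same guard produces the Llama 3.2 chat format instead of this toy markup; the control flow is identical.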

---

### About the Model

The **nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit** model was converted to MLX format from **nidum/Nidum-Llama-3.2-3B-Uncensored** using **mlx-lm version 0.19.2**, providing the following benefits:

- **Smaller Memory Footprint**: Ideal for environments with limited hardware resources.
- **High Performance**: Retains the advanced capabilities of the original model while optimizing inference speed and efficiency.
- **Plug-and-Play Compatibility**: Integrates easily with the **mlx-lm** ecosystem for seamless deployment.

---

### Use Cases

- **Technical Problem Solving**
- **Research and Educational Assistance**
- **Open-Ended Q&A**
- **Creative Writing and Ideation**
- **Long-Context Dialogues**
- **Unrestricted Knowledge Exploration**

---

### Datasets and Fine-Tuning

The model inherits the fine-tuned capabilities of its predecessor, **Nidum-Llama-3.2-3B-Uncensored**, including:

- **Uncensored Data**: Ensures detailed and uninhibited responses.
- **RAG-Based Fine-Tuning**: Optimizes retrieval-augmented generation for information-intensive tasks.
- **Math-Instruct Data**: Tailored for precise mathematical reasoning.
- **Long-Context Fine-Tuning**: Enhances coherence and relevance in extended interactions.

---

### Quantized Model Download

The **MLX-4bit** version is highly efficient, balancing precision against memory usage.

---

#### Benchmark

| **Benchmark** | **Metric** | **LLaMA 3B** | **Nidum 3B** | **Observation** |
|---------------|------------|--------------|--------------|-----------------|
| **GPQA** | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B demonstrates significant improvement, particularly in **generative tasks**. |
| | Accuracy | 0.4 | 0.5 | Consistent improvement, especially in **zero-shot** scenarios. |
| **HellaSwag** | Accuracy | 0.3 | 0.4 | Better performance in **common-sense reasoning** tasks. |
| | Normalized Accuracy | 0.3 | 0.4 | Enhanced ability to understand and predict context in sentence completion. |
| | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Standard error is comparable across models. |
| | Accuracy (Stderr) | 0.15275 | 0.1633 | Similar variance, so the accuracy gains are not explained by noisier measurement. |
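
For readability, the relative gains implied by the table's scores work out as follows (simple arithmetic on the reported numbers, nothing more):

```python
# Relative improvement of Nidum 3B over LLaMA 3B, from the table above.
scores = {
    "GPQA exact match (flexible)": (0.3, 0.5),
    "GPQA accuracy":               (0.4, 0.5),
    "HellaSwag accuracy":          (0.3, 0.4),
}
for name, (llama, nidum) in scores.items():
    gain = (nidum - llama) / llama * 100
    print(f"{name}: {llama} -> {nidum} (+{gain:.0f}%)")
```

That is roughly +67%, +25%, and +33% relative improvement on the three headline metrics.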

---

### Insights

1. **Compact Efficiency**: The MLX-4bit model delivers high performance with reduced resource usage.
2. **Enhanced Usability**: Optimized for seamless integration in lightweight deployment scenarios.

---

### Contributing

We invite contributions to further enhance the **MLX-4bit** model's capabilities. Reach out to us for collaboration opportunities.

---

### Contact

For inquiries, support, or feedback, email us at **[email protected]**.

---

### Explore the Future

Embrace the power of innovation with **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**: the ideal blend of performance and efficiency.

---