pipeline_tag: text-generation
---
 
### Nidum-Llama-3.2-3B-Uncensored-MLX-4bit

### Welcome to Nidum!
At Nidum, we are committed to delivering cutting-edge AI models that offer advanced capabilities and unrestricted access to innovation. With **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, we bring you a performance-optimized, space-efficient, and feature-rich model designed for diverse use cases.

---

[![GitHub Icon](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Font_Awesome_5_brands_github.svg/232px-Font_Awesome_5_brands_github.svg.png)](https://github.com/NidumAI-Inc)
**Explore Nidum's Open-Source Projects on GitHub**: [https://github.com/NidumAI-Inc](https://github.com/NidumAI-Inc)

---

### Key Features

1. **Compact and Efficient**: Built in the **MLX-4bit format** for optimized performance with minimal memory usage.
2. **Versatility**: Excels in a wide range of tasks, including technical problem-solving, educational queries, and casual conversations.
3. **Extended Context Handling**: Capable of maintaining coherence in long-context interactions.
4. **Seamless Integration**: Enhanced compatibility with the **mlx-lm library** for a streamlined development experience.
5. **Uncensored Access**: Provides uninhibited responses across a variety of topics and applications.

---

### How to Use

To use **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, install the **mlx-lm** library and follow the example code below.

#### Installation
```bash
pip install mlx-lm
```

#### Usage

```python
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
```
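
For longer, multi-turn conversations the same API applies: keep the running history in the `messages` list and raise the token budget. A minimal sketch, assuming mlx-lm 0.19.x and a tokenizer that ships a chat template (as above); the history-keeping pattern is our own illustration, not part of the library:

```python
from mlx_lm import load, generate

model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Keep the full conversation so each turn sees the prior context
# (illustrative pattern for the long-context use case).
messages = [{"role": "user", "content": "Explain 4-bit quantization in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# max_tokens caps the length of the generated reply.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
messages.append({"role": "assistant", "content": response})
print(response)
```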

---

### About the Model

The **nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit** model was converted to MLX format from **nidum/Nidum-Llama-3.2-3B-Uncensored** using **mlx-lm version 0.19.2**. The conversion provides the following benefits (a reproduction sketch follows the list):

- **Smaller Memory Footprint**: Ideal for environments with limited hardware resources.
- **High Performance**: Retains the advanced capabilities of the original model while optimizing inference speed and efficiency.
- **Plug-and-Play Compatibility**: Integrates easily with the **mlx-lm** ecosystem for seamless deployment.
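
A conversion like this can be reproduced with mlx-lm's `convert` utility. A minimal sketch, assuming mlx-lm 0.19.x; the output path is illustrative, and this is not necessarily the exact invocation used for the published weights:

```python
from mlx_lm import convert

# Fetch the original weights, quantize them to 4 bits, and write an
# MLX model directory (illustrative reproduction of the conversion).
convert(
    hf_path="nidum/Nidum-Llama-3.2-3B-Uncensored",
    mlx_path="Nidum-Llama-3.2-3B-Uncensored-MLX-4bit",
    quantize=True,
    q_bits=4,
)
```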

---

### Use Cases

- **Technical Problem Solving**
- **Research and Educational Assistance**
- **Open-Ended Q&A**
- **Creative Writing and Ideation**
- **Long-Context Dialogues**
- **Unrestricted Knowledge Exploration**

---

### Datasets and Fine-Tuning

The model inherits the fine-tuned capabilities of its predecessor, **Nidum-Llama-3.2-3B-Uncensored**, including:

- **Uncensored Data**: Ensures detailed and uninhibited responses.
- **RAG-Based Fine-Tuning**: Optimizes retrieval-augmented generation for information-intensive tasks.
- **Math-Instruct Data**: Tailored for precise mathematical reasoning.
- **Long-Context Fine-Tuning**: Enhances coherence and relevance in extended interactions.

---

### Quantized Model Download

The **MLX-4bit** version quantizes the model weights to 4 bits, balancing output quality against a much smaller memory footprint.
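
There is no separate download step: `load` fetches the quantized weights from the Hugging Face Hub on first use and reuses the local cache afterwards.

```python
from mlx_lm import load

# First call downloads the 4-bit weights into the local Hugging Face
# cache; subsequent calls load directly from that cache.
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")
```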

---

#### Benchmark

| **Benchmark** | **Metric** | **LLaMA 3B** | **Nidum 3B** | **Observation** |
|---------------|------------------------------|--------------|--------------|--------------------------------------------------------------------------------|
| **GPQA** | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B scores notably higher, particularly on **generative tasks**. |
| | Accuracy | 0.4 | 0.5 | Consistent improvement, especially in **zero-shot** settings. |
| **HellaSwag** | Accuracy | 0.3 | 0.4 | Better performance on **common-sense reasoning** tasks. |
| | Normalized Accuracy | 0.3 | 0.4 | Enhanced ability to understand and predict context in sentence completion. |
| | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Standard errors are similar for both models. |
| | Accuracy (Stderr) | 0.15275 | 0.1633 | The accuracy gains above should be read against this measurement uncertainty. |

---

### Insights

1. **Compact Efficiency**: The MLX-4bit model delivers high performance with reduced resource usage.
2. **Enhanced Usability**: Optimized for lightweight deployment scenarios and straightforward integration.

---

### Contributing

We invite contributions to further enhance the **MLX-4bit** model's capabilities. Reach out to us for collaboration opportunities.

---

### Contact

For inquiries, support, or feedback, email us at **[email protected]**.

---

### Explore the Future

Embrace the power of innovation with **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**: the ideal blend of performance and efficiency.

---