---
license: apache-2.0
base_model: nidum/Nidum-Llama-3.2-3B-Uncensored
library_name: adapter-transformers
tags:
- chemistry
- biology
- legal
- code
- medical
- finance
- mlx
pipeline_tag: text-generation
---

### Nidum-Llama-3.2-3B-Uncensored-MLX-4bit

### Welcome to Nidum!

At Nidum, we are committed to delivering cutting-edge AI models with advanced capabilities and unrestricted access. With **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, we bring you a performance-optimized, space-efficient model designed for diverse use cases.

---

[![GitHub Icon](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Font_Awesome_5_brands_github.svg/232px-Font_Awesome_5_brands_github.svg.png)](https://github.com/NidumAI-Inc)

**Explore Nidum's Open-Source Projects on GitHub**: [https://github.com/NidumAI-Inc](https://github.com/NidumAI-Inc)

---

### Key Features

1. **Compact and Efficient**: Built in the **MLX 4-bit format** for optimized performance with minimal memory usage.
2. **Versatility**: Handles a wide range of tasks, including technical problem-solving, educational queries, and casual conversation.
3. **Extended Context Handling**: Maintains coherence in long-context interactions.
4. **Seamless Integration**: Compatible with the **mlx-lm** library for a streamlined development experience.
5. **Uncensored Access**: Provides uninhibited responses across a variety of topics and applications.
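
The memory saving behind the first point can be sketched with simple arithmetic. The estimate below is a back-of-the-envelope illustration only (it assumes roughly 3.2B parameters and ignores quantization metadata such as per-group scales, which add a small overhead):

```python
# Rough weight-memory estimate for a ~3.2B-parameter model.
# Assumption (not from the model card): weights dominate memory;
# quantization metadata is ignored.
PARAMS = 3.2e9

fp16_gib = PARAMS * 2 / 1024**3    # 16 bits = 2 bytes per weight
int4_gib = PARAMS * 0.5 / 1024**3  # 4 bits = 0.5 bytes per weight

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")   # ~6.0 GiB
print(f"4-bit weights: ~{int4_gib:.1f} GiB")  # ~1.5 GiB
```

On machines with limited unified memory, this roughly 4x reduction in weight storage is what makes a local 3B model practical.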

---

### How to Use

To use **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**, install the **mlx-lm** library and follow the example below.

#### Installation

```bash
pip install mlx-lm
```

#### Usage

```python
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate and print the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```

---

### About the Model

**nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-4bit** was converted to MLX format from **nidum/Nidum-Llama-3.2-3B-Uncensored** using **mlx-lm version 0.19.2**, which provides:

- **Smaller Memory Footprint**: Ideal for environments with limited hardware resources.
- **High Performance**: Retains the capabilities of the original model while optimizing inference speed and efficiency.
- **Plug-and-Play Compatibility**: Integrates with the **mlx-lm** ecosystem for straightforward deployment.
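
For reference, a conversion of this kind is typically performed with mlx-lm's `mlx_lm.convert` command-line tool. The invocation below is a sketch, not the exact command used for this model; the output path is illustrative, and flag names may vary between mlx-lm versions:

```shell
# Convert the base Hugging Face model to a 4-bit MLX checkpoint.
# Note: this downloads the full-precision weights first.
mlx_lm.convert \
    --hf-path nidum/Nidum-Llama-3.2-3B-Uncensored \
    --mlx-path ./Nidum-Llama-3.2-3B-Uncensored-MLX-4bit \
    -q --q-bits 4
```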

---

### Use Cases

- **Technical Problem Solving**
- **Research and Educational Assistance**
- **Open-Ended Q&A**
- **Creative Writing and Ideation**
- **Long-Context Dialogues**
- **Unrestricted Knowledge Exploration**

---

### Datasets and Fine-Tuning

The model inherits the fine-tuned capabilities of its predecessor, **Nidum-Llama-3.2-3B-Uncensored**, including:

- **Uncensored Data**: Ensures detailed and uninhibited responses.
- **RAG-Based Fine-Tuning**: Optimizes retrieval-augmented generation for information-intensive tasks.
- **Math-Instruct Data**: Tailored for precise mathematical reasoning.
- **Long-Context Fine-Tuning**: Enhances coherence and relevance in extended interactions.

---

### Quantized Model Download

The **MLX 4-bit** version is highly efficient, striking a balance between precision and memory usage.

---

### Benchmark

| **Benchmark** | **Metric** | **LLaMA 3B** | **Nidum 3B** | **Observation** |
|---------------|------------|--------------|--------------|-----------------|
| **GPQA** | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B shows a clear improvement, particularly in **generative tasks**. |
| | Accuracy | 0.4 | 0.5 | Consistent improvement, especially in **zero-shot** scenarios. |
| **HellaSwag** | Accuracy | 0.3 | 0.4 | Better performance on **common-sense reasoning** tasks. |
| | Normalized Accuracy | 0.3 | 0.4 | Improved ability to understand and predict context in sentence completion. |
| | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Standard errors are comparable across the two models. |
| | Accuracy (Stderr) | 0.15275 | 0.1633 | Standard errors are comparable across the two models. |

---

### Insights

1. **Compact Efficiency**: The MLX 4-bit model delivers high performance with reduced resource usage.
2. **Enhanced Usability**: Optimized for seamless integration in lightweight deployment scenarios.

---

### Contributing

We invite contributions to further enhance the **MLX-4bit** model's capabilities. Reach out to us for collaboration opportunities.

---

### Contact

For inquiries, support, or feedback, email us at **[email protected]**.

---

### Explore the Future

Embrace the power of innovation with **Nidum-Llama-3.2-3B-Uncensored-MLX-4bit**: the ideal blend of performance and efficiency.

---