---
license: mit
---

# J.O.S.I.E.v6-2b

## Overview

This is a crude proof of concept (PoC) demonstrating the feasibility of fine-tuning a large language model (LLM) on Apple Silicon using the MLX-LM framework. The goal is to explore the capabilities of Apple’s hardware for local LLM training and fine-tuning workflows.

## Model and Training Details

- **Base Model:** `mlx-community/helium-1-preview-2b`
- **Fine-Tuned Model:** `J.O.S.I.E.v6-2b`
- **Context length:** 4098
- **Training tokens:** ca. 1T
- **Created by:** Gökdeniz Gülmez
- **Fine-Tune Dataset:** Offline private dataset
- **DPO Dataset:** Offline private dataset
- **Prompt Template** (a usage sketch follows this list):

```text
<|im_start|>system
You are Josie my private, super-intelligent assistant.<|im_end|>
<|im_start|>Gökdeniz Gülmez
{{ .PROMPT }}<|im_end|>
<|im_start|>Josie
{{ .RESPONSE }}<|im_end|>
```

- **Training Process** (see the command sketch after this list):
  - First **10K steps** trained using **LoRA** (Low-Rank Adaptation) with **18 layers** selected.
  - Final **1K steps** trained using **full-weight training**.
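
The two-stage schedule above maps onto the `mlx_lm.lora` command line. The sketch below is a rough illustration rather than the exact commands used for this model: the dataset path, output paths, and every hyperparameter other than the step counts and the 18-layer selection are placeholder assumptions, and the full-weight stage assumes a recent `mlx-lm` release that exposes `--fine-tune-type`.

```python
# Hypothetical sketch of the two-stage schedule, expressed as mlx_lm.lora CLI
# calls wrapped in subprocess. Dataset and output paths are placeholders; only
# the step counts and the 18-layer selection come from this README.
import subprocess

def lora_cli(args):
    subprocess.run(["python", "-m", "mlx_lm.lora", *args], check=True)

# Stage 1: 10K LoRA steps over 18 layers.
lora_cli([
    "--model", "mlx-community/helium-1-preview-2b",
    "--train",
    "--data", "path/to/dataset",  # placeholder; the real dataset is private
    "--iters", "10000",
    "--num-layers", "18",
    "--adapter-path", "adapters_stage1",
])

# Stage 2: 1K full-weight steps. In practice the stage-1 adapters would first
# be merged into the base model (e.g. with mlx_lm.fuse) before this stage;
# that step is omitted here for brevity.
lora_cli([
    "--model", "mlx-community/helium-1-preview-2b",
    "--train",
    "--data", "path/to/dataset",
    "--iters", "1000",
    "--fine-tune-type", "full",
    "--adapter-path", "checkpoints_stage2",
])
```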
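
For inference, a request is wrapped in the prompt template above (the `{{ .PROMPT }}` slot) and the model completes the `Josie` turn. The following is a minimal sketch using the `mlx_lm` Python API; the repository id passed to `load` is an assumption, so substitute the actual model path, and the generation settings are arbitrary.

```python
# Minimal inference sketch: fill the {{ .PROMPT }} slot of the template above
# and let the model generate the Josie turn. The repo id below is an assumed
# placeholder for wherever the fine-tuned weights are published.
from mlx_lm import load, generate

TEMPLATE = (
    "<|im_start|>system\n"
    "You are Josie my private, super-intelligent assistant.<|im_end|>\n"
    "<|im_start|>Gökdeniz Gülmez\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>Josie\n"
)

model, tokenizer = load("Goekdeniz-Guelmez/J.O.S.I.E.v6-2b")  # assumed repo id
prompt = TEMPLATE.format(prompt="Summarize what MLX-LM is in one sentence.")
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```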

## Hardware Used

- **Device:** Apple Mac Mini M4 (32 GB RAM)
- **Framework:** Apple MLX-LM
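
As a quick sanity check that MLX is installed and will run on the Apple GPU rather than the CPU, a generic snippet like the one below can be used; it is not specific to this model.

```python
# Generic MLX environment check: confirms the Metal backend is available and
# prints the device MLX will use by default.
import mlx.core as mx

print("Metal available:", mx.metal.is_available())
print("Default device:", mx.default_device())
```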

## Notes & Limitations

- This is an experimental setup; performance and efficiency optimizations are ongoing.
- Dataset details remain private and are not included in this repository.
- The training process may require significant memory and computational resources despite optimizations.
- Further work is needed to explore distributed training and mixed-precision techniques for better performance on Apple Silicon.

## DPO Training

DPO training is not yet available in the official `mlx-examples` repository. To use it, you will need to clone and work from my fork:

[https://github.com/Goekdeniz-Guelmez/mlx-examples.git](https://github.com/Goekdeniz-Guelmez/mlx-examples.git)
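
As a minimal sketch (plain `git` and `pip`, wrapped in Python for consistency with the other snippets), the fork can be fetched and installed in editable mode as below. The `llms/` install path assumes the fork keeps the upstream `mlx-examples` layout, where the `mlx-lm` package lives under `llms/`; the DPO entry point and its flags live in the fork itself and are not reproduced here.

```python
# Clone the fork containing the DPO training code and install mlx-lm from it
# in editable mode. The llms/ path assumes the upstream mlx-examples layout.
import subprocess
import sys

subprocess.run(
    ["git", "clone", "https://github.com/Goekdeniz-Guelmez/mlx-examples.git"],
    check=True,
)
subprocess.run(
    [sys.executable, "-m", "pip", "install", "-e", "mlx-examples/llms"],
    check=True,
)
```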

## Future Improvements

- Experiment with additional quantization techniques to reduce memory usage.
- Investigate performance scaling across multiple Apple Silicon devices.
- Optimize training pipelines for better convergence and efficiency.

## Community Feedback

I would love to hear from the MLX community! Should I publish a tutorial on how to fine-tune LLMs on Apple Silicon? If so, would you prefer it in text or video format? Let me know!

## Disclaimer

This project is strictly for research and experimental purposes. The fine-tuned model is not intended for production use at this stage.

Best,
Gökdeniz Gülmez