Goekdeniz-Guelmez/Josie-v6-2b-mlx-concept

Overview

This is a crude proof of concept (PoC) demonstrating the feasibility of fine-tuning a large language model (LLM) on Apple Silicon using the MLX-LM framework. The goal is to explore the capabilities of Apple’s hardware for local LLM training and fine-tuning workflows.
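
A minimal inference sketch using the mlx-lm Python API is shown below. It assumes mlx-lm is installed (e.g. pip install mlx-lm) and formats the prompt with the template documented under Model and Training Details; the example question is, of course, illustrative:

    from mlx_lm import load, generate

    # Load the fine-tuned model from the Hugging Face Hub.
    model, tokenizer = load("Goekdeniz-Guelmez/Josie-v6-2b-mlx-concept")

    # Build a prompt following the documented template.
    prompt = (
        "<|im_start|>system\n"
        "You are Josie my private, super-intelligent assistant.<|im_end|>\n"
        "<|im_start|>Gökdeniz Gülmez\n"
        "Hello, who are you?<|im_end|>\n"
        "<|im_start|>Josie\n"
    )

    response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
    print(response)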

Model and Training Details

  • Base Model: mlx-community/helium-1-preview-2b

  • Fine-Tuned Model: J.O.S.I.E.v6-2b

  • Context length: 4096

  • Training tokens: ca. 1T

  • Created by: Gökdeniz Gülmez

  • Fine-Tune Dataset: Offline private dataset

  • DPO/ORPO Dataset: Offline private dataset

  • Prompt Template:

    <|im_start|>system
    You are Josie my private, super-intelligent assistant.<|im_end|>
    <|im_start|>Gökdeniz Gülmez
    {{ .PROMPT }}<|im_end|>
    <|im_start|>Josie
    {{ .RESPONSE }}<|im_end|>
    
  • Training Process:

    • First 10K steps: LoRA (Low-Rank Adaptation) fine-tuning with 22 layers selected (see the conceptual sketch after this list).
    • Next 1K steps: full-weight training.
    • Final 4K steps: ORPO training using DoRA (Weight-Decomposed Low-Rank Adaptation) with 22 layers selected.
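
To illustrate what the LoRA stage does conceptually, here is a minimal, hypothetical LoRA linear layer written with MLX. This is not the actual training code (that lives in mlx-examples); the rank, scaling, and initialization values are illustrative assumptions:

    import math

    import mlx.core as mx
    import mlx.nn as nn

    class LoRALinear(nn.Module):
        """Frozen base projection plus a trainable low-rank update:
        y = W x + (alpha / r) * (x A) B, where only A and B are trained."""

        def __init__(self, in_dims: int, out_dims: int, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.linear = nn.Linear(in_dims, out_dims, bias=False)
            self.linear.freeze()  # the base weights stay frozen during fine-tuning
            # A starts small and random, B starts at zero, so the adapter
            # is a no-op before training (the standard LoRA initialization).
            self.lora_a = mx.random.normal((in_dims, r)) * (1.0 / math.sqrt(in_dims))
            self.lora_b = mx.zeros((r, out_dims))
            self.scale = alpha / r

        def __call__(self, x: mx.array) -> mx.array:
            return self.linear(x) + self.scale * ((x @ self.lora_a) @ self.lora_b)

Applying adapters like this to a subset of layers (here, 22 of them) keeps the number of trainable parameters, and therefore the memory footprint, small enough for a single Apple Silicon machine.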

Hardware Used

  • Device: Apple Mac mini (M4, 32 GB RAM)
  • Framework: Apple MLX-LM

Quantisations
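
Quantized variants can be produced with mlx-lm's converter. A sketch is below; the 4-bit settings and output path are illustrative assumptions, not a statement about which quantizations exist for this repository, and parameter names may differ between mlx-lm versions:

    from mlx_lm import convert

    # Convert and quantize the model to 4-bit weights (group size 64),
    # writing the result to a local directory.
    convert(
        "Goekdeniz-Guelmez/Josie-v6-2b-mlx-concept",
        mlx_path="Josie-v6-2b-4bit",
        quantize=True,
        q_bits=4,
        q_group_size=64,
    )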

Notes & Limitations

  • This is an experimental setup; performance and efficiency optimizations are ongoing.
  • Dataset details remain private and are not included in this repository.
  • The training process may require significant memory and computational resources despite optimizations.
  • Further work is needed to explore distributed training and mixed-precision techniques for better performance on Apple Silicon.

ORPO Training

ORPO training is not yet available in the official mlx-examples repository. To use it, you will need to clone and work from my fork: https://github.com/Goekdeniz-Guelmez/mlx-examples.git
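
For reference, ORPO (Odds Ratio Preference Optimization, Hong et al., 2024) adds an odds-ratio penalty to the usual supervised loss on the chosen response. A minimal sketch of that objective in MLX follows, assuming logp_chosen and logp_rejected are length-normalized log-probabilities of the chosen and rejected responses; the function name and the default lam are illustrative:

    import mlx.core as mx

    def orpo_loss(logp_chosen: mx.array, logp_rejected: mx.array, lam: float = 0.1) -> mx.array:
        """L_ORPO = L_SFT + lam * L_OR, where L_OR penalizes the model when the
        odds of the rejected response approach those of the chosen response."""

        def log_odds(logp: mx.array) -> mx.array:
            # log(p / (1 - p)), computed from log-probabilities
            return logp - mx.log1p(-mx.exp(logp))

        log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
        l_or = -mx.log(mx.sigmoid(log_odds_ratio))  # odds-ratio preference term
        l_sft = -logp_chosen                        # standard NLL on the chosen response
        return mx.mean(l_sft + lam * l_or)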

Future Improvements

  • Experiment with additional quantization techniques to reduce memory usage (Apple Silicon uses unified memory rather than dedicated VRAM).
  • Investigate performance scaling across multiple Apple Silicon devices.
  • Optimize training pipelines for better convergence and efficiency.

Community Feedback

I would love to hear from the MLX community! Should I publish a tutorial on how to fine-tune LLMs on Apple Silicon? If so, would you prefer it in text or video format? Let me know!

Disclaimer

This project is strictly for research and experimental purposes. The fine-tuned model is not intended for production use at this stage.

Best, Gökdeniz Gülmez
