JosefAlbers committed
Commit 0d1c555
1 Parent(s): 00619d6
Update README.md
README.md
CHANGED
@@ -9,7 +9,8 @@ tags:
- llm
- phi
---
-
+
+# Phi-3-Vision for Apple MLX

This project brings the powerful phi-3-vision VLM to Apple's MLX framework, offering a comprehensive solution for a variety of text and image processing tasks. With a focus on simplicity and efficiency, it provides a straightforward, minimalistic integration of the VLM. Essential functionality, such as generating quantized model weights, quantizing the KV cache during inference, LoRA/QLoRA training, and model benchmarking, is encapsulated in a single file for convenient access and use.
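For orientation, the single-file module described above is imported as `phi_3_vision_mlx`; the sketch below is limited to the two entry points that appear elsewhere in this diff (`chatui` and `test_lora`) and makes no guesses about the quantization, training, or benchmarking calls that this commit does not show.

```python
# Minimal sketch, limited to the entry points visible in this diff.
from phi_3_vision_mlx import chatui, test_lora

# Launch the interactive chat UI (the `phi3v` command in the Quick Start
# hunk below is its command-line equivalent), ...
chatui()

# ... or evaluate a LoRA adapter on the dataset referenced later in this
# README.
test_lora(dataset_path="JosefAlbers/akemiH_MedQA_Reason")
```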
@@ -27,6 +28,28 @@ This project brings the powerful phi-3-vision VLM to Apple's MLX framework, offe

## Quick Start

+**1. Install Phi-3 Vision MLX:**
+
+```bash
+git clone https://github.com/JosefAlbers/Phi-3-Vision-MLX.git
+```
+
+**2. Launch Phi-3 Vision MLX:**
+
+```bash
+phi3v
+```
+
+Or,
+
+```python
+from phi_3_vision_mlx import chatui
+
+chatui()
+```
+
+## Usage
+
### **VLM Agent** (WIP)

The VLM's understanding of both text and visuals enables interactive generation and modification of plots and images, opening up new possibilities for GUI development and data visualization.
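A practical note on the Quick Start hunk above: `phi3v` is a console entry point, so cloning the repository by itself will usually not put the command on your PATH. An install step such as `pip install -e .` from inside the cloned `Phi-3-Vision-MLX` directory is an assumption here rather than part of this commit, but something of that sort is typically needed before `phi3v` or `from phi_3_vision_mlx import ...` works outside the repository folder.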
@@ -189,7 +212,7 @@ Generation: 8.56 tokens-per-sec (100 tokens / 11.6 sec)
### **LoRA Testing** (WIP)

```python
-# from phi_3_vision_mlx import
+from phi_3_vision_mlx import test_lora

test_lora(dataset_path="JosefAlbers/akemiH_MedQA_Reason")
```
@@ -321,4 +344,4 @@ This project is licensed under the [MIT License](LICENSE).

## Citation

-<a href="https://zenodo.org/doi/10.5281/zenodo.11403221"><img src="https://zenodo.org/badge/806709541.svg" alt="DOI"></a>
+<a href="https://zenodo.org/doi/10.5281/zenodo.11403221"><img src="https://zenodo.org/badge/806709541.svg" alt="DOI"></a>