Update README.md
Browse files
README.md
CHANGED
@@ -34,6 +34,7 @@ from mlx_lm import load, generate
|
|
34 |
model, tokenizer = load("mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit")
|
35 |
response = generate(model, tokenizer, prompt="hello", verbose=True)
|
36 |
```
|
|
|
37 |
|
38 |
```bash
|
39 |
python3 -m mlx_lm.generate --model mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit --prompt "<|im_start|>system\nYou are an accurate, educational, and helpful information assistant<|im_end|>\n<|im_start|>user\nWhat is the difference between awq vs gptq quantization?<|im_end|>\n<|im_start|>assistant\n" --max-tokens 2048
|
|
|
34 |
model, tokenizer = load("mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit")
|
35 |
response = generate(model, tokenizer, prompt="hello", verbose=True)
|
36 |
```
|
37 |
+
## Use with the mlx_lm CLI
|
38 |
|
39 |
```bash
|
40 |
python3 -m mlx_lm.generate --model mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit --prompt "<|im_start|>system\nYou are an accurate, educational, and helpful information assistant<|im_end|>\n<|im_start|>user\nWhat is the difference between awq vs gptq quantization?<|im_end|>\n<|im_start|>assistant\n" --max-tokens 2048
|