HuatuoGPT-o1-7B-exl2
Original model: HuatuoGPT-o1-7B made by FreedomIntelligence
Based on: Qwen2.5-7B-Instruct by Qwen
Quants
4bpw h6 (main)
4.5bpw h6
5bpw h6
6bpw h6
8bpw h8
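To pick a quant that fits your GPU, a rough weight-size estimate is parameters × bpw / 8 bytes. The short sketch below is an illustration only (not from the original card), assuming ~7.6B parameters for the Qwen2.5-7B backbone; it excludes KV-cache and activation overhead:

# Back-of-the-envelope weight sizes per quant level.
# Assumption: ~7.6B parameters for the Qwen2.5-7B backbone.
params = 7.6e9
for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    gb = params * bpw / 8 / 1e9
    print(f"{bpw} bpw -> ~{gb:.1f} GB of weights")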
Quantization notes
Made with Exllamav2 0.2.7 using the default calibration dataset.
Exl2 quants require an Nvidia RTX GPU on Windows, or an Nvidia RTX or AMD ROCm GPU on Linux.
The model has to fit fully into VRAM, as RAM offloading isn't natively supported.
It can be used with apps such as TabbyAPI, Text-Generation-WebUI, LoLLMs and others; a direct-loading sketch follows below.
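For loading the exl2 quant directly from Python, here is a minimal sketch using exllamav2's dynamic generator API. The model path is a placeholder for wherever you downloaded the quant, and the API names follow exllamav2's published examples, which may shift between versions:

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Point at the local directory holding the exl2 quant (placeholder path)
config = ExLlamaV2Config("/path/to/HuatuoGPT-o1-7B-exl2")
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # allocate the cache as layers load
model.load_autosplit(cache)                # split weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="How to stop a cough?", max_new_tokens=512, add_bos=True))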
Original model card
HuatuoGPT-o1-7B
Introduction
HuatuoGPT-o1 is a medical LLM designed for advanced medical reasoning. It generates a complex thought process, reflecting and refining its reasoning, before providing a final response.
For more information, visit our GitHub repository: https://github.com/FreedomIntelligence/HuatuoGPT-o1.
Model Info
Model | Backbone | Supported Languages | Link
---|---|---|---
HuatuoGPT-o1-8B | LLaMA-3.1-8B | English | HF Link
HuatuoGPT-o1-70B | LLaMA-3.1-70B | English | HF Link
HuatuoGPT-o1-7B | Qwen2.5-7B | English & Chinese | HF Link
HuatuoGPT-o1-72B | Qwen2.5-72B | English & Chinese | HF Link
Usage
You can use HuatuoGPT-o1-7B in the same way as Qwen2.5-7B-Instruct. You can deploy it with tools like vLLM or SGLang (a vLLM sketch follows the Transformers snippet below), or perform direct inference:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; device_map="auto" places weights on available GPUs
model = AutoModelForCausalLM.from_pretrained("FreedomIntelligence/HuatuoGPT-o1-7B", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("FreedomIntelligence/HuatuoGPT-o1-7B")

input_text = "How to stop a cough?"
messages = [{"role": "user", "content": input_text}]

# Apply the chat template, then tokenize and generate
inputs = tokenizer(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True), return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
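As an alternative to direct Transformers inference, the model can also be served with vLLM, as mentioned above. A minimal sketch, assuming vLLM's offline LLM.chat API; the sampling values here are illustrative, not recommendations from the original card:

from vllm import LLM, SamplingParams

llm = LLM(model="FreedomIntelligence/HuatuoGPT-o1-7B")
sampling_params = SamplingParams(temperature=0.7, max_tokens=2048)
messages = [{"role": "user", "content": "How to stop a cough?"}]

# chat() applies the model's chat template before generating
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)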
HuatuoGPT-o1 adopts a thinks-before-it-answers approach, with outputs formatted as:
## Thinking
[Reasoning process]
## Final Response
[Output]
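If you need the reasoning and the final answer separately, here is a minimal parsing sketch, assuming the output follows the two headings above exactly (split_response is a hypothetical helper, not part of the model's API):

def split_response(text: str):
    # Everything before the final-response heading is the reasoning section
    thinking, _, final = text.partition("## Final Response")
    thinking = thinking.replace("## Thinking", "", 1).strip()
    return thinking, final.strip()

reasoning, answer = split_response(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(answer)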
Citation
@misc{chen2024huatuogpto1medicalcomplexreasoning,
title={HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs},
author={Junying Chen and Zhenyang Cai and Ke Ji and Xidong Wang and Wanlong Liu and Rongsheng Wang and Jianye Hou and Benyou Wang},
year={2024},
eprint={2412.18925},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.18925},
}