
HuatuoGPT-o1-7B-exl2

Original model: HuatuoGPT-o1-7B made by FreedomIntelligence
Based on: Qwen2.5-7B-Instruct by Qwen

Quants

4bpw h6 (main)
4.5bpw h6
5bpw h6
6bpw h6
8bpw h8
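
As a rough guide to which quant fits a given GPU, the weights-only footprint is approximately parameters × bits-per-weight / 8. A quick sketch (the ~7.6B figure is Qwen2.5-7B's published parameter count; the estimate excludes KV cache and runtime overhead):

# Rough weights-only VRAM estimate per quant; KV cache and overhead add more.
PARAMS = 7.6e9  # approximate parameter count of Qwen2.5-7B

for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    gib = PARAMS * bpw / 8 / 1024**3
    print(f"{bpw}bpw -> ~{gib:.1f} GiB")  # e.g. 4.0bpw -> ~3.5 GiB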

Quantization notes

Made with Exllamav2 0.2.7 using the default calibration dataset.
Exl2 quants require an Nvidia RTX GPU on Windows, or an Nvidia RTX or AMD ROCm GPU on Linux.
The model has to fit entirely in VRAM, as RAM offloading isn't natively supported (see the size estimate above).
The quants can be used with apps such as TabbyAPI, Text-Generation-WebUI, LoLLMs, and others, or loaded directly with the exllamav2 library as sketched below.
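
A minimal direct-inference sketch with the exllamav2 library, assuming an exllamav2 0.2.x install and that one of the quant branches has been downloaded to a local directory (the path below is a placeholder):

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = "/path/to/HuatuoGPT-o1-7B-exl2"  # placeholder: local download of a quant branch

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate the cache while autosplitting across GPUs
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
# Note: this is a raw completion; apply the chat template yourself for chat-style use.
print(generator.generate(prompt="How to stop a cough?", max_new_tokens=256))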

Original model card

HuatuoGPT-o1-7B

Introduction

HuatuoGPT-o1 is a medical LLM designed for advanced medical reasoning. It generates a complex thought process, reflecting and refining its reasoning, before providing a final response.

For more information, visit our GitHub repository: https://github.com/FreedomIntelligence/HuatuoGPT-o1.

Model Info

| Model            | Backbone      | Supported Languages | Link    |
| ---------------- | ------------- | ------------------- | ------- |
| HuatuoGPT-o1-8B  | LLaMA-3.1-8B  | English             | HF Link |
| HuatuoGPT-o1-70B | LLaMA-3.1-70B | English             | HF Link |
| HuatuoGPT-o1-7B  | Qwen2.5-7B    | English & Chinese   | HF Link |
| HuatuoGPT-o1-72B | Qwen2.5-72B   | English & Chinese   | HF Link |

Usage

You can use HuatuoGPT-o1-7B in the same way as Qwen2.5-7B-Instruct. You can deploy it with tools like vLLM or SGLang, or perform direct inference:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; device_map="auto" places weights on available devices.
model = AutoModelForCausalLM.from_pretrained(
    "FreedomIntelligence/HuatuoGPT-o1-7B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("FreedomIntelligence/HuatuoGPT-o1-7B")

input_text = "How to stop a cough?"
messages = [{"role": "user", "content": input_text}]

# Render the chat template to a prompt string, then tokenize it for generation.
inputs = tokenizer(
    tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True),
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
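
For serving, a minimal vLLM sketch of the same chat call, assuming a recent vLLM release that provides the LLM.chat API (this loads the original full-precision model, not the exl2 quants):

from vllm import LLM, SamplingParams

llm = LLM(model="FreedomIntelligence/HuatuoGPT-o1-7B")
params = SamplingParams(temperature=0.7, max_tokens=2048)

# chat() applies the model's chat template before generating.
outputs = llm.chat([{"role": "user", "content": "How to stop a cough?"}], params)
print(outputs[0].outputs[0].text)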

HuatuoGPT-o1 adopts a thinks-before-it-answers approach, with outputs formatted as:

## Thinking
[Reasoning process]

## Final Response
[Output]
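
Because the two sections are delimited by fixed headers, they can be split out programmatically. A small sketch, assuming the output follows the format above (split_response is an illustrative helper, not part of the model's tooling):

import re

def split_response(text):
    # Separate the "## Thinking" block from the "## Final Response" block.
    m = re.search(r"## Thinking\s*(.*?)\s*## Final Response\s*(.*)", text, re.DOTALL)
    if m is None:
        return None, text.strip()  # fall back to the raw text if the headers are absent
    return m.group(1).strip(), m.group(2).strip()

thinking, answer = split_response(tokenizer.decode(outputs[0], skip_special_tokens=True))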

📖 Citation

@misc{chen2024huatuogpto1medicalcomplexreasoning,
      title={HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs}, 
      author={Junying Chen and Zhenyang Cai and Ke Ji and Xidong Wang and Wanlong Liu and Rongsheng Wang and Jianye Hou and Benyou Wang},
      year={2024},
      eprint={2412.18925},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2412.18925}, 
}