Model Card for Panacea-7B-Chat

The Panacea-7B-Chat is a foundation model for clinical trial search, summarization, design, and recruitment. It was equipped with clinical knowledge by being trained on 793,279 clinical trial design documents worldwide and 1,113,207 clinical study papers. It shows superior performances than various open-sourced LLMs and medical LLMs on clinical trial tasks.

For full details of this model please read our paper.

Model Training

Panacea is trained from Mistral-7B-v0.1. The training of Panacea consists of an alignment step and an instruction-tuning step.

  • Alignment step: continued pre-training on a large collection of trial documents and trial-related scientific papers. This step adapts Panacea to the vocabulary commonly used in clinical trials.
  • Instruction-tuning step: further enables Panacea to comprehend the user explanation of the task definition and the output requirement.

Load the model in the following way (same as Mistral):

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = 'linjc16/Panacea-7B-Chat'

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

Citation

If you find our paper or models helpful, please consider cite as follows:

@article{lin2024panacea,
  title={Panacea: A foundation model for clinical trial search, summarization, design, and recruitment},
  author={Lin, Jiacheng and Xu, Hanwen and Wang, Zifeng and Wang, Sheng and Sun, Jimeng},
  journal={arXiv preprint arXiv:2407.11007},
  year={2024}
}
Downloads last month
193
Safetensors
Model size
7.24B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for linjc16/Panacea-7B-Chat

Finetuned
(802)
this model
Quantizations
1 model