MonteXiaofeng committed on
Commit
41e851e
1 Parent(s): f73e5ce

Update README.md

Files changed (1)
  1. README.md +104 -98
README.md CHANGED
@@ -1,99 +1,105 @@
1
- ---
2
- license: other
3
- ---
4
-
5
- ## Introduction
6
-
7
- Aquila is a large language model independently developed by BAAI. Building upon the Aquila model, we continued pre-training, SFT (Supervised Fine-Tuning), and RL (Reinforcement Learning) through a multi-stage training process, ultimately resulting in the AquilaMed-RL model. This model possesses professional capabilities in the medical field and demonstrates a significant win rate when evaluated against annotated data using the GPT-4 model. The AquilaMed-RL model can perform medical triage, medication inquiries, and general Q&A. We will open-source the SFT data and RL data required for training the model. Additionally, we will release a technical report detailing our methods in developing the model for the medical field, thereby promoting the development of the open-source community.
8
-
9
- ## Model Details
10
-
11
- The training pipeline is shown below. For more information, see our technical report: https://github.com/FlagAI-Open/industry-application/blob/main/Aquila_med_tech-report.pdf
12
-
13
- ![pipeline](./img/pipline_2.jpg)
14
-
15
- ## Evaluation
16
-
17
- The subjective and objective scores are as follows.
18
-
19
- Subjective: using GPT-4 as the judge, the win rates of our model against the reference answers in the annotated validation dataset are shown in the figure below.
20
-
21
- Objective: the model is evaluated on MMLU, C-Eval, and CMB-Exam.
22
-
23
- ![pipeline](./img/eval-result-med.png)
24
-
25
- ## Usage
26
-
27
- Once you have downloaded the model locally, you can use the following code for inference.
28
-
29
- ```python
30
-
31
- import torch
32
- from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
33
-
34
-
35
- model_dir = "xxx"
36
-
37
- tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
38
-
39
- config = AutoConfig.from_pretrained(model_dir, trust_remote_code=True)
40
- model = AutoModelForCausalLM.from_pretrained(
41
- model_dir, config=config, trust_remote_code=True
42
- )
43
- model.cuda()
44
- model.eval()
45
-
46
- template = "<|im_start|>system\nYou are a helpful assistant in medical domain.<|im_end|>\n<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"
47
-
48
- text = "我肚子疼怎么办?"
49
-
50
- item_instruction = template.format(question=text)
51
-
52
- inputs = tokenizer(item_instruction, return_tensors="pt").to("cuda")
53
- input_ids = inputs["input_ids"]
54
- prompt_length = len(input_ids[0])
55
- generate_output = model.generate(
56
- input_ids=input_ids, do_sample=False, max_length=1024, return_dict_in_generate=True
57
- )
58
-
59
- response_ids = generate_output.sequences[0][prompt_length:]
60
- predicts = tokenizer.decode(
61
- response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
62
- )
63
-
64
- print("predict:", predicts)
65
-
66
-
67
- """
68
- predict: 肚子疼可能是多种原因引起的,例如消化不良、胃炎、胃溃疡、胆囊炎、胰腺炎、肠道感染等。如果疼痛持续或加重,或者伴随有呕吐、腹泻、发热等症状,建议尽快就医。如果疼痛轻微,可以尝试以下方法缓解:
69
-
70
- 1. 饮食调整:避免油腻、辛辣、刺激性食物,多喝水,多吃易消化的食物,如米粥、面条、饼干等。
71
-
72
- 2. 休息:避免剧烈运动,保持充足的睡眠。
73
-
74
- 3. 热敷:用热水袋或毛巾敷在肚子上,可以缓解疼痛。
75
-
76
- 4. 药物:可以尝试一些非处方药,如布洛芬、阿司匹林等,但请务必在医生的指导下使用。
77
-
78
- 如果疼痛持续或加重,或者伴随有其他症状,建议尽快就医。
79
-
80
- 希望我的回答对您有所帮助。如果您还有其他问题,欢迎随时向我提问。
81
- """
82
- ```
83
-
84
- ## License
85
-
86
- The Aquila series of open-source models is licensed under the [BAAI Aquila Model License Agreement](https://huggingface.co/BAAI/AquilaMed-RL/blob/main/BAAI-Aquila-Model-License%20-Agreement.pdf).
87
-
88
-
89
-
90
- ## Citation
91
-
92
- If you find our work helpful, please cite it.
93
-
94
- ```
95
- @article{Aquila-Med-LLM,
96
- title={Aquila-Med LLM: Pioneering Full-Process Open-Source Medical Language Models},
97
- year={2024}
98
- }
 
 
 
 
 
 
99
  ```
 
1
+ ---
2
+ license: other
3
+ ---
4
+
5
+ ## Introduction
6
+
7
+ Aquila is a large language model independently developed by BAAI. Building upon the Aquila model, we continued pre-training, SFT (Supervised Fine-Tuning), and RL (Reinforcement Learning) through a multi-stage training process, ultimately resulting in the AquilaMed-RL model. This model possesses professional capabilities in the medical field and demonstrates a significant win rate when evaluated against annotated data using the GPT-4 model. The AquilaMed-RL model can perform medical triage, medication inquiries, and general Q&A. We will open-source the SFT data and RL data required for training the model. Additionally, we will release a technical report detailing our methods in developing the model for the medical field, thereby promoting the development of the open-source community.
8
+
9
+ ## Model Details
10
+
11
+ The training pipeline is shown below. For more information, see our technical report: https://github.com/FlagAI-Open/industry-application/blob/main/Aquila_med_tech-report.pdf
12
+
13
+ ![pipeline](./img/pipline_2.jpg)
14
+
15
+
16
+ ## Dataset
17
+ We have released our training data; you can find it on Hugging Face:
18
+ - SFT: https://huggingface.co/datasets/BAAI/AquilaMed-Instruct
19
+ - RL: https://huggingface.co/datasets/BAAI/AquilaMed-RL
20
+
21
+ ## Evaluation
22
+
23
+ The subjective and objective scores are as follows.
24
+
25
+ Subjective: using GPT-4 as the judge, the win rates of our model against the reference answers in the annotated validation dataset are shown in the figure below.
26
+
27
+ Objective: the model is evaluated on MMLU, C-Eval, and CMB-Exam.
28
+
29
+ ![pipeline](./img/eval-result-med.png)
30
+
31
+ ## Usage
32
+
33
+ Once you have downloaded the model locally, you can use the following code for inference.
34
+
35
+ ```python
36
+
37
+ import torch
38
+ from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
39
+
40
+
41
+ model_dir = "xxx"
42
+
43
+ tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
44
+
45
+ config = AutoConfig.from_pretrained(model_dir, trust_remote_code=True)
46
+ model = AutoModelForCausalLM.from_pretrained(
47
+ model_dir, config=config, trust_remote_code=True
48
+ )
49
+ model.cuda()
50
+ model.eval()
51
+
52
+ template = "<|im_start|>system\nYou are a helpful assistant in medical domain.<|im_end|>\n<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"
53
+
54
+ text = "我肚子疼怎么办?"
55
+
56
+ item_instruction = template.format(question=text)
57
+
58
+ inputs = tokenizer(item_instruction, return_tensors="pt").to("cuda")
59
+ input_ids = inputs["input_ids"]
60
+ prompt_length = len(input_ids[0])
61
+ generate_output = model.generate(
62
+ input_ids=input_ids, do_sample=False, max_length=1024, return_dict_in_generate=True
63
+ )
64
+
65
+ response_ids = generate_output.sequences[0][prompt_length:]
66
+ predicts = tokenizer.decode(
67
+ response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
68
+ )
69
+
70
+ print("predict:", predicts)
71
+
72
+
73
+ """
74
+ predict: 肚子疼可能是多种原因引起的,例如消化不良、胃炎、胃溃疡、胆囊炎、胰腺炎、肠道感染等。如果疼痛持续或加重,或者伴随有呕吐、腹泻、发热等症状,建议尽快就医。如果疼痛轻微,可以尝试以下方法缓解:
75
+
76
+ 1. 饮食调整:避免油腻、辛辣、刺激性食物,多喝水,多吃易消化的食物,如米粥、面条、饼干等。
77
+
78
+ 2. 休息:避免剧烈运动,保持充足的睡眠。
79
+
80
+ 3. 热敷:用热水袋或毛巾敷在肚子上,可以缓解疼痛。
81
+
82
+ 4. 药物:可以尝试一些非处方药,如布洛芬、阿司匹林等,但请务必在医生的指导下使用。
83
+
84
+ 如果疼痛持续或加重,或者伴随有其他症状,建议尽快就医。
85
+
86
+ 希望我的回答对您有所帮助。如果您还有其他问题,欢迎随时向我提问。
87
+ """
88
+ ```
89
+
90
+ ## License
91
+
92
+ The Aquila series of open-source models is licensed under the [BAAI Aquila Model License Agreement](https://huggingface.co/BAAI/AquilaMed-RL/blob/main/BAAI-Aquila-Model-License%20-Agreement.pdf).
93
+
94
+
95
+
96
+ ## Citation
97
+
98
+ If you find our work helpful, please cite it.
99
+
100
+ ```
101
+ @article{Aquila-Med-LLM,
102
+ title={Aquila-Med LLM: Pioneering Full-Process Open-Source Medical Language Models},
103
+ year={2024}
104
+ }
105
  ```
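
A note on the usage snippet in this README: its prompt uses a ChatML-style layout, with `<|im_start|>` and `<|im_end|>` markers around a system turn, a user turn, and an open assistant turn. The sketch below wraps that same single-turn template in a small helper. The name `build_prompt`, the `DEFAULT_SYSTEM` constant, and the optional `system` parameter are hypothetical, introduced only for illustration; they are not part of the released code.

```python
# Hypothetical helper (not part of the AquilaMed-RL repository): builds the
# single-turn prompt string used in the README's inference example.
DEFAULT_SYSTEM = "You are a helpful assistant in medical domain."


def build_prompt(question: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return a ChatML-style prompt that ends with the open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # The resulting string can be passed to the tokenizer in place of
    # `item_instruction` in the README snippet.
    print(build_prompt("我肚子疼怎么办?"))
```

With the default system message, `build_prompt(text)` should produce the same string as `template.format(question=text)` in the README example, so it can be used as a drop-in replacement there.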