abhishek-ch committed on
Commit 2d06212 · verified · 1 Parent(s): b6f7ffc

updated readme

  1. README.md +59 -4
README.md CHANGED
@@ -18,25 +18,80 @@ tags:
18
  - biology
19
  - mlx
20
  datasets:
21
- - pubmed
22
  base_model:
23
  - BioMistral/BioMistral-7B
24
  - mistralai/Mistral-7B-Instruct-v0.1
25
  pipeline_tag: text-generation
26
  ---
27
-
28
  # abhishek-ch/biomistral-7b-synthetic-ehr
 
 
 
 
29
  This model was converted to MLX format from [`BioMistral/BioMistral-7B-DARE`]().
30
  Refer to the [original model card](https://huggingface.co/BioMistral/BioMistral-7B-DARE) for more details on the model.
 
 
31
  ## Use with mlx
32
 
33
  ```bash
34
  pip install mlx-lm
35
  ```
36
 
 
 
 
37
  ```python
38
- from mlx_lm import load, generate
39
 
 
 
40
  model, tokenizer = load("abhishek-ch/biomistral-7b-synthetic-ehr")
41
- response = generate(model, tokenizer, prompt="hello", verbose=True)
42
  ```
 
18
  - biology
19
  - mlx
20
  datasets:
21
+ - health_fact
22
  base_model:
23
  - BioMistral/BioMistral-7B
24
  - mistralai/Mistral-7B-Instruct-v0.1
25
  pipeline_tag: text-generation
26
  ---
 
27
  # abhishek-ch/biomistral-7b-synthetic-ehr
28
+
29
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6460910f455531c6be78b2dd/tGtYB0b3eS7A4zbqp1xz0.png)
30
+
31
+
32
  This model was converted to MLX format from [`BioMistral/BioMistral-7B-DARE`]().
33
  Refer to the [original model card](https://huggingface.co/BioMistral/BioMistral-7B-DARE) for more details on the model.
34
+
35
+
36
  ## Use with mlx
37
 
38
  ```bash
39
  pip install mlx-lm
40
  ```
41
 
42
+ The model was LoRA fine-tuned with mlx for 1000 steps (~1M tokens) on [health_fact](https://huggingface.co/datasets/health_fact) and
43
+ a synthetic EHR dataset inspired by MIMIC-IV, using the prompt format below.
44
+
45
  ```python
46
+ def format_prompt(prompt: str, question: str) -> str:
47
+     return """<s>[INST]
48
+ ## Instructions
49
+ {}
50
+ ## User Question
51
+ {}.
52
+ [/INST]</s>
53
+ """.format(prompt, question)
54
+ ```
55
+
56
+ Example system prompt for synthetic EHR diagnosis:
57
+ ```
58
+ You are an expert in provide diagnosis summary based on clinical notes inspired by MIMIC-IV-Note dataset.
59
+ These notes encompass Chief Complaint along with Patient Summary & medical admission details.
60
+ ```
61
+
62
+ Example system prompt for health fact checking:
63
+ ```
64
+ You are a Public Health AI Assistant. You can do the fact-checking of public health claims. \nEach answer labelled with true, false, unproven or mixture. \nPlease provide the reason behind the answer
65
+ ```
66
+
67
+ ## Loading the model using `mlx`
68
 
69
+ ```python
70
+ from mlx_lm import generate, load
71
  model, tokenizer = load("abhishek-ch/biomistral-7b-synthetic-ehr")
72
+ response = generate(
73
+     model,
74
+     tokenizer,
75
+     prompt=format_prompt(prompt, question),
76
+     verbose=True,  # Set to True to see the prompt and response
77
+     temp=0.0,
78
+     max_tokens=512,
79
+ )
80
+ ```
81
+
82
+ ## Loading the model using `transformers`
83
+
84
+ ```python
85
+ from transformers import AutoModelForCausalLM, AutoTokenizer
86
+ repo_id = "abhishek-ch/biomistral-7b-synthetic-ehr"
87
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
88
+ model = AutoModelForCausalLM.from_pretrained(repo_id)
89
+ model.to("mps")
90
+ input_text = format_prompt(system_prompt, question)
91
+ input_ids = tokenizer(input_text, return_tensors="pt").to("mps")
92
+ outputs = model.generate(
93
+     **input_ids,
94
+     max_new_tokens=512,
95
+ )
96
+ print(tokenizer.decode(outputs[0]))
97
  ```
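
The loading snippets in this README call `format_prompt` with a system prompt and a question but never show the two being combined. The following is a minimal sketch of that step, not part of the commit itself. It assumes the `\n` sequences in the health fact-checking system prompt are newline escapes, and the `question` value is a hypothetical claim chosen only for illustration. The helper name `HEALTH_FACT_SYSTEM_PROMPT` is likewise an assumption.

```python
def format_prompt(prompt: str, question: str) -> str:
    # Same template as in the README: an [INST] wrapper around an
    # instructions section and a user question section.
    return """<s>[INST]
## Instructions
{}
## User Question
{}.
[/INST]</s>
""".format(prompt, question)


# System prompt copied from the README's health fact-checking example.
# Assumption: the "\n" sequences shown there are newline escapes.
HEALTH_FACT_SYSTEM_PROMPT = (
    "You are a Public Health AI Assistant. "
    "You can do the fact-checking of public health claims. \n"
    "Each answer labelled with true, false, unproven or mixture. \n"
    "Please provide the reason behind the answer"
)

# Hypothetical claim, used only to illustrate the call.
question = "Drinking water cures the common cold"

full_prompt = format_prompt(HEALTH_FACT_SYSTEM_PROMPT, question)
print(full_prompt)
```

The resulting `full_prompt` string is what would be passed as `prompt=` to `mlx_lm.generate` or tokenized for `transformers` in the snippets above. Note that the template appends a period after the question, so the question text itself should not end with one.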