anicolson committed on
Commit 9d4087e · verified · 1 Parent(s): 4e95a0e

Update README.md

Files changed (1)
  1. README.md +42 -0
README.md CHANGED
@@ -85,6 +85,48 @@ for i,j in zip(findings, impression):
  print(f'Findings:\t{i}\nImpression:\t{j}\n\n')
  ```
 
+ #### Inference for a study | no emergency data | no Hugging Face Datasets
+ ```python
+ import torch
+ import transformers
+ from torchvision.io import read_image
+
+ # Modules (`device` was undefined in the original snippet; this definition is an assumption):
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+ model = transformers.AutoModelForCausalLM.from_pretrained('aehrc/cxrmate-ed', trust_remote_code=True).to(device=device)
+ tokenizer = transformers.PreTrainedTokenizerFast.from_pretrained('aehrc/cxrmate-ed')
+
+ study_image_paths = ['...', '...']
+
+ indication = '...'
+ history = '...'
+
+ # Load and transform every image of the study (the original read from an undefined `img_path_list_idx`):
+ images = [read_image(i) for i in study_image_paths]
+ images = [torch.stack([model.test_transforms(i) for i in images])]
+ images = torch.nn.utils.rnn.pad_sequence(images, batch_first=True, padding_value=0.0).to(device=device)
+ image_time_deltas = [[model.zero_time_delta_value] * images.shape[1]]
+
+ # Convert the patient data in the batch into embeddings:
+ inputs_embeds, attention_mask, token_type_ids, position_ids, bos_token_ids = model.prepare_inputs(
+     tokenizer=tokenizer, images=images, image_time_deltas=image_time_deltas, study_id=[0], indication=[[indication]], history=[[history]]
+ )
+
+ # Generate reports:
+ output_ids = model.generate(
+     input_ids=bos_token_ids,
+     decoder_inputs_embeds=inputs_embeds,
+     decoder_token_type_ids=token_type_ids,
+     prompt_attention_mask=attention_mask,
+     prompt_position_ids=position_ids,
+     special_token_ids=[tokenizer.sep_token_id],
+     max_length=256,
+     num_beams=4,
+     return_dict_in_generate=True,
+ )['sequences']
+
+ # Findings and impression section:
+ findings, impression = model.split_and_decode_sections(output_ids, [tokenizer.sep_token_id, tokenizer.eos_token_id], tokenizer)
+ ```
+
  ## MIMIC-CXR & MIMIC-IV-ED dataset:
 
  MIMIC-CXR, MIMIC-CXR-JPG, and MIMIC-IV-ED must be in the same PhysioNet directory. E.g.:
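
A note on the single-study inference snippet added in this commit: its arguments to `model.prepare_inputs` appear to be wrapped as a batch of one study (`study_id=[0]`, `indication=[[indication]]`, `history=[[history]]`, and one time delta per image in `image_time_deltas`). The following is a minimal sketch of that nesting only. `wrap_single_study` is a hypothetical helper, not part of the model's API, and the `0.0` passed as `zero_time_delta` is a placeholder for `model.zero_time_delta_value`.

```python
def wrap_single_study(n_images, indication, history, zero_time_delta, study_id=0):
    """Wrap one study's inputs in batch-of-one lists (hypothetical helper)."""
    if n_images < 1:
        raise ValueError('a study needs at least one image')
    return {
        # One inner list per study, holding one time delta per image of that study.
        'image_time_deltas': [[zero_time_delta] * n_images],
        # One entry per study in the batch.
        'study_id': [study_id],
        # Nested the same way the commit's snippet passes them.
        'indication': [[indication]],
        'history': [[history]],
    }


# Placeholder values; in the snippet, zero_time_delta would be model.zero_time_delta_value.
kwargs = wrap_single_study(n_images=2, indication='...', history='...', zero_time_delta=0.0)
```

Under these assumptions, the result could be passed as `model.prepare_inputs(tokenizer=tokenizer, images=images, **kwargs)`, where `n_images` should equal `images.shape[1]`.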