# T5-Small Fine-tuned for Clinical Summarization of FHIR Document Reference Clinical Notes

This model is a fine-tuned version of the `t5-small` model from Hugging Face, specifically tailored for the clinical summarization of FHIR Document Reference Clinical Notes.

## Model Details

- **Original Model**: [T5-Small](https://huggingface.co/t5-small)
- **Fine-tuned Model**: [dlyog/t5-small-finetuned](https://huggingface.co/dlyog/t5-small-finetuned/)
- **License**: Apache-2.0 (same as the original T5 license)

## Fine-tuning Process

The model was fine-tuned on a synthetic dataset created with tools like [Synthea](https://synthetichealth.github.io/synthea/). This dataset simulates real-world clinical notes, exposing the model to realistic medical terminology and context.

Only the last two layers of the `t5-small` model were fine-tuned to retain most of the pre-trained knowledge while adapting it for better clinical summarization.
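
As a rough sketch, this kind of partial fine-tuning can be set up by freezing all parameters and then re-enabling gradients on the final blocks only. The snippet below assumes PyTorch and the `transformers` library; interpreting "the last two layers" as the last two decoder blocks is an assumption, since the model card does not name the exact layers:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Freeze every parameter first
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the last two decoder blocks (an assumed reading of
# "the last two layers"; the model card does not specify which)
for block in model.decoder.block[-2:]:
    for param in block.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable} / {total} parameters")
```

Printing the trainable-parameter count is a quick sanity check that the freeze took effect before training starts.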

## Usage

Using the model is straightforward with the Hugging Face Transformers library:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("dlyog/t5-small-finetuned")
tokenizer = T5Tokenizer.from_pretrained("dlyog/t5-small-finetuned")

def summarize(text):
    # T5 expects a task prefix on the input
    input_text = "summarize: " + text
    input_ids = tokenizer.encode(input_text, return_tensors="pt", truncation=True)
    # Generation settings are illustrative; tune them for your notes
    summary_ids = model.generate(input_ids, max_length=150, num_beams=4, early_stopping=True)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Example
text = "Your clinical note here..."
print(summarize(text))
```

## Acknowledgements

A big thanks to the creators of the original `t5-small` model and the Hugging Face community, and to tools like Synthea that enabled the creation of high-quality synthetic datasets for fine-tuning.

## License

This model is licensed under the Apache-2.0 License, the same as the original T5 model.