DrSyedFaizan committed on
Commit b03ac4f · verified · 1 Parent(s): 83daae7

Update README.md

Files changed (1)
  1. README.md +138 -3
README.md CHANGED
@@ -1,3 +1,138 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ ### 🩺 Clinical Diagnosis Application & medReport Model
+
+ Welcome to the Clinical Diagnosis Application, an NLP-powered deep learning tool for automated medical diagnosis from clinical notes. The project uses BioBERT and Hugging Face Transformers to analyze patient reports and predict diseases with high accuracy.
+
+ 🚀 Live Model Hosted on Hugging Face: DrSyedFaizan/medReport
+
+ 🔬 Overview
+ medReport is a fine-tuned BioBERT model trained on clinical text to predict diseases from patient reports. The associated Clinical Diagnosis App lets users upload medical notes (PDF/TXT) and receive a disease prediction along with recommended medications and specialists.
+
+ ✨ Features
+ ✅ Fine-tuned BioBERT Model for medical text classification
+ ✅ Predict diseases from clinical notes
+ ✅ Extract text from PDFs and TXT files
+ ✅ Recommend medications & specialists based on prediction
+ ✅ Streamlit-powered web app for easy access
+ ✅ Deployable on Hugging Face Spaces / Local Server
+
+ 📂 Project Structure
+
+ 📁 Clinical-Diagnosis-App/
+ │── 📂 patient_model/ # Trained BioBERT model files
+ │── 📂 results/ # Model training results & logs
+ │── 📂 sample_data/ # Sample clinical reports
+ │── 📜 app.py # Streamlit-based UI for predictions
+ │── 📜 requirements.txt # Required dependencies
+ │── 📜 README.md # Documentation
+ │── 📜 label_encoder.pkl # Pre-trained Label Encoder
+ │── 📜 clinical_notes.csv # Sample dataset
+ 🚀 Installation & Setup
+ 1️⃣ Clone the Repository
+
+ git clone https://github.com/SYEDFAIZAN1987/Clinical-Diagnosis-Application-using-Natural-Language-Processing.git
+ cd Clinical-Diagnosis-Application-using-Natural-Language-Processing
+ 2️⃣ Install Dependencies
+
+ pip install -r requirements.txt
+ 3️⃣ Run the Application
+
+ streamlit run app.py
+ The app will launch at http://localhost:8501 🎉
+
+ 📌 Model Details
+ The medReport model is BioBERT, a biomedical language model, fine-tuned on a clinical notes dataset. It is trained for multi-class classification: given unstructured clinical text, it predicts a single disease label.
+
+ 🔗 Load the Model
+ You can access the trained model directly via Hugging Face:
+
+ from transformers import BertForSequenceClassification, BertTokenizer
+ from huggingface_hub import hf_hub_download
+ import pickle
+ import torch
+
+ # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
+ model = BertForSequenceClassification.from_pretrained("DrSyedFaizan/medReport")
+ tokenizer = BertTokenizer.from_pretrained("DrSyedFaizan/medReport")
+
+ # Download the label encoder that maps class indices back to disease names
+ # (unpickling needs the library it was created with, presumably scikit-learn)
+ label_encoder_path = hf_hub_download(repo_id="DrSyedFaizan/medReport", filename="label_encoder.pkl")
+ with open(label_encoder_path, "rb") as f:
+     label_encoder = pickle.load(f)
+ 📊 Performance Metrics
+ | Metric | Score |
+ | --- | --- |
+ | Accuracy | 100% |
+
+ ✅ Trained on BioBERT
+ ✅ Optimized with AdamW
+ ✅ Fine-tuned for Clinical NLP
+
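For reference, accuracy here means the share of notes whose predicted disease matches the true label. A minimal sketch of that calculation, assuming a set of labelled notes is available; the `accuracy` helper and the toy labels are illustrative and are not taken from the project's evaluation code:

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels for illustration only (not real evaluation data)
y_true = ["GERD", "COPD", "Diabetic Neuropathy", "GERD"]
y_pred = ["GERD", "COPD", "Diabetic Neuropathy", "COPD"]
print(f"Accuracy: {accuracy(y_true, y_pred):.0%}")
```

The figure is most informative when computed on notes the model did not see during fine-tuning.
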
+ 📖 Usage
+ 🔹 Predict Disease from a Clinical Note
+ def predict_disease(text, model, tokenizer, label_encoder):
+     # Tokenize the note, truncating to BERT's 512-token limit
+     inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
+     with torch.no_grad():  # inference only, no gradients needed
+         outputs = model(**inputs)
+     logits = outputs.logits
+     # Pick the highest-scoring class and map its index back to the disease name
+     predicted_label = torch.argmax(logits, dim=1).item()
+     return label_encoder.inverse_transform([predicted_label])[0]
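
`predict_disease` returns only the top class. If a confidence score or a short list of alternatives is useful, the logits can be turned into probabilities with a softmax. The sketch below works on plain Python lists so it runs without the model; the `softmax` and `top_k_predictions` helpers, the logits, and the class names are illustrative assumptions. With the real model, the inputs would be `outputs.logits[0].tolist()` and, assuming a scikit-learn `LabelEncoder`, `list(label_encoder.classes_)`.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative values only, not real model output
example_logits = [2.0, 0.5, -1.0]
example_classes = ["GERD", "COPD", "Diabetic Neuropathy"]
for name, prob in top_k_predictions(example_logits, example_classes, k=2):
    print(f"{name}: {prob:.2f}")
```
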
+ 🎨 Web App UI (Streamlit)
+ The Streamlit UI supports drag-and-drop upload of PDF/TXT files for quick disease predictions.
+
+ 📥 Upload Clinical Notes
+ 1️⃣ Upload clinical notes (PDF or TXT)
+ 2️⃣ Extract text from reports
+ 3️⃣ Predict disease
+ 4️⃣ Get medication & specialist recommendations
+
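Steps 1 and 2 above could be implemented along these lines. This is a sketch, not the code in `app.py`: the `extract_text` helper is hypothetical, and PDF parsing assumes the third-party `pypdf` package is installed, since the README does not say which PDF library the app uses.

```python
import io


def extract_text(file_bytes, filename):
    """Return the plain text of an uploaded TXT or PDF file."""
    name = filename.lower()
    if name.endswith(".txt"):
        # Replace undecodable bytes instead of failing on odd encodings
        return file_bytes.decode("utf-8", errors="replace").strip()
    if name.endswith(".pdf"):
        # Imported lazily so TXT-only use does not require pypdf (assumed dependency)
        from pypdf import PdfReader

        reader = PdfReader(io.BytesIO(file_bytes))
        # extract_text() may yield nothing for image-only pages
        pages = [page.extract_text() or "" for page in reader.pages]
        return "\n".join(pages).strip()
    raise ValueError(f"Unsupported file type: {filename}")


note = extract_text(b"Patient reports persistent heartburn.\n", "note.txt")
print(note)
```

In a Streamlit app, the bytes and file name would typically come from the object returned by `st.file_uploader`, via `uploaded_file.getvalue()` and `uploaded_file.name`.
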
+ 🏥 Example Predictions
+ | Clinical Note | Predicted Disease | Medications | Specialists |
+ | --- | --- | --- | --- |
+ | "Patient reports persistent heartburn..." | Gastroesophageal Reflux Disease (GERD) | Omeprazole, Ranitidine | Gastroenterologist |
+ | "Male patient with history of smoking, chronic cough..." | Chronic Obstructive Pulmonary Disease (COPD) | Tiotropium, Albuterol | Pulmonologist |
+ | "Elderly patient with diabetes, experiencing numbness..." | Diabetic Neuropathy | Metformin, Insulin | Endocrinologist |
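
One simple way to attach medications and specialists to a prediction is a lookup table keyed by disease name. The sketch below uses the three rows of the table above; the `RECOMMENDATIONS` dictionary and the `recommend` helper are hypothetical, and how `app.py` actually stores this mapping is not specified here.

```python
# Values copied from the example table; the structure itself is an assumption
RECOMMENDATIONS = {
    "Gastroesophageal Reflux Disease (GERD)": {
        "medications": ["Omeprazole", "Ranitidine"],
        "specialist": "Gastroenterologist",
    },
    "Chronic Obstructive Pulmonary Disease (COPD)": {
        "medications": ["Tiotropium", "Albuterol"],
        "specialist": "Pulmonologist",
    },
    "Diabetic Neuropathy": {
        "medications": ["Metformin", "Insulin"],
        "specialist": "Endocrinologist",
    },
}


def recommend(disease):
    """Look up medications and specialist for a predicted disease.

    Unknown diseases yield an empty recommendation rather than an error.
    """
    return RECOMMENDATIONS.get(disease, {"medications": [], "specialist": None})


result = recommend("Diabetic Neuropathy")
print(result["medications"], result["specialist"])
```
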
+ 🌍 Deployment Options
+ 1️⃣ Run Locally with Streamlit
+
+ streamlit run app.py
+ 2️⃣ Deploy on Hugging Face Spaces
+
+ Create a Streamlit Space on Hugging Face
+ Upload the repository
+ Add a requirements.txt file
+ The Space then runs app.py automatically
+ 3️⃣ Deploy on Cloud (AWS, GCP, Azure)
+
+ Use FastAPI + Uvicorn
+ Deploy via Docker / Kubernetes
+ 🛠️ Tech Stack
+ ✔ BioBERT (Fine-Tuned)
+ ✔ Transformers (Hugging Face)
+ ✔ PyTorch (Deep Learning)
+ ✔ Streamlit (UI Framework)
+ ✔ Hugging Face Hub (Model Hosting)
+
+ 🧑‍💻 Contribution
+ 🤝 Contributions are welcome!
+ If you'd like to improve the model or app, feel free to fork the repo and submit a pull request.
+
+ Fork the repository
+ Clone locally
+ Create a branch (git checkout -b feature-new)
+ Commit changes (git commit -m "Added feature X")
+ Push & Submit a PR
+ 📩 Contact
+ 💡 Author: Syed Faizan, MD
+ 📧 Email: [email protected]
+ 🤖 Hugging Face: DrSyedFaizan
+ 📂 GitHub: SYEDFAIZAN1987