---
library_name: transformers
language:
- en
base_model: microsoft/phi-2
pipeline_tag: text-generation
---
https://arxiv.org/abs/1710.06071
# Model Card for Model ID
![](ft_sections.png){width=50%}
This is a small language model designed for scientific research. It specializes in analyzing clinical trial abstracts and sorts sentences into four key sections: Background, Methods, Results, and Conclusion.
This makes it easier and faster for researchers to understand and organize important information from clinical studies.
## Model Details
- **Developed by:** Salvatore Saporito
- **Language(s) (NLP):** English
- **Finetuned from model:** https://huggingface.co/microsoft/phi-2
### Model Sources [optional]
- **Repository:** Coming soon
## Uses
Automatic identification of sections in (clinical trial) abstracts.
## How to Get Started with the Model
Prompt format:
```
###Unstruct:
{abstract}
###Struct:
```
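A minimal sketch of filling the prompt template in Python. The helper name `build_prompt` and the exact newline placement are assumptions; this card only specifies the two markers and the `{abstract}` slot.

```python
# Assumption: one newline after each marker and after the abstract.
PROMPT_TEMPLATE = "###Unstruct:\n{abstract}\n###Struct:\n"


def build_prompt(abstract: str) -> str:
    """Wrap a raw abstract in the prompt markers described in this card."""
    return PROMPT_TEMPLATE.format(abstract=abstract.strip())


if __name__ == "__main__":
    print(build_prompt("Aspirin was compared with placebo in 200 patients."))
```

The resulting string can then be tokenized and passed to the `generate` method of a causal language model loaded with `transformers`; the text produced after `###Struct:` is expected to be the sectioned abstract.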
## Training Details
### Training Data
50k randomly sampled randomized clinical trial (RCT) abstracts published between 1970 and 2023.
Abstracts were retrieved from MEDLINE using Biopython.
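A rough sketch of how such a retrieval could look with Biopython's `Bio.Entrez` and `Bio.Medline` modules. The query string, the helper names `build_query` and `fetch_abstracts`, and the `retmax` value are assumptions for illustration, not the author's actual script. The sketch returns the first matches rather than a random sample.

```python
def build_query(start_year: int, end_year: int) -> str:
    """PubMed query for RCT records with an abstract in a publication-date range."""
    return (
        "randomized controlled trial[pt] AND hasabstract "
        f"AND {start_year}:{end_year}[dp]"
    )


def fetch_abstracts(email, start_year=1970, end_year=2023, retmax=100):
    """Return (PMID, abstract) tuples for the first `retmax` matching records."""
    # Imported lazily so build_query can be used without Biopython installed.
    from Bio import Entrez, Medline

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.esearch(
        db="pubmed", term=build_query(start_year, end_year), retmax=retmax
    )
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        return []

    # Fetch the matching records in MEDLINE text format and parse them.
    handle = Entrez.efetch(
        db="pubmed", id=",".join(ids), rettype="medline", retmode="text"
    )
    records = list(Medline.parse(handle))
    handle.close()
    # "AB" holds the abstract text; skip records without one.
    return [(rec["PMID"], rec["AB"]) for rec in records if "AB" in rec]
```

Calling `fetch_abstracts("you@example.org")` issues live requests to NCBI, so it needs network access and should respect NCBI's rate limits.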
### Training Procedure
1. Generate (unstructured, structured) pairs from structured abstracts.
2. Build a dedicated prompt from each pair for causal language modelling (`CAUSAL_LM`).
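The two steps above can be sketched as follows. This is only an illustration: the `LABEL: text` layout of the structured target and the helper names `make_pair` and `make_training_example` are assumptions, since the card does not state the exact target format.

```python
def make_pair(sections):
    """Turn [(label, text), ...] from a structured abstract into a training pair.

    Returns (unstructured, structured): the plain running text, and the same
    text with one labelled section per line (assumed layout).
    """
    unstructured = " ".join(text.strip() for _, text in sections)
    structured = "\n".join(
        f"{label.upper()}: {text.strip()}" for label, text in sections
    )
    return unstructured, structured


def make_training_example(sections):
    """Render a pair with the prompt markers used by this model."""
    unstructured, structured = make_pair(sections)
    return f"###Unstruct:\n{unstructured}\n###Struct:\n{structured}"


if __name__ == "__main__":
    example = [
        ("Background", "Drug X may lower blood pressure."),
        ("Methods", "We randomized 200 patients to drug X or placebo."),
    ]
    print(make_training_example(example))
```

During fine-tuning the full string (prompt plus structured target) would serve as one causal-LM training sequence; at inference time only the part up to `###Struct:` is supplied.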
#### Training Hyperparameters
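```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 weights with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type='nf4',
                                bnb_4bit_compute_dtype=torch.bfloat16,
                                bnb_4bit_use_double_quant=True)
```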
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
10k randomly sampled RCT abstracts published between 1970 and 2023.
#### Metrics
### Results
#### Summary
## Technical Specifications [optional]
### Model Architecture and Objective
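```python
from peft import LoraConfig

# LoRA adapters on the attention projections and both MLP layers.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "dense", "fc1", "fc2"],
    bias="none",
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```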
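As a back-of-the-envelope check on what this configuration trains, the sketch below counts the parameters LoRA adds. Each adapted linear layer of shape `d_in x d_out` receives two low-rank matrices with `r * (d_in + d_out)` weights in total. The phi-2 dimensions used here (hidden size 2560, MLP inner size 10240, 32 decoder layers) are assumptions not stated in this card, and `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(r, shapes, n_layers):
    """Trainable weights added by LoRA with rank r.

    shapes maps module name -> (d_in, d_out) of the adapted linear layer;
    every decoder layer is assumed to contain each module exactly once.
    """
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * n_layers


# Assumed phi-2 dimensions (not given in this card).
HIDDEN, INNER, N_LAYERS = 2560, 10240, 32
PHI2_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "dense": (HIDDEN, HIDDEN),  # attention output projection
    "fc1": (HIDDEN, INNER),
    "fc2": (INNER, HIDDEN),
}

if __name__ == "__main__":
    total = lora_param_count(16, PHI2_SHAPES, N_LAYERS)
    print(f"LoRA trainable parameters: {total:,}")
```

With `bias="none"` no bias terms are trained, so under these assumptions the count covers all trainable weights. With `r=16` and `lora_alpha=32`, the low-rank update is scaled by `lora_alpha / r = 2`.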
### Compute Infrastructure
#### Hardware
1 x RTX4090 - 24 GB
#### Software
`torch`, `einops`, `transformers`, `bitsandbytes`, `accelerate`, `peft`
## Model Card Contact