---
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- medical
license: cc-by-nc-3.0
---
# MedFalcon v2.1a 40b LoRA - Step 4500
![img.png](img.png)
## Model Description
This is a model checkpoint release at 4500 steps. For evaluation use only! Limitations:
* LoRA output will be more concise than the base model
* Due to the LoRA's size, base knowledge from falcon-40b may be overwritten
* Due to the LoRA's size, more hardware may be required to load falcon-40b when using this LoRA
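
To give a rough sense of the hardware involved, the sketch below estimates the memory needed for the base model's weights alone at different precisions. It assumes exactly 40 billion parameters (the figure quoted in this card) and ignores activations, the KV cache, quantization overhead, and the LoRA weights themselves, so real usage will be higher.

```python
# Weight-only memory estimate for the base model at several precisions.
# Assumption: 40e9 parameters, as quoted in this card; real usage is higher
# because activations, KV cache, and the LoRA adapter are not counted.
PARAMS = 40e9

# Bytes needed to store one parameter in each loading mode.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Return the weight storage size in GiB."""
    return n_params * bytes_per_param / 2**30


estimates = {
    mode: weight_memory_gib(PARAMS, size) for mode, size in BYTES_PER_PARAM.items()
}

for mode, gib in estimates.items():
    print(f"{mode:>10}: ~{gib:.1f} GiB")
```

These figures are only a lower bound for planning; they say nothing about how the weights are split across devices.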
### Architecture
`nmitchko/medfalconv2-1a-40b-lora` is a large language model LoRA specifically fine-tuned for medical domain tasks.
It is based on [`Falcon-40b`](https://huggingface.co/tiiuae/falcon-40b) at 40 billion parameters.
The primary goal of this model is to improve question-answering and medical dialogue tasks.
It was trained using [LoRA](https://arxiv.org/abs/2106.09685), specifically [QLoRA](https://github.com/artidoro/qlora), to reduce memory footprint.
See Training Parameters for more info. This LoRA supports 4-bit and 8-bit modes.
### Requirements
```
bitsandbytes>=0.39.0
peft
transformers
```
Steps to load this model:
1. Load base model using transformers
2. Apply LoRA using peft
```python
import torch
import transformers
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "tiiuae/falcon-40b"
lora_id = "nmitchko/medfalconv2-1a-40b-lora"

# If you want 8 or 4 bit set the appropriate flags (requires bitsandbytes)
load_8bit = True

# 1. Load the base model using transformers
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    load_in_8bit=load_8bit,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)

# 2. Apply the LoRA using peft
model = PeftModel.from_pretrained(model, lora_id)

# The model is already loaded and placed on devices, so the pipeline only wraps it
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

sequences = pipeline(
    "What does the drug ceftriaxone do?\nDoctor:",
    max_length=200,
    do_sample=True,
    top_k=40,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
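
The loading example mentions 8-bit and 4-bit flags but only sets one of them. As a minimal sketch, the hypothetical helper `quantization_kwargs` below picks the keyword arguments for `AutoModelForCausalLM.from_pretrained` from a bit width. It assumes a `transformers` release that still accepts `load_in_8bit` and `load_in_4bit` directly; newer releases prefer passing a `BitsAndBytesConfig` through `quantization_config`.

```python
def quantization_kwargs(bits: int) -> dict:
    """Hypothetical helper: map a bit width to from_pretrained keyword arguments.

    Assumes a transformers version that accepts load_in_8bit / load_in_4bit
    as direct keyword arguments (both require bitsandbytes).
    """
    if bits == 16:
        return {}  # no quantization, load in the requested torch_dtype
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {"load_in_4bit": True}
    raise ValueError(f"Unsupported bit width: {bits} (expected 4, 8, or 16)")


# Intended use (not executed here, since it would download the 40b model):
#   model = AutoModelForCausalLM.from_pretrained(
#       "tiiuae/falcon-40b",
#       torch_dtype=torch.float16,
#       trust_remote_code=True,
#       device_map="auto",
#       **quantization_kwargs(4),
#   )
print(quantization_kwargs(4))
```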
## Training Parameters
The model was trained for 4500 steps (1 epoch) on a custom, unreleased dataset named `medconcat`.
`medconcat` contains only human-generated content and weighs in at over 100 MiB of raw text.
The bash script below started training in 4-bit mode for a rather large LoRA, with these key hyperparameters:
| Item | Amount | Units |
|---------------|--------|-------|
| LoRA Rank | 128 | ~ |
| LoRA Alpha | 256 | ~ |
| Learning Rate | 1e-4   | ~     |
| Dropout | 5 | % |
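
As a rough illustration of what the rank and alpha values in the table imply: in the standard LoRA formulation the low-rank update is scaled by `alpha / r`, and each adapted linear layer gains `r * (d_in + d_out)` trainable parameters. The sketch below works through that arithmetic. The 8192-wide square projection is an assumption chosen for illustration, not a figure taken from this card.

```python
LORA_R = 128      # LoRA rank, from the table
LORA_ALPHA = 256  # LoRA alpha, from the table


def lora_scaling(alpha: int, r: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


D = 8192  # assumption: illustrative width of a square projection layer

scaling = lora_scaling(LORA_ALPHA, LORA_R)
adapter = lora_params(D, D, LORA_R)
base = D * D

print(f"scaling factor: {scaling}")
print(f"adapter params per layer: {adapter:,}")
print(f"fraction of the base layer: {adapter / base:.4f}")
```

A rank of 128 on every linear module is large by LoRA standards, which is consistent with the limitations listed in the model description.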
```bash
CURRENTDATEONLY=`date +"%b %d %Y"`
sudo nvidia-smi -i 1 -pl 250
export CUDA_VISIBLE_DEVICES=0
nohup python qlora.py \
--model_name_or_path models/tiiuae_falcon-40b \
--output_dir ./loras/medfalcon2.1a-40b \
--logging_steps 100 \
--save_strategy steps \
--data_seed 42 \
--save_steps 200 \
--save_total_limit 40 \
--evaluation_strategy steps \
--eval_dataset_size 1024 \
--max_eval_samples 1000 \
--per_device_eval_batch_size 1 \
--max_new_tokens 32 \
--dataloader_num_workers 3 \
--group_by_length \
--logging_strategy steps \
--remove_unused_columns False \
--do_train \
--lora_r 128 \
--lora_alpha 256 \
--lora_modules all \
--double_quant \
--quant_type nf4 \
--bf16 \
--bits 4 \
--warmup_ratio 0.03 \
--lr_scheduler_type constant \
--gradient_checkpointing \
--dataset="training/datasets/medconcat/" \
--dataset_format alpaca \
--trust_remote_code=True \
--source_max_len 16 \
--target_max_len 512 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--max_steps 4500 \
--eval_steps 1000 \
--learning_rate 0.0001 \
--adam_beta2 0.999 \
--max_grad_norm 0.3 \
--lora_dropout 0.05 \
--weight_decay 0.0 \
--seed 0 > "${CURRENTDATEONLY}-finetune-medfalcon2.1a.log" &
```
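
The batch-related flags in the script combine into an effective batch size per optimizer step. The sketch below computes it, along with the number of sequences processed over the whole run. It assumes a single GPU, which is what `CUDA_VISIBLE_DEVICES=0` suggests.

```python
PER_DEVICE_BATCH = 1  # --per_device_train_batch_size
GRAD_ACCUM = 16       # --gradient_accumulation_steps
MAX_STEPS = 4500      # --max_steps
NUM_DEVICES = 1       # assumption: one visible GPU (CUDA_VISIBLE_DEVICES=0)

# Sequences contributing to each optimizer step.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES

# Sequences processed across the full run.
sequences_seen = effective_batch * MAX_STEPS

print(f"effective batch size: {effective_batch}")
print(f"sequences seen in {MAX_STEPS} steps: {sequences_seen:,}")
```

If 4500 steps correspond to one epoch as stated, this suggests a training split on the order of 72,000 examples, though the card does not give the dataset's example count.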