Initial Upload

Browse files

Files changed (4) hide show

README.md +153 -0
adapter_config.json +22 -0
adapter_model.bin +3 -0
img.png +0 -0

README.md ADDED Viewed

	@@ -0,0 +1,153 @@

+---
+language:
+- en
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- medical
+license: cc-by-nc-3.0
+---
+# MedFalcon v2.1a 40b LoRA - Step 4500
+![img.png](img.png)
+## Model Description
+This a model check point release at 4500 steps. For evaluation use only! Limitations:
+* LoRA output will be more concise than the base model
+* Due to the size, base knowledge may be overwritten from falcon-40b
+* Due to the size, more hardware may be required to load falcon-40b when using this LoRA
+### Architecture
+`nmitchko/medfalconv2-1a-40b-lora'` is a large language model LoRa specifically fine-tuned for medical domain tasks.
+It is based on [`Falcon-40b`](https://huggingface.co/tiiuae/falcon-40b) at 40 billion parameters.
+The primary goal of this model is to improve question-answering and medical dialogue tasks.
+It was trained using [LoRA](https://arxiv.org/abs/2106.09685), specifically [QLora](https://github.com/artidoro/qlora), to reduce memory footprint.
+See Training Parameters for more info  This Lora supports 4-bit and 8-bit modes.
+### Requirements
+```
+bitsandbytes>=0.39.0
+peft
+transformers
+```
+Steps to load this model:
+1. Load base model using transformers
+2. Apply LoRA using peft
+```python
+#
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import transformers
+import torch
+from peft import PeftModel
+model = "tiiuae/falcon-40b"
+LoRA = "nmitchko/medfalconv2-1a-40b-lora"
+# If you want 8 or 4 bit set the appropriate flags
+load_8bit = True
+tokenizer = AutoTokenizer.from_pretrained(model)
+model = AutoModelForCausalLM.from_pretrained(model,
+    load_in_8bit=load_8bit,
+    torch_dtype=torch.float16,
+    trust_remote_code=True,
+)
+model = PeftModel.from_pretrained(model, LoRA)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True,
+    device_map="auto",
+)
+sequences = pipeline(
+   "What does the drug ceftrioxone do?\nDoctor:",
+    max_length=200,
+    do_sample=True,
+    top_k=40,
+    num_return_sequences=1,
+    eos_token_id=tokenizer.eos_token_id,
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
+## Training Parameters
+The model was trained for 4500 steps or 1 epoch on a custom, unreleased dataset named `medconcat`.
+`medconcat` contains only human generated content and weighs in at over 100MiB of raw text.
+The below bash script initiated training in `4bit` mode for a rather large LoRA:
+| Item          | Amount | Units |
+|---------------|--------|-------|
+| LoRA Rank     | 128    | ~     |
+| LoRA Alpha    | 256    | ~     |
+| Learning Rate | 1e-3   | SI    |
+| Dropout       | 5      | %     |
+```bash
+CURRENTDATEONLY=`date +"%b %d %Y"`
+sudo nvidia-smi -i 1 -pl 250
+export CUDA_VISIBLE_DEVICES=0
+nohup python qlora.py \
+    --model_name_or_path models/tiiuae_falcon-40b \
+    --output_dir ./loras/medfalcon2.1a-40b \
+    --logging_steps 100 \
+    --save_strategy steps \
+    --data_seed 42 \
+    --save_steps 200 \
+    --save_total_limit 40 \
+    --evaluation_strategy steps \
+    --eval_dataset_size 1024 \
+    --max_eval_samples 1000 \
+    --per_device_eval_batch_size 1 \
+    --max_new_tokens 32 \
+    --dataloader_num_workers 3 \
+    --group_by_length \
+    --logging_strategy steps \
+    --remove_unused_columns False \
+    --do_train \
+    --lora_r 128 \
+    --lora_alpha 256 \
+    --lora_modules all \
+    --double_quant \
+    --quant_type nf4 \
+    --bf16 \
+    --bits 4 \
+    --warmup_ratio 0.03 \
+    --lr_scheduler_type constant \
+    --gradient_checkpointing \
+    --dataset="training/datasets/medconcat/" \
+    --dataset_format alpaca \
+    --trust_remote_code=True \
+    --source_max_len 16 \
+    --target_max_len 512 \
+    --per_device_train_batch_size 1 \
+    --gradient_accumulation_steps 16 \
+    --max_steps 4500 \
+    --eval_steps 1000 \
+    --learning_rate 0.0001 \
+    --adam_beta2 0.999 \
+    --max_grad_norm 0.3 \
+    --lora_dropout 0.05 \
+    --weight_decay 0.0 \
+    --seed 0 > "${CURRENTDATEONLY}-finetune-medfalcon2.1a.log" &
+```

adapter_config.json ADDED Viewed

	@@ -0,0 +1,22 @@

+{
+  "base_model_name_or_path": "models/tiiuae_falcon-40b",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 256.0,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 128,
+  "revision": null,
+  "target_modules": [
+    "dense",
+    "dense_h_to_4h",
+    "dense_4h_to_h",
+    "query_key_value"
+  ],
+  "task_type": "CAUSAL_LM"
+}

adapter_model.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:53b6fb6a01432b56bb4d74e4eb593919a7950f6e28b633fc96c93c7b2c3ad0c0
+size 1777513610

img.png ADDED Viewed