---
library_name: transformers
tags:
- trl
- dpo
---
# Model Card: Llama-3-8B Fine-Tuned with DPO on Orca
## Model Details
A Llama-3-8B model fine-tuned on the Orca preference dataset using Direct Preference Optimization (DPO).
## Training Details
### Training Data
Trained on the Orca preference-pair dataset for Direct Preference Optimization (DPO).
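DPO training consumes preference pairs rather than plain text. As a hedged illustration (the field contents below are invented, and the actual Orca DPO dataset may use slightly different column names that need remapping), each record pairs a prompt with a preferred and a rejected response:

```python
# Illustrative preference record in the prompt/chosen/rejected layout that
# TRL's DPOTrainer consumes. The text values are made up for illustration.
record = {
    "prompt": "What is the capital of France?",
    "chosen": "The capital of France is Paris.",
    "rejected": "France does not have a capital.",
}

# DPO trains the policy to assign higher likelihood to "chosen" than
# to "rejected" for the same prompt.
required_keys = {"prompt", "chosen", "rejected"}
assert required_keys.issubset(record)
```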
### Training Procedure
NEFTune noise is added to the embedding layer for robustness, and the model is fine-tuned with the TRL DPO trainer using a LoRA adapter on a 4-bit quantized base model.
#### Training Hyperparameters
- lora_alpha = 16
- lora_r = 64
- lora_dropout = 0.1
- adam_beta1 = 0.9
- adam_beta2 = 0.999
- weight_decay = 0.001
- max_grad_norm = 0.3
- learning_rate = 2e-4
- bnb_4bit_quant_type = nf4
- optim = "paged_adamw_32bit"
- max_steps = 5000
- gradient_accumulation_steps = 4
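The hyperparameters above can be assembled into a training setup roughly as follows. This is a minimal sketch, not the exact training script: the base model repo id, dataset id, DPO `beta`, NEFTune alpha, and compute dtype are assumptions not stated in the card, and the Orca dataset columns may need remapping to `prompt`/`chosen`/`rejected` before training.

```python
# Hedged sketch of the DPO fine-tuning setup; values marked "from the card"
# come from the hyperparameter list above, everything else is an assumption.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Meta-Llama-3-8B"  # assumed base-model repo id

# 4-bit NF4 quantization of the base model (bnb_4bit_quant_type from the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
)

# LoRA adapter (r, alpha, dropout from the card).
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings from the card; neftune_noise_alpha enables
# NEFTune embedding noise (the alpha value here is an assumption).
training_args = DPOConfig(
    output_dir="llama3-8b-orca-dpo",
    learning_rate=2e-4,
    max_steps=5000,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.001,
    max_grad_norm=0.3,
    neftune_noise_alpha=5,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")  # assumed dataset id

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
    peft_config=peft_config,
)
trainer.train()
```

With a PEFT config supplied, `DPOTrainer` uses the frozen base model as its own implicit reference model, so no separate reference copy needs to be loaded.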