---
library_name: transformers
tags:
- trl
- dpo
---

# Model Card for Llama-3-8B Orca-DPO

## Model Details

Llama-3-8B fine-tuned on the Orca-DPO dataset using Direct Preference Optimization (DPO).
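
A minimal loading sketch, assuming the fine-tuned weights are published as a PEFT LoRA adapter on top of `meta-llama/Meta-Llama-3-8B`; the adapter repository id below is a placeholder, not an actual model id.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B"           # assumed base model
adapter_id = "your-username/llama3-8b-orca-dpo"  # placeholder adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the DPO-trained LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Explain Direct Preference Optimization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
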

## Training Details

### Training Data

Trained on the Orca-DPO dataset: preference pairs consisting of a prompt, a chosen (preferred) response, and a rejected response.
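
As a sketch of the expected data layout, the preference pairs can be mapped to the `prompt` / `chosen` / `rejected` fields that trl's `DPOTrainer` consumes; the dataset id `Intel/orca_dpo_pairs` and its column names are assumptions, since this card only names the Orca-DPO dataset.

```python
from datasets import load_dataset

# Assumed dataset id and column names.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

def to_preference_format(example):
    # DPOTrainer expects one prompt plus a preferred and a dispreferred completion.
    return {
        "prompt": example["question"],
        "chosen": example["chosen"],
        "rejected": example["rejected"],
    }

dataset = dataset.map(to_preference_format, remove_columns=dataset.column_names)
```
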

### Training Procedure

NEFTune noise is added to the token embeddings for robustness, and the model is fine-tuned with the TRL DPO trainer.
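
A condensed sketch of this procedure, assuming trl's `DPOTrainer` on top of a 4-bit (QLoRA-style) base model. NEFTune is switched on through the `neftune_noise_alpha` training argument; the alpha value of 5, the DPO `beta` of 0.1, the dataset id, and the output directory are assumptions not stated in this card, and keyword names vary somewhat across trl releases (newer versions take `processing_class` instead of `tokenizer`).

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base_id = "meta-llama/Meta-Llama-3-8B"  # assumed base model

# Load the policy model in 4-bit NF4 so the 8B model can be tuned on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token

# Preference pairs in prompt/chosen/rejected form (see Training Data above).
raw = load_dataset("Intel/orca_dpo_pairs", split="train")  # dataset id assumed
train_dataset = raw.map(
    lambda x: {"prompt": x["question"], "chosen": x["chosen"], "rejected": x["rejected"]},
    remove_columns=raw.column_names,
)

# LoRA adapter; values from the hyperparameters listed below.
peft_config = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM")

training_args = DPOConfig(
    output_dir="llama3-8b-orca-dpo",  # assumed
    beta=0.1,                         # DPO temperature (assumed default)
    neftune_noise_alpha=5,            # enables NEFTune embedding noise (alpha assumed)
    # remaining optimizer settings as listed in the hyperparameters section below
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,      # `processing_class` in newer trl releases
    peft_config=peft_config,  # the LoRA adapter is created inside the trainer
)
trainer.train()
```
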

#### Training Hyperparameters

- lora_alpha = 16
- lora_r = 64
- lora_dropout = 0.1
- adam_beta1 = 0.9
- adam_beta2 = 0.999
- weight_decay = 0.001
- max_grad_norm = 0.3
- learning_rate = 2e-4
- bnb_4bit_quant_type = "nf4"
- optim = "paged_adamw_32bit"
- max_steps = 5000
- gradient_accumulation_steps = 4
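
As a sketch, these values map onto the usual config objects of a QLoRA + trl setup roughly as follows; the compute dtype and the output directory are assumptions, not values listed above.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import DPOConfig

# LoRA adapter settings
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# 4-bit quantization of the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed; not listed above
)

# Optimizer / trainer settings (DPOConfig subclasses transformers.TrainingArguments)
training_args = DPOConfig(
    output_dir="llama3-8b-orca-dpo",  # assumed
    learning_rate=2e-4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.001,
    max_grad_norm=0.3,
    optim="paged_adamw_32bit",
    max_steps=5000,
    gradient_accumulation_steps=4,
)
```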