adamo1139
/

Yi-34b-200K-rawrr-v2-run-0902-LoRA

Model card Files Files and versions Community

Yi-34b-200K-rawrr-v2-run-0902-LoRA / README.md

adamo1139's picture

Update README.md

3bc6a67 verified 8 months ago

|

history blame contribute delete

1.02 kB

	---
	license: apache-2.0
	tags:
	- lora
	- qlora
	- adapter
	---
	This is not an instruct fine tune, instead it's an attempt to de-contaminate the model, remove gptslop and refusals. I want model to feel like it was trained on human data, not synthetic one.


	About 961 steps total, Yi-34B-200K llamafied DPO trained for 1 epoch on rawrr_v2 dataset via unsloth qlora at prompt length of 400 and max length of 700, lr 0.000045 \
	Model initialized with max_positional_embeddings of 4096 to not OOM. \
	Training done on RTX 3090 Ti in about 14 hours. \
	Average mem usage was like 23.89 / 23.99 GiB, so very close to OOM at all times. \
	I trained it with XFCE on one 1080p monitor loaded up, on more fancy DM it would probably OOM with the same setup. \
	I am not sure what's the purpose of max_prompt_length being separate from max_length, so I may have used it wrong, I should read up on it. \
	Script I used to do this fine-tune is in the repo. I used chatml prompt format. Now I plan to fine-tune this on AEZAKMI v3 dataset soon.