---
license: cc-by-nc-4.0
tags:
- merge
---

# Chatty-2x8B
## Description

After some testing, fine-tuning, and multiple merges of Llama-3 models, here is something a little different.

This model is a MoE of two Llama-3 8B models, each trained on a different RP format.

This repo contains FP16 files of Chatty-2x8B.
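
If you want to try the FP16 weights directly, loading them is the usual Llama-3 `transformers` setup. A minimal sketch, with a placeholder repo id (replace it with this repo's actual path):

```python
# Minimal loading sketch for the FP16 files; "user/Chatty-2x8B" is a
# placeholder repo id, not necessarily this repo's real path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "user/Chatty-2x8B"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the FP16 files in this repo
    device_map="auto",          # requires accelerate
)
```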
## The idea

I started with two separate Llama-3-Instruct-8B models, each fine-tuned for a specific RP format.

Here are two simple examples of how each expert was trained:
- **Expert 1**: This model is trained to handle RP that requires actions and descriptions between asterisks. For example:

```
*nods* Yes, I understand.
```

- **Expert 2**: This model is fine-tuned for plain text RP where characters’ dialogues and actions are described straightforwardly. For example:

```
Nods. "Yes, I understand."
```
My initial idea was to make an 11B or bigger Llama-3 model, or just a 2x8B MoE from existing models, but I ran into issues: they were not stable enough. Even after DPO and full fine-tuning (FFT) on top of my Llama-3 frankenmerges/MoEs, they did not work well enough to release.
So I tried the idea of training two different RP formats on two separate Llama-3-Instruct-8B models instead, and it worked pretty well!
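
For the curious, this kind of two-expert MoE can be assembled with mergekit's `mergekit-moe` tool. Below is a hypothetical config sketch written out from Python; the expert paths and gating prompts are illustrative placeholders, not the exact recipe used for Chatty-2x8B:

```python
# Hypothetical mergekit-moe config for a 2x8B Llama-3 MoE; the paths and
# positive_prompts below are placeholders, not the actual Chatty-2x8B recipe.
import yaml

config = {
    "base_model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "gate_mode": "hidden",  # route tokens by hidden-state similarity to the prompts below
    "dtype": "float16",
    "experts": [
        {
            "source_model": "path/to/asterisk-rp-expert",  # placeholder
            "positive_prompts": ["*nods* Yes, I understand."],
        },
        {
            "source_model": "path/to/plaintext-rp-expert",  # placeholder
            "positive_prompts": ['Nods. "Yes, I understand."'],
        },
    ],
}

with open("chatty-moe.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
# Build with: mergekit-moe chatty-moe.yml ./Chatty-2x8B
```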
## The dataset

Following the success of Lumimaid 8B OAS, I kept the same "balance" between RP and non-RP data in the dataset: at most 50% non-RP data on each side.
The RP data differed between the two experts, with a few exceptions, while the non-RP data was exactly the same on both sides. Despite that overlap, I couldn't get the model to produce repetition, so reusing the non-RP datasets didn't hurt the model in the end.
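
To illustrate that 50% cap, here is a toy sketch of the mixing logic; the sample lists are placeholders, and this is not the actual training pipeline:

```python
# Toy sketch of the 50% non-RP cap; rp and non_rp are placeholder lists.
import random

def mix(rp: list, non_rp: list, seed: int = 42) -> list:
    """Mix RP and non-RP samples so non-RP is at most 50% of the result."""
    capped = non_rp[: len(rp)]  # non-RP count never exceeds RP count
    mixed = rp + capped
    random.Random(seed).shuffle(mixed)
    return mixed
```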
## Prompt template: Llama3

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
```
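
Since this is the stock Llama-3-Instruct format, the tokenizer's built-in chat template should produce it for you. A minimal sketch, assuming the repo ships the standard Llama-3 tokenizer config (repo id is a placeholder):

```python
# Build the Llama3 prompt with the tokenizer's chat template; assumes the
# standard Llama-3-Instruct tokenizer_config ships with the repo.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("user/Chatty-2x8B")  # placeholder
messages = [
    {"role": "system", "content": "You are a roleplay partner."},
    {"role": "user", "content": "*waves* Hi there!"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # ends with the assistant header so the model replies
)
print(prompt)
```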
## Others

Undi: If you want to support us, you can do so [here](https://ko-fi.com/undiai).

IkariDev: Visit my [retro/neocities style website](https://ikaridevgit.github.io/) please kek
## Evaluation

|Tasks         |Version|Filter          |n-shot|Metric     |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|arc_challenge |      1|none            |     0|acc        |0.5469|±  |0.0145|
|              |       |none            |     0|acc_norm   |0.5853|±  |0.0144|
|arc_easy      |      1|none            |     0|acc        |0.8308|±  |0.0077|
|              |       |none            |     0|acc_norm   |0.8258|±  |0.0078|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7149|±  |0.0124|
|              |       |flexible-extract|     5|exact_match|0.7096|±  |0.0125|
|hellaswag     |      1|none            |     0|acc        |0.5945|±  |0.0049|
|              |       |none            |     0|acc_norm   |0.7806|±  |0.0041|
|piqa          |      1|none            |     0|acc        |0.7943|±  |0.0094|
|              |       |none            |     0|acc_norm   |0.7998|±  |0.0093|
|truthfulqa_mc2|      2|none            |     0|acc        |0.5097|±  |0.0150|
|winogrande    |      1|none            |     0|acc        |0.7356|±  |0.0124|
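
These numbers follow the lm-evaluation-harness output format; a reproduction sketch, assuming lm-eval v0.4+ and a placeholder repo id (few-shot counts are left to each task's defaults):

```python
# Sketch of re-running the benchmarks with lm-evaluation-harness (v0.4+);
# the pretrained id is a placeholder and few-shot counts use task defaults.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=user/Chatty-2x8B,dtype=float16",
    tasks=[
        "arc_challenge", "arc_easy", "gsm8k",
        "hellaswag", "piqa", "truthfulqa_mc2", "winogrande",
    ],
)
print(results["results"])
```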