Aura-4B-EXL2 / README.md

Create README.md

844c969 verified 13 days ago

7.85 kB

	---
	license: apache-2.0
	datasets:
	- Mielikki/Erebus-87k
	- FourOhFour/Instruct_Phase
	- FourOhFour/RP_Phase
	- anthracite-core/full-opus-chosen-hermes-rejected-kto-v1
	language:
	- en
	base_model:
	- IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml
	---
	---
	### These are EXL2 quants for Aura-4B, Measurement file in the main branch, Check revisions for different BPW
	---
	## Aura-4B

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/jT4LeWC0ioarPieWtNZkE.png)

	## Introduction

	Aura-4B is a state of the art dedicated roleplaying model designed to fulfill your every desire.

	This finetune has seen several hundreds of millions of tokens of completion, instruction and roleplaying data. A Kahneman-Tversky Optimization was applied to give this model a unique output style.

	Developed by Aura Industries, with contributions from Anthracite Org

	## Model Details

	- Model Name: Aura-4B
	- Base Model: [IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml](https://huggingface.co/IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml)
	- Model Type: Chat Completions
	- Prompt Format: ChatML
	- License: Apache-2.0
	- Language: English
	- Max Context: 8,192+ tokens

	## License

	This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

	## Quantizations

	[Static GGUF](https://huggingface.co/mradermacher/Aura-4B-GGUF)

	[Imatrix GGUF](https://huggingface.co/mradermacher/Aura-4B-i1-GGUF)

	EXL2 coming soon...

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

	Coming soon...

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \| N/A\|
	\|IFEval (0-Shot) \| N/A\|
	\|BBH (3-Shot) \| N/A\|
	\|MATH Lvl 5 (4-Shot)\| N/A\|
	\|GPQA (0-shot) \| N/A\|
	\|MuSR (0-shot) \| N/A\|
	\|MMLU-PRO (5-shot) \| N/A\|

	## Training Configuration

	<details><summary>Click here for Axolotl configs</summary>

	Completion SFT

	```yaml
	base_model: IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	hub_model_id: jeiku/completion4B
	hub_strategy: "all_checkpoints"
	push_dataset_to_hub:
	hf_use_auth_token: true

	datasets:
	- path: Mielikki/Erebus-87k
	type: completion
	field: body

	shuffle_merged_datasets: true
	val_set_size: 0.0025
	output_dir: ./outputs/out

	adapter:
	lora_r:
	lora_alpha:
	lora_dropout:
	lora_target_linear:

	sequence_len: 8192
	sample_packing: true
	eval_sample_packing: false
	pad_to_sequence_len: true

	plugins:
	- axolotl.integrations.liger.LigerPlugin
	liger_rope: true
	liger_rms_norm: true
	liger_swiglu: true
	liger_fused_linear_cross_entropy: true

	wandb_project: EXP4B
	wandb_entity:
	wandb_watch:
	wandb_name: EXP4B
	wandb_log_model:

	gradient_accumulation_steps: 12
	micro_batch_size: 3
	num_epochs: 1
	optimizer: adamw_bnb_8bit
	lr_scheduler: cosine
	learning_rate: 0.00001
	weight_decay: 0.05

	train_on_inputs: false
	group_by_length: false
	bf16: auto
	fp16:
	tf32: true

	gradient_checkpointing: true
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_ratio: 0.1
	evals_per_epoch: 4
	eval_table_size:
	eval_max_new_tokens: 128
	saves_per_epoch: 1

	debug:
	deepspeed: deepspeed_configs/zero3_bf16.json
	fsdp:
	fsdp_config:

	special_tokens:
	pad_token: <\|finetune_right_pad_id\|>
	```

	Instruct SFT

	```yaml
	base_model: jeiku/completion4B
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	hub_model_id: jeiku/instructered4B
	hub_strategy: "all_checkpoints"
	push_dataset_to_hub:
	hf_use_auth_token: true

	datasets:
	- path: FourOhFour/Instruct_Phase
	type: sharegpt
	conversation: chatml

	chat_template: chatml

	shuffle_merged_datasets: true
	val_set_size: 0.0025
	output_dir: ./outputs/out

	adapter:
	lora_r:
	lora_alpha:
	lora_dropout:
	lora_target_linear:

	sequence_len: 8192
	sample_packing: true
	eval_sample_packing: false
	pad_to_sequence_len: true

	plugins:
	- axolotl.integrations.liger.LigerPlugin
	liger_rope: true
	liger_rms_norm: true
	liger_swiglu: true
	liger_fused_linear_cross_entropy: true

	wandb_project: EXP4B
	wandb_entity:
	wandb_watch:
	wandb_name: EXP4B
	wandb_log_model:

	gradient_accumulation_steps: 12
	micro_batch_size: 3
	num_epochs: 2
	optimizer: adamw_bnb_8bit
	lr_scheduler: cosine
	learning_rate: 0.00001
	weight_decay: 0.05

	train_on_inputs: false
	group_by_length: false
	bf16: auto
	fp16:
	tf32: true

	gradient_checkpointing: true
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_ratio: 0.1
	evals_per_epoch: 4
	eval_table_size:
	eval_max_new_tokens: 128
	saves_per_epoch: 2

	debug:
	deepspeed: deepspeed_configs/zero3_bf16.json
	fsdp:
	fsdp_config:

	special_tokens:
	pad_token: <\|finetune_right_pad_id\|>
	```

	Roleplaying SFT

	```yaml
	base_model: jeiku/instructered4B
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	hub_model_id: jeiku/TheBest4B
	hub_strategy: "all_checkpoints"
	push_dataset_to_hub:
	hf_use_auth_token: true

	datasets:
	- path: FourOhFour/RP_Phase
	type: sharegpt
	conversation: chatml

	chat_template: chatml

	shuffle_merged_datasets: true
	val_set_size: 0.0025
	output_dir: ./outputs/out

	adapter:
	lora_r:
	lora_alpha:
	lora_dropout:
	lora_target_linear:

	sequence_len: 8192
	sample_packing: true
	eval_sample_packing: false
	pad_to_sequence_len: true

	plugins:
	- axolotl.integrations.liger.LigerPlugin
	liger_rope: true
	liger_rms_norm: true
	liger_swiglu: true
	liger_fused_linear_cross_entropy: true

	wandb_project: EXP4B
	wandb_entity:
	wandb_watch:
	wandb_name: EXP4B
	wandb_log_model:

	gradient_accumulation_steps: 12
	micro_batch_size: 3
	num_epochs: 2
	optimizer: adamw_bnb_8bit
	lr_scheduler: cosine
	learning_rate: 0.00001
	weight_decay: 0.05

	train_on_inputs: false
	group_by_length: false
	bf16: auto
	fp16:
	tf32: true

	gradient_checkpointing: true
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_ratio: 0.1
	evals_per_epoch: 4
	eval_table_size:
	eval_max_new_tokens: 128
	saves_per_epoch: 2

	debug:
	deepspeed: deepspeed_configs/zero3_bf16.json
	fsdp:
	fsdp_config:

	special_tokens:
	pad_token: <\|finetune_right_pad_id\|>
	```

	KTO

	```yaml
	base_model: FourOhFour/Crispy_Crab_4B
	model_type: AutoModelForCausalLM
	tokenizer_type: AutoTokenizer

	load_in_8bit: false
	load_in_4bit: false
	strict: false

	hub_model_id: jeiku/aura4bkto
	hub_strategy: "all_checkpoints"
	push_dataset_to_hub:
	hf_use_auth_token: true

	chat_template: chatml

	rl: kto
	rl_beta: 0.2
	kto_desirable_weight: 0.2

	datasets:
	- path: anthracite-core/full-opus-chosen-hermes-rejected-kto-v1
	type: chatml.argilla

	shuffle_merged_datasets: true
	val_set_size: 0.0
	output_dir: ./outputs/out

	sequence_len: 8192
	sample_packing: false
	eval_sample_packing: false
	pad_to_sequence_len: false

	wandb_project: Aura-4B
	wandb_entity:
	wandb_watch:
	wandb_name: Aura-4B
	wandb_log_model:

	gradient_accumulation_steps: 16
	micro_batch_size: 2
	num_epochs: 2
	max_steps: 500

	optimizer: adamw_8bit
	lr_scheduler: cosine
	learning_rate: 0.00001
	weight_decay: 0.05

	train_on_inputs: false
	group_by_length: false
	bf16: auto
	fp16:
	tf32: true

	gradient_checkpointing: true
	gradient_checkpointing_kwargs:
	use_reentrant: true
	remove_unused_columns: false
	early_stopping_patience:
	resume_from_checkpoint:
	local_rank:
	logging_steps: 1
	xformers_attention:
	flash_attention: true

	warmup_steps: 10
	evals_per_epoch: 2
	eval_table_size:
	eval_max_new_tokens:
	saves_per_epoch: 1

	debug:
	deepspeed:
	fsdp:
	fsdp_config:
	fsdp:
	fsdp_config:

	special_tokens:
	pad_token: <\|finetune_right_pad_id\|>
	```
	</details><br>