Built with Axolotl

7a2f3e99-baf9-4b5f-a182-76bc9c863d59

This model is a fine-tuned version of katuni4ka/tiny-random-falcon-40b on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 10.7143

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.000201
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (bitsandbytes 8-bit) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
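
The original Axolotl config is not included in this card. As a minimal sketch, roughly equivalent Hugging Face `TrainingArguments` (Transformers 4.46.0) might look like the following; `output_dir` is hypothetical and the exact Axolotl settings may differ:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; not the original Axolotl config.
training_args = TrainingArguments(
    output_dir="./outputs",          # hypothetical output path
    learning_rate=0.000201,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    optim="adamw_bnb_8bit",          # bitsandbytes 8-bit AdamW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```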

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| No log        | 0.0002 | 1    | 11.1202         |
| 21.7433       | 0.0102 | 50   | 10.8623         |
| 21.628        | 0.0203 | 100  | 10.8000         |
| 21.5987       | 0.0305 | 150  | 10.7729         |
| 21.5289       | 0.0407 | 200  | 10.7530         |
| 21.5209       | 0.0509 | 250  | 10.7365         |
| 21.4654       | 0.0610 | 300  | 10.7261         |
| 21.4938       | 0.0712 | 350  | 10.7193         |
| 21.4656       | 0.0814 | 400  | 10.7155         |
| 21.4737       | 0.0916 | 450  | 10.7143         |
| 21.4702       | 0.1017 | 500  | 10.7143         |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
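
This card does not include usage instructions. Since the checkpoint is a PEFT adapter over katuni4ka/tiny-random-falcon-40b, a minimal loading sketch might look like the following; the adapter repo id is taken from this card's title:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then apply this repo's adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained("katuni4ka/tiny-random-falcon-40b")
model = PeftModel.from_pretrained(base, "lesso01/7a2f3e99-baf9-4b5f-a182-76bc9c863d59")
tokenizer = AutoTokenizer.from_pretrained("katuni4ka/tiny-random-falcon-40b")
```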