Experiment 1: SFT on Alpaca Indo
Dataset: Indonesian Alpaca dataset, 9 million tokens
max_seq_length = 8192,
dataset_num_proc = 2,
packing = False,
args = TrainingArguments(
    per_device_train_batch_size = 1,
    gradient_accumulation_steps = 8,
    warmup_steps = 5,
    num_train_epochs = 1,
    learning_rate = 5e-5,
    fp16 = not is_bfloat16_supported(),
    bf16 = is_bfloat16_supported(),
    logging_steps = 1,
    optim = "adamw_8bit",
    weight_decay = 0.01,
    lr_scheduler_type = "linear",
    seed = 3407,
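As a rough illustration of what these settings imply, the sketch below computes the effective batch size per optimizer step and the learning rate produced by a linear schedule with warmup. It is a plain-Python approximation, not the trainer's own code. The helper names `effective_batch_size` and `linear_lr`, the single-device default, and `total_steps=1000` are all assumptions made for illustration; the real number of optimizer steps depends on the number of training examples, which is not stated here.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Sequences contributing to one optimizer step.

    num_devices=1 is an assumption; the source does not say how many GPUs were used.
    """
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=5e-5, warmup_steps=5, total_steps=1000):
    """Learning rate at a given optimizer step for linear warmup, then linear decay.

    total_steps=1000 is a hypothetical value chosen only for this example.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # per_device_train_batch_size = 1, gradient_accumulation_steps = 8
    print("effective batch size:", effective_batch_size(1, 8))
    for s in (0, 1, 5, 500, 1000):
        print(f"step {s}: lr = {linear_lr(s):.2e}")
```

With a per-device batch of 1 and 8 accumulation steps, each optimizer update sees 8 sequences on a single device, and the learning rate peaks at 5e-5 once the 5 warmup steps are done.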