---
license: apache-2.0
base_model: lukeleeai/t5-base_cola_densedense_baseline
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: t5-base_cola_dense_mare_mlp_einsum
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: glue
      type: glue
      config: cola
      split: validation
      args: cola
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7526366251198466
---

# t5-base_cola_dense_mare_mlp_einsum

This model is a fine-tuned version of [lukeleeai/t5-base_cola_densedense_baseline](https://huggingface.co/lukeleeai/t5-base_cola_densedense_baseline) on the glue dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6369
- Accuracy: 0.7526

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 20
- num_epochs: 8

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.5857        | 0.19  | 50   | 0.6136          | 0.6913   |
| 0.5918        | 0.37  | 100  | 0.6222          | 0.6913   |
| 0.5688        | 0.56  | 150  | 0.6246          | 0.6913   |
| 0.685         | 0.75  | 200  | 0.6150          | 0.6913   |
| 0.565         | 0.93  | 250  | 0.6197          | 0.6913   |
| 0.5892        | 1.12  | 300  | 0.6066          | 0.6922   |
| 0.5444        | 1.31  | 350  | 0.5988          | 0.7009   |
| 0.6097        | 1.5   | 400  | 0.5796          | 0.7076   |
| 0.5904        | 1.68  | 450  | 0.5916          | 0.6884   |
| 0.5898        | 1.87  | 500  | 0.5815          | 0.7057   |
| 0.5569        | 2.06  | 550  | 0.5771          | 0.6999   |
| 0.4553        | 2.24  | 600  | 0.6217          | 0.7210   |
| 0.4796        | 2.43  | 650  | 0.6323          | 0.7229   |
| 0.5362        | 2.62  | 700  | 0.6491          | 0.7229   |
| 0.5756        | 2.8   | 750  | 0.5745          | 0.7018   |
| 0.5731        | 2.99  | 800  | 0.6104          | 0.7315   |
| 0.4573        | 3.18  | 850  | 0.6087          | 0.7248   |
| 0.5395        | 3.36  | 900  | 0.6768          | 0.7459   |
| 0.4447        | 3.55  | 950  | 0.6372          | 0.7383   |
| 0.3891        | 3.74  | 1000 | 0.6589          | 0.7402   |
| 0.3923        | 3.93  | 1050 | 0.6273          | 0.7344   |
| 0.3855        | 4.11  | 1100 | 0.7189          | 0.7344   |
| 0.4015        | 4.3   | 1150 | 0.6456          | 0.7469   |
| 0.33          | 4.49  | 1200 | 0.7179          | 0.7450   |
| 0.354         | 4.67  | 1250 | 0.6369          | 0.7526   |

### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu117
- Datasets 2.9.0
- Tokenizers 0.11.6
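The per-device batch sizes, device count, and gradient accumulation steps listed in the hyperparameters combine multiplicatively into the reported totals. A minimal sketch of that arithmetic (the helper name is illustrative, not from the training code):

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Number of samples contributing to one optimizer step (or one eval pass)."""
    return per_device * num_devices * grad_accum

# Values from the hyperparameter list above.
total_train = effective_batch_size(per_device=8, num_devices=2, grad_accum=2)
total_eval = effective_batch_size(per_device=32, num_devices=2)

print(total_train)  # 32, matching total_train_batch_size
print(total_eval)   # 64, matching total_eval_batch_size
```

Gradient accumulation only applies on the training side, which is why the eval total is simply per-device size times device count.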