bioformer-cased-v1.0 fined-tuned on the MNLI dataset for 2 epochs.
The fine-tuning process was performed on two NVIDIA GeForce GTX 1080 Ti GPUs (11GB). The parameters are:
max_seq_length=512
per_device_train_batch_size=16
total train batch size (w. parallel, distributed & accumulation) = 32
learning_rate=3e-5
Evaluation results
eval_accuracy = 0.803973