Mr-Wick/Albert

This model is a fine-tuned version of Mr-Wick/Albert on an unknown dataset. Its training and validation results for each epoch are listed in the results table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 16494, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
training_precision: float32
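
The optimizer entry above describes a polynomial learning-rate decay with power 1.0, which is a straight linear ramp from 2e-05 down to 0.0 over 16494 steps. The sketch below reproduces that schedule in plain Python so the values can be checked without TensorFlow. The function name `polynomial_decay_lr` is hypothetical, and the formula is the standard non-cycling polynomial decay rule, assumed here to match what was used in training.

```python
def polynomial_decay_lr(step, initial_lr=2e-05, end_lr=0.0,
                        decay_steps=16494, power=1.0):
    """Learning rate at a given optimizer step, for the cycle=False case."""
    # With cycle=False the step is clamped, so the rate stays at end_lr
    # once decay_steps has been reached.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * remaining ** power + end_lr


# Print the rate at the start, midpoint, end, and past the end of the schedule.
for step in (0, 8247, 16494, 20000):
    print(step, polynomial_decay_lr(step))
```

Because power is 1.0, the rate at the midpoint (step 8247) is half the initial value, and any step beyond 16494 returns the end rate.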

Train Loss	Train End Logits Accuracy	Train Loss Accuracy	Train Start Logits Accuracy	Validation Loss	Validation End Logits Accuracy	Validation Loss Accuracy	Validation Start Logits Accuracy	Epoch
0.6581	0.3488	0.0671	0.3529	0.9366	0.4415	0.0657	0.4486	0
0.4248	0.3423	0.0664	0.3437	0.9468	0.4724	0.0591	0.4772	1
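
This card does not define the "Start Logits Accuracy" and "End Logits Accuracy" columns. One plausible reading, stated here as an assumption, is that each is a per-output accuracy: the fraction of examples where the highest-scoring token position matches the labeled start (or end) position. The sketch below illustrates that calculation on made-up numbers; the helper names `argmax` and `position_accuracy` are hypothetical and not taken from the training code.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def position_accuracy(logits_batch, true_positions):
    """Fraction of examples whose top-scoring index equals the label."""
    hits = sum(
        argmax(logits) == true
        for logits, true in zip(logits_batch, true_positions)
    )
    return hits / len(true_positions)


# Toy batch: 3 examples with 4 token positions each (illustrative values only).
start_logits = [
    [0.1, 2.3, -1.0, 0.4],   # predicts position 1
    [1.5, 0.2, 0.1, -0.3],   # predicts position 0
    [0.0, 0.1, 0.2, 3.0],    # predicts position 3
]
start_labels = [1, 2, 3]

start_accuracy = position_accuracy(start_logits, start_labels)
print(start_accuracy)
```

Under this reading, a start accuracy and an end accuracy are computed independently, so an answer span counts as fully correct only when both positions match.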