Wanted to know about the number of trials for hyperparameter tuning
Firstly, thanks so much for your contribution and very active support!
In the MTL classifier, what is the n_trials variable?
- Are the trials subsetting my data for Optuna to fine-tune the hyperparameters and the model's weights?
- What is the difference between an epoch and a trial?
- Why does the model overfit when given more than 1 epoch?
Thanks for your questions.
n_trials is the number of trials for hyperparameter tuning, i.e., how many different combinations of hyperparameter values Optuna tries. This is different from an epoch, which is the number of times the model trains over the dataset: 1 epoch means all training examples are shown to the model once, 2 epochs means the data is reshuffled and shown to the model a second time, and so forth.

The user provides the paths to the training, validation, and test data for the MTL classifier, so the data is not subsetted further during hyperparameter tuning - the splits are used exactly as the user provides them.
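To make the distinction concrete, here is a minimal, hypothetical sketch in plain Python (not the MTL classifier's actual Optuna code): each of the n_trials trials samples one hyperparameter combination and runs a full fine-tuning pass over the user-provided training split. The fine_tune function and its toy score are illustrative stand-ins.

```python
import random

random.seed(0)
train_data = list(range(1000))  # stand-in for the user-provided training split

def fine_tune(lr):
    """Toy stand-in for one fine-tuning run: 1 epoch = every
    training example shown to the model exactly once."""
    examples_seen = len(train_data) * 1  # 1 epoch over the full split
    score = -abs(lr - 3e-5)  # toy validation score, peaks near lr = 3e-5
    return score, examples_seen

n_trials = 10  # number of hyperparameter combinations to try
best_score, best_lr = float("-inf"), None
for _ in range(n_trials):
    lr = 10 ** random.uniform(-5.5, -3.5)  # sample one candidate learning rate
    score, seen = fine_tune(lr)
    assert seen == len(train_data)  # the full split is used, not a subset
    if score > best_score:
        best_score, best_lr = score, lr

print(f"best learning rate after {n_trials} trials: {best_lr:.2e}")
```

So a trial wraps an entire training run (with however many epochs that run uses), while an epoch is one pass over the data inside that run.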
Regarding overfitting with more than 1 epoch: this is data- and task-dependent, but large models can easily memorize training data, so if they see an example more than once, they may overfit. That is why we generally recommend 1 epoch for fine-tuning. However, the number of epochs is itself a tunable hyperparameter, and certain tasks or datasets may improve with more epochs.
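Because the epoch count is tunable, it can be searched alongside other hyperparameters. A hypothetical sketch (again plain Python, with a toy objective that lightly penalizes extra epochs to mimic the overfitting risk described above):

```python
import random

random.seed(1)

def fine_tune(lr, num_epochs):
    """Toy objective: rewards lr near 3e-5 and lightly penalizes
    additional epochs, standing in for real validation performance."""
    return -abs(lr - 3e-5) - 1e-5 * (num_epochs - 1)

best = None
for _ in range(20):  # 20 trials
    lr = 10 ** random.uniform(-5.5, -3.5)
    num_epochs = random.randint(1, 3)  # searched like any other hyperparameter
    score = fine_tune(lr, num_epochs)
    if best is None or score > best[0]:
        best = (score, lr, num_epochs)

print(f"best epochs: {best[2]}, best learning rate: {best[1]:.2e}")
```

Whether more epochs help in practice is something the validation split, not this toy score, would decide.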