Noob question: Training on the training split of datasets used in benchmarking?

#564
by HankN - opened

Sorry for this noob question: is it OK to train on the training split of datasets used in benchmarking? For example, GSM8K (https://huggingface.co/datasets/gsm8k) has a training split of 7.47k rows out of 8.79k in total. Does this count as contamination? Thanks a lot.

Open LLM Leaderboard org

Hi!
Using the training split of our evaluation datasets is not contamination, as long as the training and test splits are not contaminated with one another (i.e., questions that are almost identical in both; the LMSYS article shows examples of this, for instance for the MATH dataset).
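If you want to sanity-check a training split against a test split yourself, here is a rough sketch of one way to flag near-identical questions. It uses plain word-trigram Jaccard overlap, which is a much simpler stand-in for the approach in the LMSYS article and not what the leaderboard itself runs. The function names, the 0.5 threshold, and the toy questions are all made up for illustration; in practice you would pass in the question strings from the real splits.

```python
import re


def ngrams(text, n=3):
    """Return the set of word n-grams of a lowercased, punctuation-stripped text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < n:
        # Very short texts: fall back to a single tuple of all words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(train_questions, test_questions, threshold=0.5, n=3):
    """Return (train_idx, test_idx, score) for pairs scoring at or above threshold.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    test_grams = [ngrams(q, n) for q in test_questions]
    hits = []
    for i, train_q in enumerate(train_questions):
        train_g = ngrams(train_q, n)
        for j, test_g in enumerate(test_grams):
            score = jaccard(train_g, test_g)
            if score >= threshold:
                hits.append((i, j, score))
    return hits


# Hypothetical toy data, not taken from GSM8K.
train = [
    "Tom has 3 apples and buys 4 more. How many apples does Tom have?",
    "A train travels 60 miles in 2 hours. What is its average speed?",
]
test = [
    "Tom has 3 apples and buys 4 more. How many apples does Tom have now?",
    "Sara reads 10 pages a day. How many pages does she read in a week?",
]

hits = find_near_duplicates(train, test)
for train_idx, test_idx, score in hits:
    print(f"train[{train_idx}] ~ test[{test_idx}] (Jaccard {score:.2f})")
```

Note that this compares every training question with every test question, so it gets slow on large splits, and n-gram overlap will miss paraphrased or translated duplicates, which is the kind of case the LMSYS article focuses on.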

clefourrier changed discussion status to closed