Fine-tuning dataset
What is the fine-tuning dataset used for this model? Thanks.
I would caution against using models that are trained on these datasets as they are inherently biased. While they perform well under holdout or cross-validation conditions, they struggle to generalise effectively outside of their training datasets (even when testing within the same domain).
Hoy, N. and Koulouri, T., 2022, December. Exploring the generalisability of fake news detection models. In 2022 IEEE International Conference on Big Data (Big Data) (pp. 5731-5740). IEEE.
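To make the holdout versus cross-dataset gap concrete, here is a minimal toy sketch. It is not the method from the paper or from this model: the word-count classifier, the three tiny datasets, and every name in it are hypothetical and made up for illustration. It only shows how a model that latches onto source-specific wording can look perfect on a holdout split from its own dataset and still fail on text from another source.

```python
from collections import Counter

def train(examples):
    """Count word occurrences per label; examples are (text, label) pairs."""
    counts = {"FAKE": Counter(), "REAL": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Pick the label whose training word counts overlap the text most."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    # On a tie, max() returns the first label in dict order.
    return max(scores, key=scores.get)

def accuracy(counts, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(counts, text) == label for text, label in examples)
    return correct / len(examples)

# Hypothetical "dataset A": training split and holdout split share a style.
train_a = [
    ("shocking miracle cure doctors hate", "FAKE"),
    ("shocking secret the government hides", "FAKE"),
    ("officials said the budget passed on tuesday", "REAL"),
    ("the minister said talks resume on monday", "REAL"),
]
holdout_a = [
    ("shocking cure the doctors hide", "FAKE"),
    ("officials said the vote passed on monday", "REAL"),
]

# Hypothetical "dataset B": same domain, different source and wording.
dataset_b = [
    ("the government hides nothing in new audit report", "REAL"),
    ("minister said aliens resume talks on monday", "FAKE"),
    ("shocking miracle pill melts fat", "FAKE"),
]

model = train(train_a)
in_domain = accuracy(model, holdout_a)
cross_domain = accuracy(model, dataset_b)
print(f"holdout accuracy on A: {in_domain:.2f}")
print(f"accuracy on B:         {cross_domain:.2f}")
```

The same check applies to a real fine-tuned model: alongside the holdout score, report accuracy on at least one dataset the model never saw during training.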
Have you tried this model with real examples like "There is war between Ukraine and Russia"? When I tested it with real-world news statements, it made wrong predictions. Please correct me if I am wrong. It classified almost all of the given sentences as FAKE.
Yeah, I agree with this 100%, and I am also searching for a solution. I found the same issue in Kaggle notebooks that claimed more than 97% accuracy; they used RoBERTa, BERT and other models fine-tuned on LIAR, fake-real, etc. datasets.