Lack of generalisability and bias toward predicting FAKE
Have you tried this model with real examples such as "There is war between Ukraine and Russia", "Barack Obama was USA president", "Israel is at war in Gaza", "COVID is a virus", etc.? When I tested it with real-world news statements it made wrong predictions; please correct me if I am wrong, but it classified almost all of the given sentences as FAKE (a sketch of how I tested is at the end of this post). I found the same issue in Kaggle notebooks that claim more than 97% accuracy using RoBERTa, BERT, and other models fine-tuned on the LIAR, fake-real, and similar datasets. Here are some Kaggle examples:

- Fake vs Real News Detection | BERT 🤗 | Acc: 100%
- Fake News Detector: EDA & Prediction (99+%)
- News classification 97%-f1 mlflow pytorch dagshub
- Fake-News Cleaning+Word2Vec+LSTM (99% Accuracy)
- Fake News Classification (Easiest 99% accuracy)
- True and Fake News || LSTM accuracy: 97.90%
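
For reference, this is roughly the kind of test I ran, as a minimal sketch using the `transformers` text-classification pipeline; the model identifier below is a placeholder, not the actual checkpoint from this repo:

```python
# Minimal sketch of the generalisation test described above.
# "your-username/fake-news-model" is a placeholder model id, not the actual checkpoint.
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/fake-news-model")

# True, real-world statements that a well-generalising model should not flag as FAKE.
real_statements = [
    "There is war between Ukraine and Russia",
    "Barack Obama was USA president",
    "Israel is at war in Gaza",
    "COVID is a virus",
]

for statement in real_statements:
    result = classifier(statement)[0]
    # In my runs, almost every one of these statements came back labelled FAKE.
    print(f"{statement!r} -> {result['label']} ({result['score']:.3f})")
```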