Comparing numbers

Summary

Paper	PolitiFact Acc	GossipCop Acc
Paper 1 (TBA)	0.8632**	0.8388**
Paper 2 (2019)	0.691	0.822
Paper 3 (2020)	0.8	0.82
Paper 4 (2023)	0.584	-
Paper 5 (2020)	0.846	0.86
Paper 6 (2021)	0.9156*	0.9156*

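* Accuracy on the combined PolitiFact + GossipCop dataset.
** The Paper 1 table below has not been updated with these results. Configurations: FakeNewsNet-politifact_max-vocab=15000_pre=v2_text_domain_T=200_s=15 and FakeNewsNet-gossipcop_max-vocab=15000_pre=v2_domain_T=250_s=5.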

Paper 1: This paper

Model	PolitiFact				GossipCop				PolitiFact+GossipCop
Model	Acc	Prec	Rec	F1	Acc	Prec	Rec	F1	Acc	Prec	Rec	F1
TM+Text	0.750	0.742	0.761	0.743	0.778	0.695	0.743	0.693	0.742	0.686	0.740	0.694
TM+Domain	0.768	0.849	0.779	0.762	0.732	0.666	0.708	0.675	0.734	0.672	0.714	0.681
TM+Text+Domain	0.764	0.754	0.770	0.756	0.809	0.739	0.775	0.744	0.782	0.721	0.766	0.734
TM+Tweet	0.632	0.688	0.514	0.387	0.241	0.121	0.500	0.194	-	-	-	-
TM+Tweet+Text	0.774	0.764	0.781	0.766	0.799	0.724	0.737	0.716	-	-	-	-
TM+Tweet+Domain	0.750	0.840	0.764	0.745	0.749	0.672	0.702	0.681	-	-	-	-
TM+Tweet+Text+Domain	0.788	0.776	0.792	0.780	0.788	0.724	0.771	0.738	-	-	-	-
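
The TM+Tweet row for GossipCop (Acc 0.241, Prec 0.121, Rec 0.500, F1 0.194) looks like the score of a classifier that always predicts a single class. The sketch below works through that arithmetic. It assumes the table reports macro-averaged metrics and that precision counts as 0 for a class that is never predicted; neither is stated in these notes, and `constant_predictor_macro` is a hypothetical helper.

```python
def constant_predictor_macro(p):
    """Macro-averaged metrics for a binary classifier that always predicts
    the class making up fraction p of the test set.

    Assumption: precision for the never-predicted class is counted as 0.
    """
    accuracy = p
    # Predicted class: precision p, recall 1. Other class: precision 0, recall 0.
    precision = (p + 0.0) / 2
    recall = (1.0 + 0.0) / 2
    f1_predicted = 2 * p * 1.0 / (p + 1.0)
    f1 = (f1_predicted + 0.0) / 2
    return accuracy, precision, recall, f1


acc, prec, rec, f1 = constant_predictor_macro(0.241)
print(f"acc={acc:.3f} prec={prec:.4f} rec={rec:.3f} f1={f1:.4f}")
```

With p = 0.241 the recall is exactly 0.5 and the F1 is about 0.194, which matches the row to within rounding. If the assumptions hold, the tweet-only model carries essentially no signal on GossipCop.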

Method

80/20 split
Tokenize
Convert text to lowercase
Convert text to a binary bag-of-words vector
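
The steps above can be sketched with the standard library alone. This is a minimal illustration, not the code behind the reported numbers: the regex tokenizer, the fixed shuffle seed, and the helper names are assumptions, and the vocabulary cap mentioned in the summary footnote is left out.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def binary_bow(text, vocabulary):
    """Return a 0/1 vector: 1 where the token occurs at least once."""
    vector = [0] * len(vocabulary)
    for tok in tokenize(text):
        if tok in vocabulary:
            vector[vocabulary[tok]] = 1
    return vector


def split_80_20(items, seed=0):
    """Shuffle with a fixed seed and split into 80% train, 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


docs = ["Breaking news: the senator said it", "The actor said nothing new"]
vocab = build_vocabulary(docs)
vectors = [binary_bow(d, vocab) for d in docs]
train, test = split_80_20(range(10))
```

A binary vector records only whether a token is present, so repeating a word in a document does not change its representation.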

Size

PolitiFact has a total of 1056
GossipCop has a total of 22140

Paper 2: FakeNewsNet: A Data Repository with News Content, Social Context and Dynamic Information for Studying Fake News on Social Media (2019)

PDF

Model	PolitiFact				GossipCop
Model	Acc	Prec	Rec	F1	Acc	Prec	Rec	F1
SVM	0.580	0.611	0.717	0.659	0.470	0.462	0.451	0.456
Logistic regression	0.642	0.757	0.543	0.633	0.822	0.897	0.722	0.799
Naive Bayes	0.617	0.674	0.630	0.651	0.704	0.735	0.765	0.798
CNN	0.629	0.807	0.456	0.583	0.703	0.789	0.623	0.699
Social Article Fusion /S	0.654	0.600	0.789	0.681	0.741	0.709	0.761	0.734
Social Article Fusion /A	0.667	0.667	0.579	0.619	0.796	0.782	0.743	0.762
Social Article Fusion	0.691	0.638	0.789	0.706	0.796	0.820	0.753	0.785
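
Because F1 is the harmonic mean of precision and recall, the rows above can be checked for internal consistency. The sketch below does this for a few of the GossipCop columns as transcribed in these notes. The tolerance is an arbitrary allowance for rounding, and the check assumes each row reports single-class (not averaged) metrics.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (model, precision, recall, reported F1), GossipCop columns of the table above.
rows = [
    ("SVM", 0.462, 0.451, 0.456),
    ("Logistic regression", 0.897, 0.722, 0.799),
    ("Naive Bayes", 0.735, 0.765, 0.798),
    ("CNN", 0.789, 0.623, 0.699),
    ("Social Article Fusion", 0.820, 0.753, 0.785),
]

TOLERANCE = 0.005  # arbitrary allowance for rounding in the reported values

mismatches = [
    (model, round(f1_from(p, r), 3), reported)
    for model, p, r, reported in rows
    if abs(f1_from(p, r) - reported) > TOLERANCE
]
print(mismatches)
```

Under these assumptions only the Naive Bayes row is flagged: the computed F1 is about 0.750 against the listed 0.798. These notes do not settle whether that difference comes from the source paper or from transcription.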

Method

80/20 split
Unknown or "default" settings

Size

PolitiFact has a total of 1056 (624 real, 432 fake)
GossipCop has a total of 22865 (16817 real, 6048 fake)
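
The class counts above give a useful reference point for the accuracy columns: the accuracy of always predicting the larger class. The sketch below computes it from the full-dataset counts. The test split's balance may differ slightly, and `majority_baseline` is a hypothetical helper name.

```python
def majority_baseline(real, fake):
    """Accuracy of always predicting the larger class."""
    return max(real, fake) / (real + fake)


# (real, fake) counts as listed above.
sizes = {"PolitiFact": (624, 432), "GossipCop": (16817, 6048)}

for name, (real, fake) in sizes.items():
    print(f"{name}: {majority_baseline(real, fake):.3f}")
```

This gives roughly 0.591 for PolitiFact and 0.735 for GossipCop, so an accuracy near those values does not by itself show that a model has learned anything.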

Paper 3: Exploring N-gram, Word Embedding and Topic Models for Content-based Fake News Detection in FakeNewsNet Evaluation (2020)

PDF

Model	PolitiFact				GossipCop
Model	Acc	Prec	Rec	F1	Acc	Prec	Rec	F1
LogReg	0.64	0.76	0.54	0.63	0.82	0.90	0.72	0.80
Social Article Fusion	0.69	0.64	0.79	0.71	0.80	0.82	0.75	0.79
N-gram	0.80	0.79	0.78	0.78	0.82	0.75	0.79	0.77
Topic	0.60	0.55	0.53	0.51	0.51	0.51	0.51	0.47
Word2Vec	0.73	0.73	0.74	0.73	0.78	0.71	0.76	0.72
N-gram + Topic	0.77	0.76	0.76	0.76	0.82	0.75	0.78	0.76
N-gram + Word2Vec	0.72	0.72	0.73	0.72	0.78	0.71	0.76	0.72
Topic + Word2Vec	0.42	0.49	0.49	0.39	0.63	0.60	0.64	0.60
N-gram + Topic + Word2Vec	0.40	0.45	0.48	0.36	0.58	0.57	0.60	0.54

Method

80/20 split
Tokenize
Stemming
Remove duplicates
Remove punctuation
Remove special characters and symbols
Remove hash from hashtags
Remove stop words
Convert text to lowercase
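
The cleaning steps above could look roughly like the following. This is a sketch under several assumptions: the stop-word list is a tiny illustrative one, `crude_stem` is a stand-in for a real stemmer, "remove duplicates" is read as dropping exact duplicate documents, and the order of the steps is a guess, since the notes do not specify it.

```python
import re

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to", "in", "and"}


def crude_stem(token):
    """Stand-in for a real stemmer: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, strip '#', punctuation and symbols, drop stop words, stem."""
    text = text.lower().replace("#", "")
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # punctuation, special characters
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


def deduplicate(documents):
    """Remove exact duplicate documents, keeping first occurrences in order."""
    return list(dict.fromkeys(documents))


raw = ["The senator is #lying!", "The senator is #lying!", "Actors walked out..."]
cleaned = [preprocess(doc) for doc in deduplicate(raw)]
print(cleaned)
```

In practice a library stemmer and a full stop-word list would replace the two stand-ins; they are kept minimal here so the example runs without dependencies.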

Size

After preprocessing:

PolitiFact has a total of 968 (426 real, 542 fake)
GossipCop has a total of 20796 (4804 real, 15965 fake)
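
A quick way to sanity-check the sizes quoted in these notes is to confirm that each real + fake pair adds up to its listed total. The sketch below does this for the Paper 2 and Paper 3 figures exactly as written here.

```python
# (dataset, listed total, real, fake) as given in the Size lists in these notes.
listed = [
    ("Paper 2 PolitiFact", 1056, 624, 432),
    ("Paper 2 GossipCop", 22865, 16817, 6048),
    ("Paper 3 PolitiFact", 968, 426, 542),
    ("Paper 3 GossipCop", 20796, 4804, 15965),
]

inconsistent = [
    (name, total, real + fake)
    for name, total, real, fake in listed
    if real + fake != total
]
print(inconsistent)
```

The Paper 3 GossipCop counts sum to 20769 rather than the listed 20796, so one of the three numbers is probably mistyped somewhere along the way. Which one is not recoverable from these notes.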

Paper 4: Machine Learning vs Deep Learning Models for Detecting Fake News: A Comparative Analysis on Fake-NewsNet Dataset (2023)

PDF

Note: this is an aggregate of the presented data.

Model	PolitiFact
Model	Acc	Prec	Rec	F1
NB	0.584	0.585	0.565	0.545
SVM	0.574	0.645	0.575	0.515
LSTM	0.560	0.570	0.580	0.555

Method

80/10/10 split
Tokenize
Stemming
"Clean" punctuation
Remove some punctuation
Remove stop words
Remove numbers
Drop invalid data
Convert text to lowercase
Convert text to TF-IDF
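
The final TF-IDF step can be illustrated as follows. The exact weighting used in the paper is not given in these notes, so the sketch assumes a common textbook variant: term frequency as count divided by document length, and inverse document frequency as log(N / df), without smoothing.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents each term appears in.
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    weights = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [["fake", "news", "spreads"], ["real", "news", "reports"]]
weights = tf_idf(docs)
print(weights[0])
```

A term that appears in every document, such as "news" here, gets weight 0 under this variant, while terms unique to one document are weighted up.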

Size

PolitiFact has a total of 1056 (624 real, 432 fake)

Paper 5: SpotFake+: A Multimodal Framework for Fake News Detection via Transfer Learning (Student Abstract) (2020)

PDF GitHub

Text

Model	PolitiFact	GossipCop
Model	Acc	Acc
SVM	0.58	0.497
Logistic Regression	0.642	0.648
Naive Bayes	0.617	0.624
CNN	0.629	0.723
SAF (Social Article Fusion)	0.691	0.689
XLNet + dense layer	0.74	0.836
XLNet + CNN	0.721	0.84
XLNet + LSTM	0.721	0.807

Text + Image

Model	PolitiFact	GossipCop
Model	Acc	Acc
EANN	0.74	0.86
MVAE	0.673	0.775
SpotFake	0.721	0.807
SpotFake+	0.846	0.856

Method

Remove logos
Drop samples without images

Size

Before preprocessing:

PolitiFact has a total of 1056 (624 real, 432 fake)
GossipCop has a total of 22140 (16817 real, 5323 fake)

After preprocessing:

PolitiFact has a total of 485 (321 real, 164 fake)
GossipCop has a total of 12840 (10259 real, 2581 fake)

Paper 6: A Heuristic-driven Uncertainty based Ensemble Framework for Fake News Detection in Tweets and News Articles (2021)

PDF

Model	PolitiFact + GossipCop
Model	Acc	Prec	Rec	F1
FakeFlow	0.82	0.82	0.82	0.82
One-Hot LR	0.7670	0.7670	0.7670	0.7670
FakeNewsTracker	0.7186	0.7186	0.7186	0.7186
Ensemble Model + Heuristic Post-Processing	0.9007	0.9007	0.9007	0.9007
SFFN (with MCDropout) + Heuristic Post-Processing	0.9156	0.9156	0.9156	0.9156

Method

80/10/10 split
For tweets, the tweet-preprocessor Python package was used to filter out usernames, URLs, emojis, etc.
For articles, usernames and URLs from Instagram, Facebook, Twitter, etc. were filtered out.
Different tokenizers (from Hugging Face)
Vocabulary trained on a large corpus like GLUE, wikitext-103, CommonCrawl, etc.
Transfer learning
News body was crawled
Ensemble of models used for balancing
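
The username and URL filtering mentioned above can be approximated with two regular expressions. This is a stand-in written for illustration, not the tweet-preprocessor package's API; the patterns are assumptions and emoji removal is not covered.

```python
import re

# Assumed patterns: http(s) and www links, and @-style usernames.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")


def strip_urls_and_usernames(text):
    """Remove URLs and @usernames, then collapse leftover whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.split())


print(strip_urls_and_usernames("@user1 look at this https://t.co/abc123 wow"))
# prints "look at this wow"
```

URLs are removed before usernames so that an "@" inside a link is not treated as a mention.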

Size

PolitiFact has a total of 1056 (624 real, 432 fake)
GossipCop has a total of 22140 (16817 real, 5323 fake)

Data was augmented by crawling E! Online, PolitiFact, and GossipCop.

After crawling:

PolitiFact has a total of 1011 (610 real, 401 fake)
GossipCop has a total of 20474 (15151 real, 5323 fake)
