Accepted for/Published in: JMIR Infodemiology

Date Submitted: Oct 20, 2022
Date Accepted: Feb 25, 2023

The final, peer-reviewed published version of this preprint can be found here:

Sarker A, Lakamana S, Liao R, Abbas A, Yang YC, Al-Garadi M

The Early Detection of Fraudulent COVID-19 Products From Twitter Chatter: Data Set and Baseline Approach Using Anomaly Detection

JMIR Infodemiology 2023;3:e43694

DOI: 10.2196/43694

PMID: 37113382

PMCID: 10131818

Early detection of fraudulent COVID-19 products from Twitter chatter: a dataset and a baseline approach using anomaly detection

  • Abeed Sarker
  • Sahithi Lakamana
  • Ruqi Liao
  • Aamir Abbas
  • Yuan-Chi Yang
  • Mohammed Al-Garadi

ABSTRACT

Background:

Social media have served as lucrative platforms for spreading misinformation and for promoting fraudulent products for the treatment, testing, and prevention of COVID-19. This has resulted in the issuance of many warning letters by the United States Food and Drug Administration (FDA). While social media continue to serve as the primary platform for the promotion of such fraudulent products, they also present the opportunity to identify these products early by employing effective social media mining methods.

Objective:

Our objectives were to (i) create a dataset of fraudulent COVID-19 products that can be used for future research, and (ii) propose a method that uses Twitter data to automatically detect heavily promoted COVID-19 products early.

Methods:

We created a dataset from FDA-issued warnings during the early months of COVID-19. We employed natural language processing and time series anomaly detection methods for automatically detecting fraudulent COVID-19 products early from Twitter. Our approach is based on the intuition that increases in the popularity of fraudulent products lead to corresponding anomalous increases in the volume of chatter regarding them. We compared the anomaly signal generation date for each product with the corresponding FDA letter issuance date.
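
The intuition above can be illustrated with a minimal sketch. The abstract does not name the specific anomaly detection algorithm, so the trailing-window z-score rule below is an assumption chosen for simplicity, not the authors' method. The function name first_anomaly, the window and threshold parameters, and the daily counts are all hypothetical.

```python
from datetime import date, timedelta
from statistics import mean, stdev

def first_anomaly(daily_counts, window=7, threshold=3.0):
    """Return the first date whose count exceeds the trailing-window mean
    by more than `threshold` standard deviations, or None if no date does.

    Assumption: a simple z-score rule stands in for the unspecified
    anomaly detector described in the abstract.
    """
    days = sorted(daily_counts)
    for i in range(window, len(days)):
        history = [daily_counts[d] for d in days[i - window:i]]
        mu = mean(history)
        # Floor sigma at 1.0 so a flat history cannot cause division by zero.
        sigma = max(stdev(history), 1.0)
        if (daily_counts[days[i]] - mu) / sigma > threshold:
            return days[i]
    return None

# Hypothetical daily post counts mentioning one key phrase.
start = date(2020, 3, 1)
counts = [10, 12, 9, 11, 10, 13, 11, 12, 10, 60, 75, 40]
series = {start + timedelta(days=i): c for i, c in enumerate(counts)}

signal = first_anomaly(series)
print(signal)
```

With these made-up counts, the jump from roughly 10 posts per day to 60 is the first day flagged. A trailing window is used so that each day is judged only against data that would have been available at the time.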

Results:

FDA warning issue dates ranged from March 6, 2020, to June 22, 2021, and 44 key phrases representing fraudulent products were included. From 577,872,350 publicly available posts made between February 19, 2020, and December 31, 2020, our unsupervised approach detected 34/44 (77.3%) signals about fraudulent products earlier than the FDA letter issuance dates, and an additional 6/44 (13.6%) within a week following the corresponding FDA letters.
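
The comparison behind these proportions can be sketched as follows. This is an illustrative bucketing only: the function name classify_timing, the bucket labels, and the example dates are hypothetical and are not taken from the data set. Treating a same-day signal as falling within the week after the letter is also an assumption, since the abstract does not say how ties were handled.

```python
from collections import Counter
from datetime import date

def classify_timing(signal_date, letter_date, grace_days=7):
    """Bucket an anomaly signal relative to the FDA letter issuance date."""
    if signal_date is None:
        return "no signal"
    lag = (signal_date - letter_date).days
    if lag < 0:
        return "before letter"
    if lag <= grace_days:
        # Assumption: a same-day signal (lag == 0) is counted here.
        return "within a week after"
    return "later"

# Hypothetical (signal date, FDA letter date) pairs for four key phrases.
examples = {
    "phrase A": (date(2020, 3, 10), date(2020, 4, 1)),
    "phrase B": (date(2020, 5, 4), date(2020, 5, 1)),
    "phrase C": (date(2020, 8, 20), date(2020, 6, 15)),
    "phrase D": (None, date(2020, 7, 7)),
}

labels = {k: classify_timing(s, l) for k, (s, l) in examples.items()}
summary = Counter(labels.values())
print(summary)
```

Dividing each bucket count by the number of key phrases gives proportions of the kind reported above.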

Conclusions:

Our proposed method is simple, effective, and easy to deploy, and does not require high-performance computing machinery, unlike deep neural network-based methods. The method can be easily extended to other types of signal detection from social media data. The dataset may be used for future research and the development of more advanced methods.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
