Comparison of Model Performance in the K-NN Algorithm for Classifying Fact and Hoax News About Covid-19

DOI:

https://doi.org/10.29408/edumatic.v5i2.3664

Keywords:

Euclidean, Jaccard, K-Nearest Neighbor, Manhattan, Minkowski

Abstract

During the Covid-19 pandemic, a wide range of hoax news about Covid-19 circulated. Fact-checking platforms such as Jala Hoax and Saber Hoax clarify these hoaxes and categorize them as misinformation or disinformation. This study applies supervised classification to learn from the fact-check labels. The dataset consists of 559 records taken from Jala Hoax and Saber Hoax, grouped into Class 1 (Misleading Content, Satire/Parody, False Connection), Class 2 (False Context, Imposter Content), and Class 3 (Fabricated and Manipulated Content). K-Nearest Neighbor (K-NN) is used to classify the misinformation and disinformation categories. The Jaccard distance is compared with the Euclidean, Manhattan, and Minkowski distances as the dissimilarity measure, and the value of k is varied to compare performance across tests. At k = 4, the Jaccard distance scores higher than the other models on accuracy (0.696), precision (0.710), recall (0.572), and F1-score. The best results tend to fall in the class with the most data, Class 1 (Misleading Content, Satire/Parody, False Connection), where 58 of 61 test items are classified correctly.
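The comparison described in the abstract can be illustrated with a minimal sketch of K-NN under the four dissimilarity measures. This is not the authors' implementation. The binary term-presence vectors, the labels, the function names, and the Minkowski order p = 3 are all hypothetical choices made for illustration, and the abstract does not state how the news texts were turned into features.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard distance between two binary vectors: 1 - |A and B| / |A or B|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if union == 0 else 1.0 - inter / union


def minkowski(a, b, p):
    """Minkowski distance of order p (p = 1 is Manhattan, p = 2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    return minkowski(a, b, 1)


def euclidean(a, b):
    return minkowski(a, b, 2)


def knn_predict(train, query, k, dist):
    """Return the majority label among the k training items closest to query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each vector marks presence (1) or absence (0) of a term.
train = [
    ([1, 1, 0, 0, 0], "Class 1"),
    ([1, 1, 1, 0, 0], "Class 1"),
    ([1, 0, 1, 0, 0], "Class 1"),
    ([0, 0, 0, 1, 1], "Class 2"),
    ([0, 0, 1, 1, 1], "Class 2"),
]
query = [1, 1, 0, 0, 1]

# Assumed Minkowski order; the abstract does not give the value used.
metrics = {
    "jaccard": jaccard,
    "euclidean": euclidean,
    "manhattan": manhattan,
    "minkowski_p3": lambda a, b: minkowski(a, b, 3),
}

# Vary k and the distance measure, as the study does on its own data.
for name, dist in metrics.items():
    for k in (1, 3):
        print(name, k, knn_predict(train, query, k, dist))
```

In a real experiment, each combination of distance measure and k would be scored on a held-out test set using accuracy, precision, recall, and F1-score rather than inspected on a single query.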

Published

2021-12-20