Cross-dataset COVID-19 transfer learning with data augmentation

  • Original Research
International Journal of Information Technology

Abstract

This paper presents a novel cross-dataset transfer learning approach for cough-based COVID-19 detection that uses data augmentation to improve model performance. The approach improves significantly on baseline methods. An ablation study over several hyperparameters highlights the importance of the mixup alpha value. The final model achieves an unweighted accuracy of 88.19%. We also provide a comparative summary of previous studies on the same evaluation set to offer insight into cough-based detection methods.
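
The abstract names two quantities without defining them: the mixup alpha and unweighted accuracy. The sketch below is an illustration only, not the authors' implementation. It assumes the standard mixup formulation of Zhang et al. [42], in which a mixing weight is drawn from Beta(alpha, alpha), and it assumes that unweighted accuracy means the average of per-class recalls. The function names, the batch shape, and the alpha value of 0.4 are hypothetical.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha, rng):
    """Blend each example with a randomly chosen partner from the same batch."""
    # Mixing weight lam ~ Beta(alpha, alpha). A small alpha pushes lam toward
    # 0 or 1 (light mixing); alpha = 1 makes lam uniform on [0, 1].
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed, lam


def unweighted_accuracy(y_true, y_pred):
    """Average of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


rng = np.random.default_rng(0)

# Hypothetical batch: 4 spectrogram-like arrays with binary one-hot labels.
x = rng.normal(size=(4, 64, 100))
y = np.eye(2)[[0, 1, 1, 0]]
x_mixed, y_mixed, lam = mixup_batch(x, y, alpha=0.4, rng=rng)

# Toy imbalanced labels: four negatives, two positives.
ua = unweighted_accuracy([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
```

On the toy labels, plain accuracy would be 4/6, while the unweighted figure is 0.625, because the minority class carries the same weight as the majority class.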

Data availability

The datasets generated and/or analyzed during the current study were obtained as follows:

  • Coswara: available at the Coswara-Data repository, https://github.com/iiscleap/Coswara-Data. We used commit ID 401b516 during the study.
  • COUGHVID: available at the COUGHVID repository, https://zenodo.org/records/7024894. We used version 3.0 during the study.
  • ComParE-CCS: not publicly available. This dataset was provided by the organizers of the Computational Paralinguistic Challenge (ComParE) 2022 COVID-19 Cough sub-challenge for their challenge. Please contact the authors of [31] to obtain the dataset.

References

  1. Hoda MN (2022) Editorial. Int J Inf Technol (Singapore) 14(7):3287–3290. https://doi.org/10.1007/s41870-022-01134-1

  2. Yamin M (2020) Counting the cost of COVID-19. Int J Inf Technol (Singapore) 12(2):311–317. https://doi.org/10.1007/s41870-020-00466-0

  3. Milling M, Pokorny FB, Bartl-Pokorny KD, Schuller BW (2022) Is speech the new blood? Recent progress in AI-based disease detection from audio in a nutshell. Front Digit Health 4(May):1–7. https://doi.org/10.3389/fdgth.2022.886615

  4. Gupta R, Chaspari T, Kim J, Kumar N, Bone D, Narayanan S (2016) Pathological speech processing: State-of-the-art, current challenges, and future directions. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 6470–6474. IEEE, Shanghai. https://doi.org/10.1109/ICASSP.2016.7472923

  5. Pramono RXA, Imtiaz SA, Rodriguez-Villegas E (2016) A cough-based algorithm for automatic diagnosis of pertussis. PLoS ONE 11(9):1–20. https://doi.org/10.1371/journal.pone.0162128

  6. Al-khassaweneh M, Abdelrahman RB (2013) A signal processing approach for the diagnosis of asthma from cough sounds. J Med Eng Technol 37(3):165–171. https://doi.org/10.3109/03091902.2012.758322

  7. Swarnkar V, Abeyratne UR, Chang AB, Amrulloh YA, Setyati A, Triasih R (2013) Automatic identification of wet and dry cough in pediatric patients with respiratory diseases. Ann Biomed Eng 41(5):1016–1028. https://doi.org/10.1007/s10439-013-0741-6

  8. Bertini F, Allevi D, Lutero G, Calzà L, Montesi D (2021) An automatic Alzheimer’s disease classifier based on spontaneous spoken English. Comput Speech Lang 72:101298. https://doi.org/10.1016/j.csl.2021.101298

  9. Shuvo SB, Ali SN, Swapnil SI, Hasan T, Bhuiyan MIH (2021) A lightweight CNN model for detecting respiratory diseases from lung auscultation sounds using EMD-CWT-based hybrid scalogram. IEEE J Biomed Health Inf 25(7):2595–2603. https://doi.org/10.1109/JBHI.2020.3048006. arXiv:2009.04402

  10. Han J, Xia T, Spathis D, Bondareva E, Brown C, Chauhan J, Dang T, Grammenos A, Hasthanasombat A, Floto A, Cicuta P, Mascolo C (2022) Sounds of COVID-19: exploring realistic performance of audio-based digital testing. Npj Digit Med 5(1):16. https://doi.org/10.1038/s41746-021-00553-x. arXiv:2106.15523

  11. Anthes E (2020) Alexa, do I have COVID-19? Nature 586(7827):22–25. https://doi.org/10.1038/d41586-020-02732-4

  12. Adjuik TA, Ananey-Obiri D (2022) Word2vec neural model-based technique to generate protein vectors for combating COVID-19: a machine learning approach. Int J Inf Technol (Singapore) 14(7):3291–3299. https://doi.org/10.1007/s41870-022-00949-2

  13. Khanday AMUD, Rabani ST, Khan QR, Rouf N, Mohi Ud Din M (2020) Machine learning based approaches for detecting COVID-19 using clinical text data. Int J Inf Technol (Singapore) 12(3):731–739. https://doi.org/10.1007/s41870-020-00495-9

  14. Singh D, Singh BK, Behera AK (2023) A real-time correlation model between lung sounds & clinical data for asthmatic patients. Int J Inf Technol 15(1):39–44. https://doi.org/10.1007/s41870-022-01138-x

  15. Quatieri TF, Talkar T, Palmer JS (2020) A framework for biomarkers of COVID-19 based on coordination of speech-production subsystems. IEEE Open J Eng Med Biol 1:203–206. https://doi.org/10.1109/OJEMB.2020.2998051

  16. Islam R, Abdel-Raheem E, Tarique M (2022) A study of using cough sounds and deep neural networks for the early detection of Covid-19. Biomed Eng Adv 3:100025. https://doi.org/10.1016/j.bea.2022.100025

  17. Vahedian-azimi A, Keramatfar A, Asiaee M, Atashi SS, Nourbakhsh M (2021) Do you have COVID-19? An artificial intelligence-based screening tool for COVID-19 using acoustic parameters. J Acoust Soc Am 150(3):1945–1953. https://doi.org/10.1121/10.0006104

  18. Bartl-Pokorny KD, Pokorny FB, Batliner A, Amiriparian S, Semertzidou A, Eyben F, Kramer E, Schmidt F, Schönweiler R, Wehler M, Schuller BW (2021) The voice of COVID-19: acoustic correlates of infection in sustained vowels. J Acoust Soc Am 149(6):4377–4383. https://doi.org/10.1121/10.0005194

  19. Hamidi M, Zealouk O, Satori H, Laaidi N, Salek A (2023) COVID-19 assessment using HMM cough recognition system. Int J Inf Technol 15(1):193–201. https://doi.org/10.1007/s41870-022-01120-7

  20. Hasan I, Dhawan P, Rizvi SAM, Dhir S (2023) Data analytics and knowledge management approach for COVID-19 prediction and control. Int J Inf Technol (Singapore) 15(2):937–954. https://doi.org/10.1007/s41870-022-00967-0

  21. Mohammed EA, Keyhani M, Sanati-Nezhad A, Hejazi SH, Far BH (2021) An ensemble learning approach to digital corona virus preliminary screening from cough sounds. Sci Rep 11(1):1–11. https://doi.org/10.1038/s41598-021-95042-2

  22. Chowdhury NK, Kabir MA, Rahman MM, Islam SMS (2022) Machine learning for detecting COVID-19 from cough sounds: an ensemble-based MCDM method. Comput Biol Med 145(March):105405. https://doi.org/10.1016/j.compbiomed.2022.105405

  23. Casanova E, Candido Jr A, Fernandes Jr RC, Finger M, Gris LRS, Ponti MA, Pinto da Silva DP (2021) Transfer learning and data augmentation techniques to the COVID-19 identification tasks in ComParE 2021. In: Interspeech 2021, pp 446–450. ISCA. https://doi.org/10.21437/Interspeech.2021-1798. https://www.isca-speech.org/archive/interspeech_2021/casanova21_interspeech.html

  24. Sharma G, Umapathy K, Krishnan S (2022) Audio texture analysis of COVID-19 cough, breath, and speech sounds. Biomed Signal Process Control. https://doi.org/10.1016/j.bspc.2022.103703

  25. Atmaja BT, Zanjabila, Suyanto, Sasou A (2023) Comparing hysteresis comparator and RMS threshold methods for automatic single cough segmentations. Int J Inf Technol. https://doi.org/10.1007/s41870-023-01626-8

  26. Suyanto Z, Atmaja BT, Asmoro WA (2024) Performance improvement of Covid-19 cough detection based on deep learning with segmentation methods. J Appl Data Sci 5(2):520–531

  27. Wang CC, Pan CA, Hung JW (2008) Silence feature normalization for robust speech recognition in additive noise environments. In: Proceedings of the annual conference of the international speech communication association, INTERSPEECH, pp 1028–1031

  28. Atmaja BT, Akagi M (2020) The effect of silence feature in dimensional speech emotion recognition. In: 10th international conference on speech prosody 2020, pp 26–30. ISCA, Tokyo. https://doi.org/10.21437/SpeechProsody.2020-6

  29. Orlandic L, Teijeiro T, Atienza D (2021) The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms. Sci Data 8(1):156. https://doi.org/10.1038/s41597-021-00937-4

  30. Haritaoglu ED, Rasmussen N, Tan DCH, J, JR, Xiao J, Chaudhari G, Rajput A, Govindan P, Canham C, Chen W, Yamaura M, Gomezjurado L, Broukhim A, Khanzada A, Pilanci M (2022) Using deep learning with large aggregated datasets for COVID-19 classification from cough, 1–10 arXiv:2201.01669

  31. Schuller BW, Batliner A, Bergler C, Mascolo C, Han J, Lefter I, Kaya H, Amiriparian S, Baird A, Stappen L, Ottl S, Gerczuk M, Tzirakis P, Brown C, Chauhan J, Grammenos A, Hasthanasombat A, Spathis D, Xia T, Cicuta P, Rothkrantz LJM, Zwerts JA, Treep J, Kaandorp CS (2021) The INTERSPEECH 2021 computational paralinguistics challenge: COVID-19 cough, COVID-19 speech, escalation & primates. In: Interspeech 2021, pp 431–435. ISCA. https://doi.org/10.21437/Interspeech.2021-19

  32. Sharma N, Krishnan P, Kumar R, Ramoji S, Chetupalli SR, Nirmala R, Kumar Ghosh P, Ganapathy S (2020) Coswara - A database of breathing, cough, and voice sounds for COVID-19 diagnosis. In: Proceedings of the annual conference of the international speech communication association, INTERSPEECH 2020-Octob, 4811–4815 arXiv:2005.10548. https://doi.org/10.21437/Interspeech.2020-2768

  33. McFee B, Lostanlen V, McVicar M, Metsai A, Balke S, Thomé C, Raffel C, Malek A, Lee D, Zalkow F, Lee K, Nieto O, Mason J, Ellis D, Yamamoto R, Seyfarth S, Battenberg E, Morozov V, Bittner R, Choi K, Moore J, Wei Z, Hidaka S, Nullmightybofo, Friesch P, Stöter F-R, Hereñú D, Kim T, Vollrath M, Weiss A (2020) librosa/librosa: 0.7.2. https://doi.org/10.5281/ZENODO.3606573

  34. Guo J, Sainath TN, Weiss RJ (2019) A Spelling Correction Model for End-to-end Speech Recognition. In: ICASSP 2019 - 2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 5651–5655. IEEE, Brighton, UK. https://doi.org/10.1109/ICASSP.2019.8683745

  35. Choi K, Wang Y (2021) Listen, Read, and Identify: Multimodal Singing Language Identification. In: Proc of the 22nd Int Society for Music Information Retrieval Conf, pp 121–127

  36. Liu Z-T, Xiao P, Li D-Y, Hao M (2019) Speaker-Independent Speech Emotion Recognition Based on CNN-BLSTM and Multiple SVMs. In: International conference on intelligent robotics and applications, pp 481–491

  37. Choi K, Fazekas G, Sandler M, Cho K (2018) A comparison of audio signal preprocessing methods for deep neural networks on music tagging. In: 2018 26th European signal processing conference (EUSIPCO), pp 1870–1874. IEEE, Rome, Italy. https://doi.org/10.23919/EUSIPCO.2018.8553106

  38. Yang Y-Y, Hira M, Ni Z, Astafurov A, Chen C, Puhrsch C, Pollack D, Genzel D, Greenberg D, Yang EZ, Lian J, Hwang J, Chen J, Goldsborough P, Narenthiran S, Watanabe S, Chintala S, Quenneville-Belair V (2022) Torchaudio: building blocks for audio and speech processing. In: ICASSP 2022 - 2022 IEEE international conference on acoustics, speech and signal processing (ICASSP), vol 2022-May, pp 6982–6986. IEEE. https://doi.org/10.1109/ICASSP43922.2022.9747236

  39. Kong Q, Cao Y, Iqbal T, Wang Y, Wang W, Plumbley MD (2020) PANNs: large-scale pretrained audio neural networks for audio pattern recognition. IEEE/ACM Trans Audio Speech Lang Process 28(1):2880–2894. https://doi.org/10.1109/TASLP.2020.3030497. arXiv:1912.10211

  40. Loshchilov I, Hutter F (2017) Decoupled weight decay regularization. In: 7th International conference on learning representations, ICLR arXiv:1711.05101

  41. Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system arXiv:1603.02754. https://doi.org/10.1145/2939672.2939785

  42. Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) MixUp: beyond empirical risk minimization. In: 6th international conference on learning representations, ICLR 2018 - Conference Track Proceedings, pp 1–13

  43. Park DS, Chan W, Zhang Y, Chiu CC, Zoph B, Cubuk ED, Le QV (2019) Specaugment: a simple data augmentation method for automatic speech recognition. In: Proceedings of the annual conference of the international speech communication association, INTERSPEECH, vol 2019-Septe, pp 2613–2617. https://doi.org/10.21437/Interspeech.2019-2680

  44. Snyder D, Chen G, Povey D (2015) MUSAN: a music, speech, and noise corpus arXiv:1510.08484

  45. Halevy A, Norvig P, Pereira F (2009) The unreasonable effectiveness of data. IEEE Intell Syst 24(2):8–12. https://doi.org/10.1109/MIS.2009.36

  46. Goodfellow I, Bengio Y, Courville A (2015) Deep Learning Book. MIT Press, Cambridge

  47. Atmaja BT, Sasou A (2022) Effects of data augmentations on speech emotion recognition. Sensors 22(16):5941. https://doi.org/10.3390/s22165941

  48. Coppock H, Akman A, Bergler C, Gerczuk M, Brown C, Chauhan J, Grammenos A, Hasthanasombat A, Spathis D, Xia T, Cicuta P, Han J, Amiriparian S, Baird A, Stappen L, Ottl S, Tzirakis P, Batliner A, Mascolo C, Schuller BW (2023) A summary of the ComParE COVID-19 challenges. Front Digit Health 5:1–2. https://doi.org/10.3389/fdgth.2023.1058163. arXiv:2202.08981

  49. Illium S, Müller R, Sedlmeier A, Popien CL (2021) Visual transformers for primates classification and covid detection. Proc Ann Conf Int Speech Commun Assoc INTERSPEECH 6:4341–4345. https://doi.org/10.21437/Interspeech.2021-273

  50. Solera-Ureña R, Botelho C, Teixeira F, Rolland T, Abad A, Trancoso I (2021) Transfer learning-based cough representations for automatic detection of COVID-19. Proc Ann Conf Int Speech Commun Assoc INTERSPEECH 6:4336–4340. https://doi.org/10.21437/Interspeech.2021-1702

Acknowledgements

B.T.A. and A.S. are supported by the New Energy and Industrial Technology Development Organization (NEDO) Japan Project No. JPNP20006 and JSPS KAKENHI Grant Number 24K02967. Z., S., and W. A. A. are supported by project number 1014/PKS/ITS/2022, funded by the Directorate of Research and Community Service, Sepuluh Nopember Institute of Technology (ITS), Indonesia. The authors would like to thank Dr. Dhany Arifianto of VibrasticLab ITS for allowing us to use his computational resources for this study.

Author information

Corresponding author

Correspondence to Bagus Tris Atmaja.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Atmaja, B.T., Zanjabila, Suyanto et al. Cross-dataset COVID-19 transfer learning with data augmentation. Int. j. inf. tecnol. (2025). https://doi.org/10.1007/s41870-025-02433-z

  • DOI: https://doi.org/10.1007/s41870-025-02433-z