
Boosting Classification Reliability of NLP Transformer Models in the Long Run—Challenges of Time in Opinion Prediction Regarding COVID-19 Vaccine

Original Research · Published in SN Computer Science

Abstract

Transformer-based machine learning models have become an essential tool for many natural language processing (NLP) tasks since the method was introduced. A common objective of these projects is to classify text data. Classification models are often extended to a different topic and/or time period. In these situations, it is difficult to decide how long a classification model remains suitable and when it is worth re-training. This paper compares different approaches to fine-tuning a BERT model for a long-running classification task. We use data from different periods to fine-tune our original BERT model, and we also measure how a second round of annotation can boost classification quality. Our corpus contains over 8 million comments on COVID-19 vaccination in Hungary posted between September 2020 and December 2021. Our results show that the best solution is to use all available unlabeled comments to fine-tune the model. It is not advisable to focus only on comments containing words the model has not encountered before; a more efficient solution is to randomly sample comments from the new period. Fine-tuning does not prevent the model from losing performance; it merely slows the decline. In a rapidly changing linguistic environment, model performance cannot be maintained without regularly annotating new texts.
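The adaptation strategy the abstract describes, continuing the masked-language-model pre-training of a BERT encoder on randomly sampled unlabeled comments from the new period, can be sketched with the Hugging Face transformers library. This is a minimal illustration, not the authors' exact pipeline: the checkpoint name (huBERT, a common choice for Hungarian text), the load_comments helper, the sample size, and all hyperparameters are assumptions.

import random

from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# huBERT, a publicly available Hungarian BERT checkpoint, stands in for the
# paper's base model (an assumption; the exact checkpoint may differ).
MODEL_NAME = "SZTAKI-HLT/hubert-base-cc"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Randomly sample unlabeled comments from the new period (the abstract found
# this more efficient than targeting comments containing unseen words).
new_period_comments = load_comments("comments_new_period.txt")  # hypothetical loader
sampled = random.sample(new_period_comments, k=50_000)  # illustrative sample size

dataset = Dataset.from_dict({"text": sampled}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Standard BERT-style masking: 15% of tokens are masked for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hubert-adapted", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()

# The adapted encoder is then re-used as the base of the downstream
# classifier, ideally together with newly annotated examples, since
# fine-tuning alone only slows the performance decline.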


Data Availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

NLP: Natural Language Processing
OOD: Out-of-distribution
NER: Named Entity Recognition
SVM: Support Vector Machine
WHO: World Health Organisation


Acknowledgements

Not applicable.

Funding

The research was supported by the European Union within the framework of the RRF-2.3.1-21-022-00004 Artificial Intelligence National Laboratory Program. The work of Zoltán Kmetty was supported by the Bolyai Scholarship, grant number: BO/834/22.

Author information


Contributions

Conceptualization, Z.K. and B.K.; methodology, Z.K., K.B. and B.K.; formal analysis, Z.K., K.B. and B.K.; data curation, K.B.; writing—original draft preparation, Z.K. and B.K.; writing—review and editing, Z.K. and B.K.; visualization, Z.K.; funding acquisition, Z.K.

Corresponding author

Correspondence to Zoltán Kmetty.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Ethics Approval

As our research did not involve human subjects, no ethical approval was needed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 17 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kmetty, Z., Kollányi, B. & Boros, K. Boosting Classification Reliability of NLP Transformer Models in the Long Run—Challenges of Time in Opinion Prediction Regarding COVID-19 Vaccine. SN COMPUT. SCI. 6, 13 (2025). https://doi.org/10.1007/s42979-024-03553-2

