
Boosting Classification Reliability of NLP Transformer Models in the Long Run—Challenges of Time in Opinion Prediction Regarding COVID-19 Vaccine

Original Research · Published in SN Computer Science

Abstract

Transformer-based machine learning models have become an essential tool for many natural language processing (NLP) tasks since the method was introduced. A common objective of these projects is to classify text data. Classification models are often extended to a different topic and/or time period. In these situations, it is difficult to decide how long a classification model remains suitable and when it is worth re-training. This paper compares different approaches to fine-tuning a BERT model for a long-running classification task. We use data from different periods to fine-tune our original BERT model, and we also measure how a second round of annotation can boost classification quality. Our corpus contains over 8 million comments on COVID-19 vaccination in Hungary posted between September 2020 and December 2021. Our results show that the best solution is to use all available unlabeled comments to fine-tune the model. It is not advisable to focus only on comments containing words the model has not encountered before; a more efficient solution is to randomly sample comments from the new period. Fine-tuning does not prevent the model from losing performance; it merely slows the decline. In a rapidly changing linguistic environment, model performance cannot be maintained without regularly annotating new texts.
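The adaptation strategy the abstract describes, continuing the masked-language-model pre-training of a BERT encoder on randomly sampled unlabeled comments from the new period, can be sketched with the Hugging Face transformers library. This is a minimal illustration, not the authors' exact pipeline: the checkpoint name (huBERT, a common choice for Hungarian text), the load_comments helper, the sample size, and all hyperparameters are assumptions.

import random

from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# huBERT, a publicly available Hungarian BERT checkpoint, stands in for the
# paper's base model (an assumption; the exact checkpoint may differ).
MODEL_NAME = "SZTAKI-HLT/hubert-base-cc"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Randomly sample unlabeled comments from the new period (the abstract found
# this more efficient than targeting comments containing unseen words).
new_period_comments = load_comments("comments_new_period.txt")  # hypothetical loader
sampled = random.sample(new_period_comments, k=50_000)  # illustrative sample size

dataset = Dataset.from_dict({"text": sampled}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Standard BERT-style masking: 15% of tokens are masked for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hubert-adapted", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()

# The adapted encoder is then re-used as the base of the downstream
# classifier, ideally together with newly annotated examples, since
# fine-tuning alone only slows the performance decline.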


Data Availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

NLP: Natural Language Processing
OOD: Out-of-distribution
NER: Named Entity Recognition
SVM: Support Vector Machine
WHO: World Health Organisation


Acknowledgements

Not applicable.

Funding

The research was supported by the European Union within the framework of the RRF-2.3.1-21-022-00004 Artificial Intelligence National Laboratory Program. The work of Zoltán Kmetty was supported by the Bolyai Scholarship, grant number: BO/834/22.

Author information


Contributions

Conceptualization, Z.K. and B.K.; methodology, Z.K., K.B. and B.K.; formal analysis, Z.K., K.B. and B.K.; data curation, K.B.; writing—original draft preparation, Z.K. and B.K.; writing—review and editing, Z.K. and B.K.; visualization, Z.K.; funding acquisition, Z.K.

Corresponding author

Correspondence to Zoltán Kmetty.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Ethics Approval

As our research did not involve human subjects, no ethical approval was needed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 17 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kmetty, Z., Kollányi, B. & Boros, K. Boosting Classification Reliability of NLP Transformer Models in the Long Run—Challenges of Time in Opinion Prediction Regarding COVID-19 Vaccine. SN COMPUT. SCI. 6, 13 (2025). https://doi.org/10.1007/s42979-024-03553-2

