Article

Using Social Media in Tourist Sentiment Analysis: A Case Study of Andalusia during the Covid-19 Pandemic

1
Research Center of Contemporary Thinking and Innovation for Social Development, University of Huelva, 21017 Huelva, Spain
2
Department of Economy, University of Huelva, 21071 Huelva, Spain
*
Author to whom correspondence should be addressed.
Sustainability 2021, 13(7), 3836; https://doi.org/10.3390/su13073836
Submission received: 24 February 2021 / Revised: 22 March 2021 / Accepted: 23 March 2021 / Published: 31 March 2021

Abstract

This paper explores the role of social media in tourist sentiment analysis. To do this, it describes previous studies that have carried out tourist sentiment analysis using social media data, before analyzing changes in tourists’ sentiments and behaviors during the COVID-19 pandemic. In the case study, which focuses on Andalusia, the changes experienced by the tourism sector in the southern Spanish region as a result of the COVID-19 pandemic are assessed using the Andalusian Tourism Situation Survey (ECTA). This information is then compared with data obtained from a sentiment analysis based on the social network Twitter. On the basis of this comparative analysis, the paper concludes that it is possible to identify and classify tourists’ perceptions using sentiment analysis on a mass scale with the help of statistical software (RStudio and Knime). The sentiment analysis using Twitter data correlates with and is supplemented by information from the ECTA survey, with both analyses showing that tourists placed greater value on safety and preferred to travel individually to nearby, less crowded destinations since the pandemic began. Of the two analytical tools, sentiment analysis can be carried out on social media on a continuous basis and offers cost savings.

1. Introduction

The use of information and communications technology (ICT) in tourism destination management has become an essential strategy in ensuring the sustainability of these destinations, making them more competitive and facilitating their long-term survival [1]. Tourism evolves quickly and constantly, throwing up new challenges that must be addressed by implementing sustainable tourism models and new ways of doing business [2].
In today’s information society, tourists are more experienced, have greater access to information, and hold greater negotiating power through the use of the latest ICT [3]. As a result, competition is growing in the tourism sector, and effective use and management of ICT are very important for tourist destinations seeking to develop sustainable forms of tourism.
In this context, big data, the internet, and social media have changed the way we travel, influencing the pre-travel phase, when we begin to think about travelling somewhere; the travel phase itself; and the post-travel phase, when we share our experiences [3].
These factors all help to influence decision-making among tourists as they plan their trips [3].
For those looking for leisure, entertainment, new destinations, and adventures, social media is the most common source of information for obtaining inspiration or opinions from other users [4].
These sources of information are also becoming key marketing resources for companies in the sector [5]. Marketing departments seek to exploit social media using natural language processing and machine learning techniques to segment campaigns, retain customers, and identify trends, among other activities. These techniques have proliferated as a result of easier access to the big data technologies used in most smart data analysis [6].
These changes make it especially important to analyse ICT management in general and social media in particular to facilitate sustainable development in tourist destinations.
Against this backdrop, this study seeks to demonstrate the potential contribution of social media in analyses of tourist behaviour and sentiments by tourist destinations. To do this, it uses social media to analyse changes in sentiments and behaviours among tourists in Andalusia (Spain). This overall objective is broken down into the following theoretical and empirical objectives.
The first theoretical objective is to analyse recent research on tourism and sentiment analysis using social media and to classify the main themes investigated. The second theoretical objective is to describe changes in tourist behaviour as a result of the COVID-19 pandemic according to research carried out on the subject.
Meanwhile, the empirical objective is to analyse changes in sentiments and behaviour among tourists visiting Andalusia in summer 2020 compared to summer 2019, using surveys carried out by the Andalusian regional authorities and comments made by tourists on Twitter.
Through these objectives, we aim to test, using surveys and sentiment analysis of the social network Twitter, the hypothesis that sentiments and behaviours among tourists visiting Andalusia changed as a result of the COVID-19 pandemic.
This analysis will demonstrate the potential for tourist destinations to use social media as a way of detecting tourist behaviours and sentiments and any short-term changes in them.
In order to fulfil these objectives, the study is divided into the following sections: following this introduction, the second section examines other studies which have used social media to perform tourist sentiment analysis and describes changes to tourism and tourists’ behaviours and sentiments in the COVID-19 era. The third section presents the case study, which focuses on analysing changes in tourist behaviour in Andalusia before and during the COVID-19 pandemic via statistical analysis and a sentiment analysis using Twitter data. The fourth and final section sets out the study’s main conclusions, identifies its limitations, and makes several recommendations for the public authorities.

2. Conceptual Framework

2.1. Use of Big Data in Tourism Destination Management

The tourism industry relies intensively on large amounts of information, and information and communications technology (ICT) is therefore of great importance to the sector from the perspective of both consumers and providers [7].
According to [8], there are three phases in the development of internet use in the tourism sector. The first phase took place in the 1990s, when the internet was used as a communication tool, and destination management organisations (DMOs) became information brokers. In the second phase, spanning the period from 2000 to 2010, the internet came to be used more for marketing than for communication. At this time, e-commerce was beginning to take off, and there was a demand for more personalised, aggregated experiences, giving rise to a new type of consumer [9]. The third phase, from 2010 onwards, has seen progress in areas such as search engines, social media, the internet of things, data analysis, and mobile technology. During this period, the concept of ‘smart tourism’ emerged to describe “the increasing reliance of tourism destinations, their industries, and their tourists on emerging forms of ICT that allow for massive amounts of data to be transformed into value propositions” [10].
Reference [11] reports the three levels of application for the concept of ‘smart tourism’ that correspond to tourist experience, business, and destination: smart experience, smart business ecosystem, and smart destination. At each level, big data is captured, exchanged, and processed.
The concept of the ‘smart tourism ecosystem’ is part of a systemic approach and refers to “tourist systems that take advantage of smart technology to create, manage, and deliver smart tourist services/experiences, which are characterised by an intense exchange of information and co-creation of value” [4,12].
The spread of information technologies throughout the travel cycle and the digital records derived from them offer a new and highly valuable source of data. This represents an opportunity in view of the sparse information available locally due to shortcomings in statistical systems for assessing movements without overnight stays [13].
Big data is the cornerstone of smart tourism destinations. Destination intelligence is powered by a smart information system allowing data to be collected, processed, and analysed to supply the necessary information at the appropriate time to the people who need the data to make informed decisions [14].
An innovative use of data offers added value by revealing connections that were previously undetectable, giving rise to a new debate around the nature of decision-making [11]. Data enable efficient management, greater transparency, and enhanced knowledge [4]. Smart data thus represent an extremely useful tool for boosting tourism competitiveness and sustainability [15]. Indeed, big data offers a more holistic, insightful overview of tourist activity, giving actors in the tourism industry the opportunity to streamline procedures, drive innovation, and deliver improved experiences [16].

2.2. Tourist Sentiment Analysis

The term ‘social media’ covers a variety of online platforms allowing users to create and share content and interact socially [17]. Several different categories of social media may be identified: social networks (Facebook and LinkedIn), blogs (Blogger and Wordpress), microblogs (Twitter and Tumblr), social news sites (Digg and Reddit), bookmarking sites (Delicious and StumbleUpon), shared media (Instagram and YouTube), question and answer sites (Yahoo! Answers and Ask.com, accessed on 24 February 2021), review sites (Yelp and TripAdvisor), and sites with mobile apps such as ‘Find My Friend’ [18].
Social media is currently one of the fastest growing marketing channels [10]. User-generated content (opinions, images, videos, etc.) and interactions between users (people, organisations and products) are the two types of information available on social media, offering large volumes of unstructured, dynamic content. This content can be analysed to generate knowledge.
According to data from TripAdvisor (2016), 77% of travellers check the comments left by former guests at the hotels they are thinking of staying at before booking [19]. Travellers have become the greatest influencers, as social media allows consumers to obtain first-hand information on the quality and prices of hotels at a click.
Sentiment analysis or opinion mining analyses people’s utterances, including opinions, feelings, evaluations, attitudes, emotions, and appraisals of products, services, organisations, individuals, subjects, events, and their attributes. The emergence and rapid growth of this field of study coincided with the boom in online social media; for the first time in history, a large volume of digitally recorded data and opinions is available [20].
With a summary of opinions, consumers can share their perceptions of certain products or experiences with potential tourists planning to purchase them. On the other hand, companies can identify the most popular and unpopular features of their products among consumers [19].
In short, consumers are no longer obliged to ask their relatives or neighbours if they are thinking about purchasing a product as they can obtain evaluations and reviews online and on social media [21]. At the same time, organisations and tourist destinations no longer need to conduct surveys or questionnaires, which take longer and require more resources [22].
However, the large number of websites and volumes of content generated demand the use of automated systems to collect and analyse the information available online [23]. Technorati estimates that 75,000 new blogs with 1.2 million posts are created each day, many of which share opinions on products and services; 60% of consumers in the USA have researched products online [24].
Digital media represents a kind of infostructure for the tourism industry, within which social media acts as a producer and distributor of active tourist information [25,26].
Most studies using mass data from social networks have focused on Twitter [27] due to the global nature of the platform and the fact that the data generated in the form of tweets are available for free in real time. Each geolocalised tweet leaves a digital footprint of the time and place when it was sent [28]. If the data are processed by user name, it is possible to draw up a space–time profile for each user showing the places they have visited at different times. Social media activity can thus be used to analyse changing population densities in a city throughout the day [29], as well as mobility patterns among the population [30].
It is also possible to use geolocalised tweets to analyse the degree of social mixing in the use of space, tracking the movements of social groups in highly segregated cities such as Rio de Janeiro [31,32].
Unlike the information supplied by official sources offering data by place of residence, the indicators of multiculturalism and mixing analysed in these studies using big data refer to the use of space throughout the day. For example, studies have examined linguistic diversity in cities and regions based on the languages used in tweets as an indicator of cultural diversity [33]. In the field of tourism studies, very few studies have used geolocalised tweets; those that have focus on comparing tourists’ spatial behaviour at the national or global scale [34,35,36], but not at the intraurban scale.
Other studies such as [37] analyse the way in which potential tourists used social media to make travel decisions during the Zika pandemic in the context of widespread disinformation where the authorities failed to provide sufficient information regarding tourism.
The work of [38] uses an automated process to analyse the cognitive, affective, and conative components of perceptions of the Basque Country as a tourist destination based on posts in the travel community www.minube.com, accessed on 24 February 2021. It concludes that the region’s natural and cultural resources have the greatest influence on its image as a tourist destination [39].
Reference [40] adopted a methodology in which information was automatically downloaded and processed, and the content of 85,000 reviews by tourists who visited Catalonia between 2004 and 2013 from four different travel websites (TripAdvisor, TravelBlog, VirtualTourist and TravelPod) was analysed [41]. The authors used a combination of online resources and open access software, concluding that this methodology can be used for different locations, languages, and topics. The study provides relevant information for destination management offices, allowing them to identify their brands’ positioning through sentiments and opinions posted by tourists on travel blogs [42].
Reference [43] tested the Destination Management Information System (DMIS) in Åre, Sweden, applying a business intelligence approach to organisational learning in tourist destinations. The system provides real-time information about indicators in three different areas: economic performance, with data on occupancy, price, stays, bookings, and sales; consumer behaviour, with data on consumer profiles, web browsing, and the purchase process; and brand management, analysing loyalty, value, satisfaction, and brand awareness [44].
A gradual evolution may be observed in the studies of tourist destinations based on content analysis and social media carried out by [45,46,47,48,49], which, although they fulfil their objective of analysing perceptions of the different components of a tourist destination’s image, remain rather “homespun” [50] in terms of the methods used to capture, clean, and process data. The work of [51,52,53] is qualitatively different, using automated processes to extract, clean, process, and analyse data.
In order to expand upon and update this body of literature, a bibliographic analysis was carried out on conceptual and empirical studies analysing tourist behaviour via social media published in the last two years (2019–2020) on the Web of Science, with a view to identifying their contributions and analytical procedures [54]. The studies covered by this bibliographic analysis are summarised in Table 1.
Table 1 shows that most recent studies performing sentiment analysis using social media in the field of tourism studies have focused on the social networks Facebook, TripAdvisor, and Twitter.
The most relevant themes identified in these studies were sentiment analysis, identification of tourist sites based on digital impressions using social media, tourist preferences harvested from social media, social media communication strategies, use of geographical labels, web platforms as a communication tool, tourism recommendation systems, cultural exposure to a foreign city through the media in particular, definition of smart tourism, and current trends.
Another theme emerging from the studies analysed was the need to detect messages with the greatest influence on purchase behaviours and contradictory messages, as well as to conduct comparative analysis of different methodologies to observe the existence of contradictory messages when different analytical methods were applied.
On the other hand, as Table 1 shows, many studies on sentiment analysis in the tourism sector are primarily methodological, and their objectives focus on data processing using different techniques, frameworks, methods, etc., for the following purposes: to detect contradictory messages, to classify different types of messages, to conduct research in a uniform, open manner, and to create a methodology for classifying the positive or negative impact of tourists’ opinions on other tourists’ decision-making.
Analysis of tourists’ sentiments and opinions to identify the characteristics of destinations, resources, services, etc., that are most important to them and enable improvements to tourism management is another, less studied theme.
This study will therefore focus on the latter, aiming to analyse the behaviours and sentiments of tourists travelling in Andalusia using the social network Twitter and to identify differences between 2019 and 2020 due to the COVID-19 pandemic. To measure these emotions, machine learning algorithms will be used to automatically extract the sentiments expressed by tourists.
However, before moving on to the sentiment analysis, the changes in the performance of the tourism sector in general and in tourist demand during the COVID-19 pandemic in particular will be described, as well as the impact of the pandemic on tourism in Andalusia.

2.3. Tourist Behaviour during the COVID-19 Era

According to the World Tourism Organisation (UNWTO), the COVID-19 pandemic has had a huge impact on the global economy, with tourism among the worst-affected sectors. The travel statistics survey showed a reduction of 49.6% in March 2020 compared to the same month the previous year. The total number of foreign visitors declined particularly dramatically, with an 85.9% drop in inbound tourism [2].
It is hoped that the tourism sector can recover and overcome these challenges, adopting tourism development strategies that encompass economic, social, and environmental aspects and encourage more sustainable tourism in the future. It is also important to understand the positive environmental impact caused by the pandemic in a relatively short space of time [2].
As previous studies have shown, tourist demand is highly sensitive to any type of risk [43]. Faced with even a minor risk, potential tourists change destination or modify their travel plans [55].
The SARS virus that emerged in China in 2002–2003 and quickly spread around the world [56] led to warnings not to travel to certain Asian countries on health grounds. This led to the loss of thousands of jobs in the Asian tourism sector [57]. A number of studies have analysed the impact of the epidemic on the sector [58].
The H1N1 influenza (swine flu) pandemic that broke out in 2009 had a significant impact on international tourism in 2010. Tourist demand decreased across all continents, with the exception of Africa and South America. Lee, Y. et al. showed that tourism declined the most in the five countries whose governments adopted the most restrictive measures to stop the virus from spreading, including quarantining patients, closing schools, cancelling public events, and controlling international borders.
Niewiadomski, P. analysed the role of the tourism industry in response to the SARS (2003) and H1N1 (2009) health crises. The MERS-CoV virus that emerged in Saudi Arabia in 2012 also had an impact on the tourism sector. The disease was particularly widespread in South Korea, which experienced a dramatic drop in tourist demand [59], especially from China, the main country of origin of tourists to South Korea [60].
The studies cited here all analysed the impact of pandemics on the tourism industry. However, studies on crisis communication management are few and far between [61]. Some studies have analysed crisis communication by public institutions and governmental bodies during health crises, highlighting the best practices adopted in these cases, although they focus on communication relating to pandemics and public health rather than the tourism sector. Ritchie, Brent W. studied the British tourist board’s crisis communication management following the 2001 health crisis.
With regard to the tourism crisis caused by the COVID-19 pandemic, there appears to be no doubt that tourists will return, as holidays are an essential expenditure for many families. The tourists travelling after the pandemic will no longer be the same, however. According to a survey carried out by Ernesto C. et al., 80% of people surveyed in April 2020 expressed a desire to travel. The main criteria in the choice of destination were low numbers of people, the characteristics of the destination, and the public health measures in place. Price was the fourth most important consideration. None of the respondents said that they would travel with organised groups, and 77% said that they would travel within Spain [62].
The July 2020 report ‘Tourism After COVID-19: Reflections, Challenges and Opportunities’ suggests a change in tourists’ behaviour following the COVID-19 pandemic: demand for less crowded destinations will grow, people will travel individually rather than in groups, demand for tourist products allowing flexible cancellation will rise, demand for hygiene and social distancing measures will grow, demand for better travel insurance covering pandemics will increase, people will eat in their accommodation instead of going to restaurants, and demand for outdoor activities will rise.
In short, an analysis of studies and reports on tourism during the COVID-19 pandemic reveals that tourists’ habits, behaviours, and sentiments have changed substantially, and many of these changes will persist into the future. This may have a significant impact on the restructuring of the sector.

3. Case Study: Tourist Sentiment Analysis in Andalusia

3.1. Tourist Behaviour in Andalusia before and during the COVID-19 Pandemic According to Survey Data

According to data from the Andalusian Tourism Situation Survey [63], tourism declined by 47.5% in the third quarter of 2020 compared to the same period in 2019, falling from 11,425,437 tourists to 6,000,293.
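The reported decline can be checked directly from the two tourist counts. The following Python sketch is purely illustrative: the helper name `percent_change` is our own and is not part of the ECTA methodology.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Tourist counts for the third quarters of 2019 and 2020, as reported above.
tourists_q3_2019 = 11_425_437
tourists_q3_2020 = 6_000_293

decline = percent_change(tourists_q3_2019, tourists_q3_2020)
print(f"{decline:.1f}%")  # -47.5%
```

The same helper applied to the average stay reported later in this section (10.1 days to 9.4 days) gives a change of roughly −7%.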
With regard to the origin of these tourists, 45.3% were from Andalusia and 42.0% were from elsewhere in Spain. Domestic tourism thus accounted for 86% of tourism in the region, whereas this figure was only 64% prior to the COVID-19 pandemic, confirming the observation made in this study that tourists prefer to visit nearby destinations. The Andalusian provinces that suffered the lowest decline in tourism during the same quarter, below the average for the region, depend to a greater extent on domestic tourism and were less crowded: Jaén (−16.2%), Cádiz (−30.5%), Huelva (−34.0%), and Almería (−44.8%). The cities of Granada and Córdoba maintained their market share from previous years, falling to around the average for Andalusia.
With regard to the tourists’ ages, the pattern was as expected, with the greatest decline in travel this quarter observed among people aged older than 65. This was followed by those aged 30–44 years and those younger than 18 years, pointing to a substantial, above-average decline in trips taken by couples with underage children.
According to [64,65], the main reason for travelling remained holidays and leisure, which was cited by around 90% of tourists, while the prevalence of visits to family and friends rose to surpass other reasons.
In terms of accommodation, stays in hotels and apartment hotels fell below the average in comparison with the third quarter of 2019, followed by stays at friends’ and relatives’ homes, hostels, guest houses, and bed and breakfasts. To a lesser extent, stays in rented apartments, second homes, and campsites also declined.
The average stay in Andalusia dropped from 10.1 days to 9.4 days (−7%). The average daily expenditure also decreased by 7%, with expenditure rising among domestic tourists and falling among international tourists.
Finally, the qualitative scores from 0 to 10 assigned by tourists to different aspects of their experience (accommodation, food, leisure, transport, safety, service, cleanliness, etc.) stayed around the same as in the third quarter of 2019. Only the following aspects received a lower score: public transport by bus (2%), public safety (2%), and public transport by train (1%). The remaining aspects received a higher score, including transport by taxi (7%), quality of beaches (6%) and natural parks (4%), car hire services (4%), cleanliness (4%), golfing facilities (4%), and ports and nautical activities (4%).
The impact of the COVID-19 pandemic is clear in these scores. While services perceived as more unsafe (public transport) scored the lowest, individual transport, cleaning services, and certain facilities that make tourists feel safer (beaches and protected natural areas) increased their scores.
The tourist behaviour identified in this analysis of the data from the Andalusian Tourism Situation Survey is perfectly aligned with the characteristics and changes in tourist demand during the COVID-19 pandemic identified in the bibliographic analysis. These characteristics include a steep decline in tourist activity, with trips to nearby, less crowded, safer destinations using private transport and accommodation preferred over shared accommodation. In addition, tourists give a higher score to aspects relating to safety, such as cleanliness, the natural environment, and certain facilities, than to riskier, less safe aspects of the tourist experience such as public transport.

3.2. Tourist Behaviour and Opinions in Andalusia According to a Twitter Sentiment Analysis

Methodological Approach

This study is based on an exploratory analysis carried out in the R programming language (using RStudio), with the rtweet package used to extract tweets. Machine learning sentiment analysis algorithms were then applied to the resulting data. Machine learning is a branch of artificial intelligence in which a model is trained on data in order to automate data analysis procedures. Among other features, it allows tweets to be classified as positive, negative, or neutral, as shown in the results.
The social network selected for this study was Twitter, due to its capacity to reach a wide audience and its anonymous nature, which have led to exponential growth on a global scale and have transformed the platform into an alternative source of information alongside more traditional media.
The user accounts used for the sentiment analysis were geolocated in Málaga, which is the main hub for tourists in Andalusia. Accounts within a 500 km radius of Málaga were included in an attempt to cover the whole region.
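The radius criterion can be illustrated with a great-circle distance check. The sketch below is written in Python rather than the R used in the study and is not the authors' code; the coordinates used for Málaga are approximate, and the names `haversine_km` and `within_radius` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Approximate coordinates of Málaga (an assumption for illustration).
MALAGA = (36.7213, -4.4214)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(lat, lon, centre=MALAGA, radius_km=500.0):
    """True if the point lies within `radius_km` of `centre`."""
    return haversine_km(lat, lon, centre[0], centre[1]) <= radius_km


print(within_radius(37.3891, -5.9845))  # Seville (approximate coordinates)
print(within_radius(48.8566, 2.3522))   # Paris (approximate coordinates)
```

A circle of this size necessarily extends beyond Andalusia. Madrid, for instance, lies roughly 415 km from Málaga and would pass this check, so an additional regional filter may be needed if only Andalusian locations are wanted.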
To extract the data, we connected to Twitter’s open API (Application Programming Interface), which allows applications to be developed that take advantage of the information available online. In this way, it was possible to perform a search on Twitter and compile all the messages linked to certain terms, which acted as filters. These terms were ‘my trip’, ‘my experience’, ‘my holidays’, ‘as a tourist’, ‘as a visitor’, and ‘as a traveller’. The extraction itself was carried out in the R programming language. Table 2 details the process:
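As a rough illustration of how such a filtered search might be assembled, the Python sketch below builds a query string that joins the six phrases with `OR` and a geocode string in the `latitude,longitude,radius` form accepted by Twitter's standard search. It performs no network request. The helper names and the coordinates are assumptions; the authors' actual rtweet call is not given in the text, although in rtweet strings like these would typically be passed to `search_tweets()` through its `q` and `geocode` arguments.

```python
# Filter terms as listed in the text.
SEARCH_TERMS = [
    "my trip",
    "my experience",
    "my holidays",
    "as a tourist",
    "as a visitor",
    "as a traveller",
]


def build_query(terms):
    """Quote each phrase for exact matching and join the phrases with OR."""
    return " OR ".join(f'"{term}"' for term in terms)


def build_geocode(lat, lon, radius_km):
    """Format a 'latitude,longitude,radius' string with the radius in kilometres."""
    return f"{lat},{lon},{radius_km}km"


query = build_query(SEARCH_TERMS)
geocode = build_geocode(36.7213, -4.4214, 500)  # approximate centre of Málaga (assumed)
print(query)
print(geocode)
```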
The tweets that were extracted contained comments and interactions by users from Andalusia about tourism in Andalusia. The data collection process was divided into two phases: (a) phase 1, in August, September, October, and November 2019 and (b) phase 2 in August, September, October, and November 2020.
A descriptive analysis of the tweets collected in 2019 and 2020 was then carried out. Once the data had been cleaned and filtered, they were processed using the statistical software Knime. This data mining platform facilitates the tasks of data analysis, modelling, processing, and visualisation.
On the Knime platform, modelling is carried out in process blocks which can be executed separately to reduce processing time. The model is presented in Figure 1 and contains the following phases: (a) the xls file is read; (b) in the Document Creation block, the file is converted to text; (c) the column to be evaluated is selected; (d) the Text Preprocessing block cleans the data before they are divided by relevant words and classified. Once the data have been processed, they are assigned a colour for the Analyse Network phase, in which the data matrix is divided into training data and testing data. In Knime, the algorithm consists of a learner and a predictor; once the data have been processed, one block must be added to plot the data and another to display the results.
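The learner and predictor pattern described above can be sketched outside Knime. The text does not state which classification algorithm was used, so the Python example below substitutes a small multinomial Naive Bayes classifier with add-one smoothing purely for illustration. The toy tweets, labels, and function names are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into training and testing partitions."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


def learn(train_rows):
    """Learner: collect per-class document and word counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in train_rows:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Predictor: return the class with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, already tokenised training examples.
toy = [
    (["great", "beach", "holiday"], "Pos"),
    (["lovely", "sun", "beach"], "Pos"),
    (["cancelled", "trip", "awful"], "Neg"),
    (["awful", "queue", "closed"], "Neg"),
]
model = learn(toy)
print(predict(model, ["lovely", "beach"]))
```

In practice the model would be fitted on the training partition returned by `train_test_split` and evaluated on the held-out testing partition.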
The database obtained allows for quick classification using filters for the following variables: user name, message (tweet), date and time, latitude, longitude, favourite, retweeted, and retweeted from.
Figure 2 shows data preprocessing on the platform, in which punctuation marks and numbers are removed and all letters are converted to lowercase. The connector words (stop words), uploaded in a list in advance, are then removed. Finally, the text is tokenised. It is important to note that the tool uses several process blocks to complete this action.
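The preprocessing steps can be approximated in a few lines of Python. This is a sketch only and does not reproduce the Knime blocks: the stop-word list is a tiny illustrative stand-in for the list the authors uploaded, and the function name `preprocess` is our own. For simplicity the text is split into tokens before the stop words are filtered out.

```python
import re

# Tiny illustrative stop-word list; the list used in the study is not given.
STOP_WORDS = {"the", "a", "to", "in", "of", "and", "my", "was"}


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, strip punctuation and digits, tokenise, and drop stop words."""
    text = text.lower()
    text = re.sub(r"[^\w\s]|_", " ", text)  # replace punctuation with spaces
    text = re.sub(r"\d+", " ", text)        # remove numbers
    tokens = text.split()                   # whitespace tokenisation
    return [tok for tok in tokens if tok not in stop_words]


print(preprocess("My trip to Málaga in 2020 was GREAT!!!"))
# ['trip', 'málaga', 'great']
```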

4. Materials and Methods

Results of Analysis

Once the methodological process of data collection, extraction, filtering, and cleaning was complete, a sentiment analysis was performed, and the average veracity was obtained using Knime. This task consisted of assigning an overall polarity to each tweet on a three-level scale: negative, neutral, and positive (Neg, Neu, and Pos). The first set of data analysed contains 11,532 tweets from 2019, among which 21% were classified as neutral, 73% as positive, and 6% as negative. This three-class classification of tourists’ opinions is shown in Table 3.
The table above shows the analysis of the first set of data. From a positive starting point, the sentiment varies on the basis of the tourist’s experiences, emerging news stories and other factors, predicting future sentiment among tourists. The scores indicate the polarity of the sentiment. These classifications allow us to observe that a high and low sentiment leads to a sentiment of 4 or 5, whereas a moderately low sentiment but very negative high sentiment leads to a final sentiment of −4 or −5. Cases in which the final sentiment is 5 (positive) tend to be characterised by moderately high and low sentiment over time and a volume of positive news.
Figure 3 shows that the words most commonly used by tourists on Twitter include ‘Málaga’ and ‘Benidorm’ as destinations, followed by ‘Spain’. ‘Beach’ and ‘holidays’ are mentioned as activities, with comments revolving around ‘sun’, ‘people’, ‘sea’, and ‘sand’.
Table 4 shows the results of the analysis of tourist sentiment expressed from July to October 2020 based on the second set of data. This set contains 14,000 tweets, among which 12% were classified as neutral, 30% as positive, and 58% as negative.
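The shift between the two periods can be made explicit with simple arithmetic on the reported shares. The Python sketch below uses only the percentages and tweet totals given in the text; the function names are illustrative, and because the published percentages are rounded, the resulting tweet counts are approximations rather than figures reported by the authors.

```python
# Shares (in %) and totals as reported for the 2019 and 2020 data sets.
SHARES_2019 = {"Neu": 21, "Pos": 73, "Neg": 6}
SHARES_2020 = {"Neu": 12, "Pos": 30, "Neg": 58}
TOTAL_2019 = 11532
TOTAL_2020 = 14000


def approx_counts(shares, total):
    """Approximate number of tweets per class from rounded percentage shares."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


def point_change(before, after):
    """Change in each class share, in percentage points."""
    return {label: after[label] - before[label] for label in before}


counts_2019 = approx_counts(SHARES_2019, TOTAL_2019)
counts_2020 = approx_counts(SHARES_2020, TOTAL_2020)
change = point_change(SHARES_2019, SHARES_2020)
```

On these figures, the negative share rises by 52 percentage points (from 6% to 58%), while the positive share falls by 43 points (from 73% to 30%).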
As Table 4 shows, a greater number of negative sentiments were identified during the COVID-19 era. A word cloud indicating the words most commonly used during the third quarter of 2020 is shown in Figure 4.
The words most commonly used by tourists on Twitter during the pandemic were identified, with ‘Granada’ appearing as a new destination alongside new words relating to different types of activities such as ‘culture’, ‘city’, and ‘mountain’. Other words appearing in the tourists’ tweets included ‘travel’, ‘experiences’, and ‘tourism’, as well as indications of change. These are the most common themes across all sentiments expressed by tourists on the social network.
Table 5 shows that tourists commented on ‘sun’, ‘beach’, and, to a lesser extent, ‘cultural tourism’. These results were as expected for a beach destination in summer. The destinations with the most classified comments were ‘Cádiz’, ‘Huelva’, ‘Málaga’, ‘Almería’, and ‘Almuñécar’.
During the pandemic, the most commonly used expressions revolved around culture and the adaptation of tourism and leisure to the new conditions imposed by COVID-19. These expressions are shown in Table 6.
The following matrix (Table 7) shows the most popular words in 2019 and the relationship between them. The words listed appear at least ten times in the tweets.
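A word relationship matrix of this kind can be sketched as follows. The paper does not state how the relationship between words was measured, so this Python example assumes the simplest option: counting the number of tweets in which two frequent words appear together. The function name `cooccurrence` and the toy data are hypothetical; only the threshold of ten occurrences comes from the text.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(token_lists, min_count=10):
    """Build a symmetric co-occurrence matrix for words seen at least min_count times.

    token_lists holds one list of cleaned tokens per tweet.
    Returns (vocab, matrix), where matrix[i][j] is the number of tweets
    containing both vocab[i] and vocab[j].
    """
    freq = Counter(tok for tokens in token_lists for tok in tokens)
    vocab = sorted(word for word, n in freq.items() if n >= min_count)
    index = {word: i for i, word in enumerate(vocab)}

    matrix = [[0] * len(vocab) for _ in vocab]
    for tokens in token_lists:
        present = sorted(set(tokens) & index.keys())   # count each word once per tweet
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix


# Toy input: 'sun' occurs only three times, so it falls below the threshold.
tweets = [["beach", "sand"]] * 10 + [["beach", "sun"]] * 3
vocab, matrix = cooccurrence(tweets)
```

Other association measures (for example, correlation between term frequencies) would produce a matrix of the same shape with different values.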
Figure 5 shows the words most used by the tourists in their tweets: ‘beach’, ‘sand’, ‘summer’, ‘holiday’, ‘people’, ‘Cádiz’, ‘Málaga’, ‘Huelva’, and ‘Almería’.
The most popular words appearing at least ten times in the tweets and the relationship between them were also identified for 2020. They included: ‘culture’, ‘Seville’, ‘city’, ‘Mezquita’, ‘Málaga’, ‘Córdoba’, and ‘Granada’. This is shown in Table 8 and Figure 6.
The comparative analysis of the sentiments displayed by travellers in Andalusia based on tweets from the two periods under study shows that negative sentiments became more prevalent during the COVID-19 pandemic, as did references to inland and city destinations (Córdoba, Granada, Málaga, and Seville). During the previous period, there were more references to coastal destinations (Cádiz, Almería, Málaga, and Huelva), especially long-standing, mass tourism destinations such as Málaga.
Other words that were more common prior to COVID-19 were ‘beach’ and ‘sand’, confirming that beach holidays were predominant during that quarter in 2019. Meanwhile, the words appearing most often during the COVID-19 period were ‘cultural’ and ‘city’, demonstrating the need for the tourism sector to adapt to the health crisis and to the rise in travel to less crowded, inland, and city destinations on both the demand and supply side.
Analysis of tourists’ behaviours and sentiments using social media data provides similar, complementary information to the survey data.
Both analytical tools highlight the increased importance of cultural tourism and visits to less crowded destinations, including cities (Granada, Málaga, and Córdoba) and inland destinations (mountains, natural parks), during the COVID-19 period in comparison with the previous summer. In summer 2020, there were far fewer references to ‘sand’, ‘sun’, and ‘beach’. Words such as ‘Benidorm’, ‘Spain’, ‘pool’, ‘beach’, ‘sun’, ‘sand’, etc., which are associated with mass beach tourism, were largely absent in 2020.
On the other hand, the sentiment analysis indicates that tourists travelling in Andalusia expressed more negative sentiment in summer 2020 than in the previous year. This is apparent from the survey, with lower scores for certain aspects of the tourist experience (public transport and tourist accommodation especially) and from some of the ten most widely cited comments, which highlight the experience of visiting less crowded destinations.

5. Conclusions

In terms of the first theoretical objective, this paper describes the recent rise in studies performing sentiment analysis using social media data in relation to the tourism sector. The majority of the articles published on the topic in the Web of Science in the last two years have focused on methodology, although studies classifying tourists’ opinions of certain resources, destinations or experiences are also common. This study falls into the second category.
With regard to the second theoretical objective, this study has shown that COVID-19 has seriously affected international tourism in terms of numbers of trips and tourist behaviour and sentiments, with a greater impact than other health crises. Tourists have begun to demand safer, healthier destinations, and it is likely that these changes will persist in the future.
As for the empirical objectives, a case study has been used to show how sentiment analysis can be used to supplement or even replace surveys as a tool for analysing tourists’ behaviour and opinions. This type of analysis offers cost savings and reveals tourists’ behaviour and opinions on a continuous basis in real time, as shown by authors such as [22].
In the Andalusian case study, both the surveys and the sentiment analysis using Twitter data show how COVID-19 has changed the behaviour and sentiments of tourists travelling in the region. By combining the data from both sources, it is apparent that tourism declined in Andalusia in summer 2020, especially in crowded beach destinations.
This analysis demonstrates the need for public and private stakeholders in the Andalusian tourism sector to promote the region as a safe destination and to implement strategies and measures to enhance safety, such as monitoring visitor flows to certain locations, maintaining social distancing, and cleaning facilities and infrastructures, etc. These measures should be more visible in mass tourism destinations, as they are considered less safe by visitors and are at greater risk of declining numbers of arrivals during the pandemic.
An analysis of the data from the two tools offers complementary information. Whereas the survey provides information on behavioural changes in relation to quantitative variables (visitor numbers, average stays, average expenditure, most visited provinces and destinations, etc.), the sentiment analysis reveals subjective utterances and emotions and classifies them as positive, negative or neutral. It also allows us to analyse changes in these opinions on a continuous basis over time at a low cost.
Indeed, this study demonstrates the value of a simple exploratory data analysis in obtaining important information on potential causes of problems, changes in demand, etc. Data visualisation using algorithms, tables, word clouds, and simple graphs is a key element of this exploratory analysis, allowing us to detect possible changes in tourists’ behaviour, opinions, and sentiments. Moreover, this type of analysis is inexpensive and affordable for small-scale tourist destinations due to the availability of free software, such as RStudio and Knime, which makes data analysis and graphic representation accessible to any individual or organisation.
This article aims to help fill the gap in the literature on the use of social media by tourist destinations to analyse tourist behaviours and sentiments. This area of study has huge potential for growth in the coming years, given the significant progress made in the use of big data and social media [66,67].
The study also contributes to a greater understanding of changes in behaviours and sentiments among tourists visiting Andalusia as a result of the COVID-19 pandemic, using a combination of survey data and analysis of comments by tourists on the social network Twitter.
With regard to the limitations of the study, opinions were only analysed using one social network: Twitter. Although this is one of the most widely used social networks, it would be interesting to compare the data with other social networks such as Facebook or Instagram. This is a potential area for further research, which could be extended by comparing information and content on social media with news from traditional media outlets to observe differences in the handling of themes, coverage, and other aspects.

Author Contributions

Conceptualization, D.F.-R. and A.E.-S.; methodology, D.F.-R. and A.E.-S.; software, A.E.-S.; validation, D.F.-R., A.E.-S. and M.d.l.O.B.-G.; formal analysis, D.F.-R. and A.E.-S.; investigation, A.E.-S.; resources, A.E.-S.; data curation, A.E.-S.; writing—original draft preparation, D.F.-R. and A.E.-S.; writing—review and editing, D.F.-R., A.E.-S. and M.d.l.O.B.-G.; visualization, D.F.-R. and A.E.-S.; supervision, D.F.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Roman, M.; Niedziółka, A.; Krasnodębski, A. Respondents’ Involvement in Tourist Activities at the Time of the COVID-19 Pandemic. Sustainability 2020, 12, 9610.
  2. Kitamura, Y.; Karkour, S.; Ichisugi, Y.; Itsubo, N. Evaluation of the Economic, Environmental, and Social Impacts of the COVID-19 Pandemic on the Japanese Tourism Industry. Sustainability 2020, 12, 302.
  3. Valdivia, P.A.; Arteaga, P.L.; Escortel, M.E.; Monge, C.S.; Villares, R.J. Analysis of complaints in primary care using statistical process control. Rev. Calid. Asist. 2009, 24, 155–161.
  4. Abascal, M.R.; Lopez, O.E.; Zepeda, H.S. Análisis cualitativo para la detección de factores que afectan el rendimiento escolar: Estudio de caso de la licenciatura en tecnologías y sistemas de información. Pist. Educ. 2016, 38, 252–267.
  5. Martín, A.C.; Aguilar, R.M.; Torres, J.M.; Diaz, S. Supervisión remota en el entrenamiento de un clasificador de sentimientos en comentarios turísticos. In Proceedings of the XXXIX Jornadas de Automática, Área de Ingeniería de Sistemas y Automática, Universidad de Extremadura, Badajoz, Spain, 5–6 September 2018; pp. 644–650.
  6. Rodriguez, G.P.; Molina, M.M. La segmentación de la demanda turística Española. Metodol. Encuestas 2007, 9, 57–92.
  7. Badar, N.A. Stock markets’ reaction to COVID-19: Cases or fatalities? Res. Int. Bus. Financ. 2020, 54, 101249.
  8. Zheng, X.; Daniel, R.F. Big Data Analytics, Tourism Design and Smart Tourism. In Analytics in Smart Tourism Design; Springer: Cham, Switzerland, 2016; pp. 299–307.
  9. Güçlü, B.; Roche, D.; Marimon, F. City Characteristics That Attract Airbnb Travellers: Evidence from Europe. Int. J. Qual. Res. 2020, 14, 271–290.
  10. Tim, F.; Garth, G. Shingled magnetic recording: Areal density increase requires new data management. Login Mag. USENIX SAGE 2013, 38, 22–30.
  11. Zhang, D.; Hu, M.; Ji, Q. Financial markets under the global pandemic of COVID-19. Financ. Res. Lett. 2020, 36, 101528.
  12. Valdivia, A.; Hrabova, E.; Chaturvedi, I.; Luzón, M.V.; Troiano, L.; Cambria, E.; Herrera, F. Inconsistencies on TripAdvisor reviews: A unified index between users and Sentiment Analysis Methods. Neurocomputing 2019, 353, 3–16.
  13. Aguilar, G.N.; Romero, G.L.; Martinez, G.E.; Garcia, S.E.; Aguilar, A.J. Dataset on Dynamics of Coronavirus on Twitter; Elsevier: Amsterdam, The Netherlands, 2020; pp. 1–14.
  14. Gonzalez, B.B. Evaluando Twitter como indicador de opinión pública: Una mirada al arribo de Bachelet a la presidencial chilena 2013. Rev. SAAP Publicación Cienc. Política Soc. Argent. Análisis Político 2015, 9, 119–141.
  15. Ivars, B.J.; Solsona, F.J.M.; Giner, S.D. Gestión turística y tecnologías de la información y la comunicación (TIC): La nueva perspectiva de los destinos inteligentes. Doc. Anal. Geogr. 2016, 62, 327–346.
  16. Niewiadomski, P. COVID-19: From temporary de-globalisation to a rediscovery? Tour. Geogr. 2020, 22, 651–656.
  17. Gerbasi, A.; Porath, C.L.; Parker, A.; Spreitzer, G.; Cross, R. Destructive de-energizing relationships: How thriving buffers their effect on performance. J. Appl. Psychol. 2015, 100, 1423–1433.
  18. Martinez, T.M.; Toral, S. A machine learning approach for the identification of the deceptive reviews in the hospitality sector using unique attributes and sentiment orientation. Tour. Manag. 2019, 75, 393–403.
  19. Criado, J.I.; Pastor, V.; Villodre, J. Big Data y Administraciones Públicas en Redes Sociales. Colección Novagob 2018, 3, 1–30.
  20. Cotelo, J.M.; Cruz, F.; Ortega, F.J.; Troyano, J.A. Explorando Twitter mediante la integración de información estructurada y no estructurada. Proces. Leng. Nat. 2015, 55, 75–82.
  21. Fondevila, G.J.; Liberal, O.S.; Gutierrez, A.Ó. Semantic analysis in social media. Comun. Rev. Recer. d’Anàlisi 2019, 36, 71–94.
  22. Frias, M.V.; Soto, V.; Hohwald, H.; Frias, M.E. Characterizing urban landscapes using geolocated tweets. In Proceedings of the 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, Amsterdam, The Netherlands, 3–5 September 2012; pp. 1–10.
  23. Martin, C.A.; Torres, J.; Aguilar, R.M.; Dias, S.D. Using Deep Learning to Predict Sentiments: Case Study in Tourism. Hindawi 2018, 2018, 7408431.
  24. Won, S.S.; Kawahara, M.; Ahn, C.W.; Lee, J.; Lee, J.; Jeong, C.; Kingon, A.I.; Seung, H.K. Lead-Free Bi0.5(Na0.78K0.22)TiO3 Nanoparticle Filler–Elastomeric Composite Films for Paper-Based Flexible Power Generators. Adv. Electron. Mater. 2020, 6, 1900950.
  25. Femenia, S.F.; Neuhofer, B. Smart tourism experiences: Conceptualisation, key dimensions and research agenda. Investig. Reg. J. Reg. Res. 2018, 42, 129–150.
  26. Fondevila, G.J.; Rom, R.J.; Santana, L.E. Comparativa internacional del uso de recursos digitales en el periodismo digital deportivo: Estudio de caso de España y Francia. Rev. Lat. Comun. Soc. 2016, 71, 124–140.
  27. Murthy, S.B.; Karanth, S.; Shah, S.; Shastri, A.; Rao CP, V.; Bershad, E.M.; Suarez, J.I. Thrombolysis for Acute Ischemic Stroke in Patients With Cancer A Population Study. Stroke 2013, 44, 3573–3576.
  28. Moreno, O.A.; Salles, S.B.; Orrequia, B.A. Design and validation of annotation schemas for aspect-based sentiment analysis in the tourism sector. Inf. Technol. Tour. 2019, 21, 535–557.
  29. Ciuccarelli, M.C. Etruscan Tombs in a “Roman” City: The Necropolis of Caere between the Late Fourth and the First Century B.C.E. Etruscan Ital. Stud. 2015, 18, 200–210.
  30. Zheng, Y.; Goh, E.; Wen, J. The effects of misleading media reports about COVID-19 on Chinese tourists’ mental health: A perspective article. Int. J. Tour. Hosp. Res. 2020, 31, 337–340.
  31. Netto, G.; Bhopal, R.; Lederle, N.; Khatoon, J.; Jackson, A. How can health promotion interventions be adapted for minority ethnic communities? Five principles for guiding the development of behavioural interventions. Health Promot. Int. 2010, 25, 248–257.
  32. Shelton, T.; Poorthuis, A.; Zook, M. Social media and the city: Rethinking urban socio-spatial inequality using user-generated geographic information. Landsc. Urban Plan. 2015, 142, 198–211.
  33. Mocanu, D.; Baronchelli, A.; Perra, N.; Gonçalves, B.; Zhang, Q.; Vespignani, A. The Twitter of Babel: Mapping World Languages through Microblogging Platforms. PLoS ONE 2013, 8, e61981.
  34. Bassols, N.M.; Castelló, J.V. Effects of the great recession on drugs consumption in Spain. Econ. Hum. Biol. 2016, 22, 103–116.
  35. Hawelka, D.; Stollenwerk, J.; Pirch, N.; Wissenbach, K.; Loosen, P. Improving surface properties by laser-based drying, gelation, and densification of printed sol–gel coatings. J. Coat. Technol. Res. 2014, 11, 3–10.
  36. Sobolevsky, S.; Bojic, I.; Belyi, A.; Sitko, I.; Hawelka, B.; Arias, J.M.; Ratti, C. Scaling of city attractiveness for foreign visitors through big data of human economical and social media activity. In Proceedings of the 2015 IEEE International Congress on Big Data, New York, NY, USA, 27 June–2 July 2015; pp. 600–607.
  37. Gui, Z.; Saravanamurugan, S.; Cao, W.; Schill, L.; Chen, L.; Qi, Z.; Riisager, A. Highly Selective Aerobic Oxidation of 5-HydroxymethylFurfural into 2,5-Diformylfuran over Mn–Co Binary Oxides. ChemistrySelect 2017, 2, 6632–6639.
  38. Serna, M.Á.; Sreenan, C.J.; Fedor, S. A Visual Programming Framework for Wireless Sensor Networks in Smart Home Applications. In Proceedings of the 2015 IEEE Tenth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), Singapore, 7–9 April 2015; pp. 1–6.
  39. Valdivia, A.; Martinez-Camara, E.; Chaturvedi, I.; Luzón, M.V.; Cambria, E.; Ong, Y.S.; Herrera, F. What do people think about this monument? Understanding negative reviews via deep learning, clustering and descriptive rules. J. Ambient Intell. Humaniz. Comput. 2020, 11, 39–52.
  40. Marine-Roig, E.; Clavé, S.A. Tourism analytics with massive user-generated content: A case study of Barcelona. J. Destin. Mark. Manag. 2015, 4, 162–172.
  41. De Núñez, X.; Maria, X.; Núñez-Valdez, E.R.; Pascual Espada, J.; González-Crespo, R.; Garcia-Díaz, V. A proposal for sentiment analysis on twitter for tourism-based applications. In Proceedings of the 17th International Conference on New Trends in Intelligent Software Methodology Tools and Techniques, SoMeT 2018, Granada, Spain, 26 September 2018.
  42. Del Mar Gálvez-Rodríguez, M.; Alonso-Cañadas, J.; Haro-de-Rosario, A.; Caba-Pérez, C. Exploring best practices for online engagement via Facebook with local destination management organisations (DMOs) in Europe: A longitudinal analysis. Tour. Manag. Perspect. 2020, 34, 100636.
  43. Fuchs, M.; Höpken, W.; Lexhagen, M. Big data analytics for knowledge generation in tourism destinations–A case from Sweden. J. Destin. Mark. Manag. 2014, 3, 198–209.
  44. Murga, J.; Zapata, G.; Chavez, H.; Raymundo, C.; Rivera, L.; Domínguez, F.; Moguerza, J.M.; Alvarez, J.M. A Sentiment Analysis Software Framework for the Support of Business Information Architecture in the Tourist Sector. J. Subline 2020, 12390, 199–219.
  45. Pan, Y.; Liu, L.; Zhao, H. Recyclable flame retardant paper made from layer-by-layer assembly of zinc coordinated multi-layered coatings. Cellulose 2018, 25, 5309–5321.
  46. Stepchenkova, S.; Kirilenko, A.P.; Morrison, A.M. Facilitating Content Analysis in Tourism Research. J. Travel Res. 2009, 47, 454–469.
  47. Fedele, L.; Colla, L.; Bobbo, S. Viscosity and thermal conductivity measurements of water-based nanofluids containing titanium oxide nanoparticles. Int. J. Refrig. 2012, 35, 1359–1366.
  48. Dickinger, A.; Stangl, B. Website performance and behavioral consequences: A formative measurement approach. J. Bus. Res. 2013, 66, 771–777.
  49. Rossetti, L.; Digiuni, M.; Montesano, G.; Centofanti, M.; Fea, A.M.; Iester, M.; Frezzotti, P.; Figus, M.; Ferreras, A.; Oddone, F.; et al. Blindness and Glaucoma: A Multicenter Data Review from 7 Academic Eye Clinics. PLoS ONE 2015, 10, e0136632.
  50. Lu, W.; Stepchenkova, S. User-Generated Content as a Research Mode in Tourism and Hospitality Applications: Topics, Methods, and Software. J. Hosp. Mark. Manag. 2015, 24, 119–154.
  51. McKercher, B. A case for ranking tourism journals. Tour. Manag. 2005, 26, 649–651.
  52. Rodríguez, M.R.; Martín, J.M.B.; Rubín, M.J.D. La Estrategia de Turismo Sostenible de Andalucía: Elementos fundamentales en el marco de la planificación turística subregional andaluza. Rev. Estud. Reg. 2013, 97, 77–111.
  53. Redondo, R.Q.; Oliva, M.P.; Gonzalo, S.B. El uso de la imagen en Twitter durante la campaña electoral municipal de 2015 en España. Rev. Lat. Comun. Soc. 2016, 71, 85–107.
  54. Baggio, R. Network science and tourism—The state of the art. Tour. Rev. 2017, 72, 120–131.
  55. Araña, J.E.; Leon, C.J. Understanding the use of non-compensatory decision rules in discrete choice experiments: The role of emotions. Ecol. Econ. 2009, 68, 2316–2326.
  56. Varia, M.; Wilson, S.; Sarwal, S.; McGeer, A.; Gournis, E.; Galanis, E.; Henry, B.; Hospital Outbreak Investigation Team. Investigation of a nosocomial outbreak of severe acute respiratory syndrome (SARS) in Toronto, Canada. CMAJ 2003, 169, 285–292.
  57. Zhu, Y.; Fu, K.-W. Speaking up or staying silent? Examining the influences of censorship and behavioral contagion on opinion (non-) expression in China. New Media Soc. 2020, 1–22.
  58. Lee, Y.; Bang, S.; Lee, I.; Kim, Y.; Kim, G.; Ghaed, M.; Pannuto, P.; Dutta, P.; Sylvester, D.; Blaauw, D. A modular 1 mm3 die-stacked sensing platform with low power I2C inter-die communication and multi-modal energy harvesting. IEEE J. Solid-State Circuits 2012, 48, 229–243.
  59. Jung, E.H.; Jeon, N.J.; Park, E.Y.; Moon, C.S.; Shin, T.J.; Yang, T.-Y.; Noh, J.H.; Seo, J. Efficient, stable and scalable perovskite solar cells using poly (3-hexylthiophene). Nature 2019, 567, 511–515.
  60. Shi, S.; Wang, Z.; Shi, J.; Wang, X.; Li, H. From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network. J. Latex Class Files 2020, 14, 1–17.
  61. Jha, D.; Ward, L.; Paul, A.; Liao, W.-K.; Choudhary, A.; Wolverton, C.; Agrawal, A. Elemnet: Deep learning the chemistry of materials from only elemental composition. Sci. Rep. 2018, 8, 17593.
  62. Ritchie, W.B. Chaos, crises and disasters: A strategic approach to crisis management in the tourism industry. Tour. Manag. 2004, 25, 669–683.
  63. Canalis, E.; Grossman, T.R.; Carrer, M.; Schilling, L.; Yu, J. Antisense oligonucleotides targeting Notch2 ameliorate the osteopenic phenotype in a mouse model of Hajdu-Cheney syndrome. J. Biol. Chem. 2020, 295, 3952–3964.
  64. Ahmar, A.S.; Del Val, E.B. SutteARIMA: Short-term forecasting method, a case: Covid-19 and stock market in Spain. Sci. Total Environ. 2020, 729, 138883.
  65. ECTA. Encuesta de Coyuntura Turística de Andalucía; Instituto de Estadística y Cartografía de Andalucía: Sevilla, Spain, 2020.
  66. Miedes-Ugarte, B.; Flores-Ruiz, D.; Wanner, P. Managing Tourist Destinations According to the Principles of the Social Economy: The Case of the Les Oiseaux de Passage Cooperative Platform. Sustainability 2020, 12, 4837.
  67. Perogil Burgos, J. Turismo solidario y turismo responsable, aproximación a su marco teórico y conexiones con la inteligencia territorial. Rev. Iberoam. Econ. Solidar. Innovación Socioecológica (RIESISE) 2018, 1, 23–48.
Figure 1. Knime modelling. Source: elaborated by authors.
Figure 2. Data processing model. Source: elaborated by authors.
Figure 3. Word cloud showing words used by tourists in 2019. Source: elaborated by authors.
Figure 4. Word cloud showing words used by tourists on Twitter in the third quarter of 2020. Source: elaborated by authors.
Figure 5. Most popular words in tweets in 2019. Source: elaborated by authors.
Figure 6. Most popular words in tweets in 2020. Source: elaborated by authors.
Table 1. Studies using sentiment analysis in tourism research (2019–2020).
Authors | Title | Objective | Methodology
[42] | Exploring best practices for online engagement via Facebook with local destination management organisations (DMOs) in Europe: a longitudinal analysis | To supply evidence of a positive trend in online engagement among tourists. (Tourists) | The study is based on the use of Facebook pages by the destination management organisations in question.
[9] | City characteristics that attract AirBnB travellers: evidence from Europe | To determine the characteristics prioritised by customers and draw up a typology of cities from the traveller’s perspective. (Tourists) | Data collection and most of the analysis were carried out using R, a very flexible method and trend programme offering specific packages for data capture and mining.
[39] | What do people think about this monument? Understanding negative reviews via deep learning, clustering, and descriptive rules | To collate negative opinions about three cultural monuments to detect the characteristics in need of improvement. (Tourists) | A deep learning method based on a CNN and SD methods for aggregating information was used.
[44] | Business information architecture for successful project implementation based on sentiment analysis in the tourist sector | To provide an architecture of principles, a strategy to meet the needs of tourism companies in the Peruvian market, so that when problems arise in tourism management processes, there are good practices available to improve these processes and develop technological solutions. (Methodological) | Due to the current situation of tourism companies and the use of cloud services, Google and services such as Cloud Data Store API and Machine Learning are used as a case study due to the need for a platform for developing the solution.
[18] | A machine learning approach for the identification of deceptive reviews in the hospitality sector using unique attributes and sentiment orientation | To identify differences and characteristics allowing deceptive and truthful reviews to be successfully classified using a text-based machine learning approach. (Methodological) | A text-based machine learning approach provides an automatic tool capable of processing a large volume of reviews.
[28] | Design and validation of annotation schemas for aspect-based sentiment analysis in the tourism sector | To compile a bilingual corpus (Spanish-English) of user opinions in the Andalusian tourism sector, provisionally entitled SentiTur. (Methodological) | Tourist destinations were downloaded from the TripAdvisor website using a custom scraper built from the infrastructure provided by the Scrapy tool.
[12] | Inconsistencies in TripAdvisor reviews: a unified index between users and sentiment analysis methods | The study analyses opinions in six reviews of Italian and Spanish monuments and detects inconsistencies between sentiment analysis methods and user polarity methods that automatically extract polarities. (Methodological) | TripAdvisor is used as a data source. Results showing inconsistencies between polarities are presented, before the Polarity Aggregation Model is proposed to address this issue, and its outcomes are assessed using an aspect extraction approach.
[21] | Semantic analysis in social media for digital tourism communication | To establish a methodology for ascertaining whether consumer opinion has a positive or negative effect on recommending tourism services and attracting customers. (Methodological) | The quantitative part of the study consists of quantifying and comparing data from tourism companies in terms of numbers of followers, likes, comments, and shares on the social network Facebook.
[41] | A proposal for sentiment analysis on Twitter for tourism-based applications | To create a structure based on independent, interchangeable components to allow research to be conducted in a more uniform, open, and transparent manner. (Methodological) | The study focuses on comments about hotels, proposing a platform that classifies tweets as positive, negative or neutral based on the author’s opinion.
[23] | Using deep learning to predict sentiments: a case study in tourism | To use different deep learning techniques and architectures to address the issue of classifying comments posted by tourists online, which are used by other tourists to inform their decision-making. (Methodological) | To extract the information, scripts were developed in Python based on the Scrapy framework, and information from reviews of hotels on the island of Tenerife in English was extracted from the websites.
Source: elaborated by authors.
Table 2. Data extraction strategies.
Search word | “mi viaje”, “mi experiencia”, “mis vacaciones”, “como turista”, “como visitante”, “como viajero”
Number of tweets 2020 | 14,000
Number of tweets 2019 | 11,532
Total no. of tweets analysed | 25,532
Subject area | ALL
Text cleaning | (“(RT|via)((?:\\b\\W*@\\w+)+”@\\w+”“http\\w+”“[í½,~€Ä|ÃáÃ3¾]”)
Period | August, September, October and November 2019; August, September, October and November 2020
Language | English, Spanish
Query string | busqueda <- searchTwitter("Andalucia", geocode = "36.72016,-4.42034,500km", n = 100000, lang = "es")
Search date | October 2020
Source: elaborated by authors.
Table 3. Classification of sentiments in 2019.
Class | Sentiments
Neu | 21%
Pos | 73%
Neg | 6%
Total | 100%
Source: elaborated by authors.
Table 4. Classification of sentiments in 2020.
Class | Sentiments
Neu | 12%
Pos | 30%
Neg | 58%
Total | 100%
Source: elaborated by authors.
Table 5. Tourists’ opinions via Twitter (2019).
01 | #My experience, if you want beautiful beaches, Cádiz and Huelva, personally I prefer Cádiz.
02 | @As a tourist I recommend the area of Conil and Zahara de los Atunes (Cala de los Alemanes) if you want to sunbathe peacefully in the nude.
03 | #As a tourist I recommend exploring Almería. There are amazing beaches without many people, not only at Cabo de Gata but also in the El Toyo area, a kind of Olympic village.
04 | @As a visitor I recommend visiting Almuñécar, a paradise for enjoying untouched nature. It has crystal-clear waters and marine life.
05 | #Spectacular place. Crystal-clear waters. There are two beach bars. I really recommend Bola Marina. The service is wonderful, very polite and friendly waiters.
06 | @My experience great option for a beach day between Nerja and Almuñécar.
07 | #As a tourist the nudist beach where you can wear clothes, a very peaceful atmosphere and an ideal place for snorkelling (it’s in the Maro-Cerro Gordo nature area).
08 | @As a traveller I recommend you visit Playa Virgen in La Herradura. Beautiful beach with crystal-clear waters. Careful, there are jellyfish. It’s got pebbles like all the beaches in the area. There are sun loungers.
09 | #As a traveller No doubt about it, La Malagueta, one of the best beaches on Málaga’s tropical coast. It’s classified as a nudist beach by the Andalusian government.
10 | @My experience. The best known beach in Málaga city, La Malagueta. It’s a very clean beach, both the sand and the water.
Source: elaborated by authors.
Table 6. Tourists’ opinions via Twitter (2020).
01 | #As a tourist I recommend visiting a stunning landscape from the top of the Alpujarra. It was an incredible experience. I loved it and I’m sure you won’t be disappointed if you visit.
02 | @As a visitor I recommend visiting the Sierra de Aracena and Picos de Aroche natural park. I’d never visited at this time of year. We stayed at ‘Finca la Media Legua’ and it was an unforgettable experience for all the family. Places to visit, charming villages, local gastronomy, it was all great.
03 | @It was all so beautiful, the biggest cathedral in the world after the Vatican. It’s one of the most visited monuments in Seville. It’s the biggest religious building in Seville.
04 | #The Alhambra is... I don’t know, I can’t describe it. It’s magical! Every spot, every corner, the garden, even the souvenir shop... Incredible! Sadly I can’t say which bit I liked the most because I loved it all, the Nasrid Palaces are breathtaking.
05 | @Next to the Roman theatre is the entrance to the Alcazaba, which is in a very good state of repair. You can walk around it and enjoy stunning views over this wonderful fishing city.
06 | #I love this market, Atarazanas market in Málaga. You can buy everything, from land and sea. Amazing vegetables! The fish, oh my God, it’s so fresh and the fruit is wonderful. You can eat at the bars inside too, where everything’s super fresh and delicious.
07 | @Málaga Automobile and Fashion Museum. Highly recommended for a fun outing either with or without children. Good collection, well-spaced exhibits and no crowds.
08 | #I loved my trip. The weekend in Córdoba was wonderful, we were really lucky to have chosen a guided tour of the Mezquita with Konexion Tour and it was brilliant.
09 | @My experience. Last week I did a tour from Seville to Córdoba and Carmona. Highly recommended.
10 | #Gibralfaro Castle. The castle has a wall rising up over the city, it’s worth climbing it for the stunning views of the city and the surrounding area.
Source: authors' own elaboration.
Table 7. Matrix of words appearing at least 10 times in 2019.
[1] “Beach” “August” “Night” “Comment” “All”
[6] “Sand” “hi” “message” “felt” “Sunday”
[11] “available” “discover” “colours” “summer” “Almería”
[16] “buy” “Huelva” “thanks” “account” “days”
[21] “order” “beach” “Cádiz” “new” “tones”
[26] “Alicante” “client” “new” “direct” “summer”
[31] “shop” “pool” “Almuñécar” “water” “people”
[36] “any” “Málaga” “art” “summer” “photos”
Source: authors' own elaboration.
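Table 7 reports words that appear at least 10 times. The paper does not state how the counts were computed, so the following Python sketch is only one plausible way to apply such a threshold. The function name `frequent_terms`, the lowercasing and the simple `\w+` tokenisation are all assumptions made for illustration.

```python
import re
from collections import Counter


def frequent_terms(texts, min_count=10):
    """Return the sorted terms occurring at least `min_count` times overall.

    Assumptions: text is lowercased and split on runs of word characters.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return sorted(term for term, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    # Toy input, not data from the study.
    sample = ["Beach day"] * 10 + ["summer"] * 9
    print(frequent_terms(sample))
```

In this toy input, "summer" occurs only nine times and is therefore excluded by the threshold of 10.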
Table 8. Matrix of words appearing at least 10 times in 2020.
[1] “culture” “Málaga” “night” “comment” “all”
[6] “Seville” “Mezquita” “message” “felt” “Sunday”
[11] “available” “discover” “colours” “summer” “Córdoba”
[16] “buy” “Cathedral” “thanks” “account” “Alhambra”
[21] “order” “culture” “Granada” “new” “tones”
[26] “city” “client” “new” “direct” “heritage”
[31] “shop” “second” “autumn” “water” “people”
[36] “any” “museums” “art” “summer” “photos”
Source: authors' own elaboration.
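The change in vocabulary between Tables 7 and 8 can be made explicit with a set comparison. The Python sketch below uses the word lists exactly as printed in the two tables, lowercased so that "Beach" and "beach" count as one term. The variable names and the lowercasing are assumptions for illustration, not part of the authors' analysis.

```python
# Words as printed in Table 7 (2019) and Table 8 (2020).
WORDS_2019 = """Beach August Night Comment All Sand hi message felt Sunday
available discover colours summer Almería buy Huelva thanks account days
order beach Cádiz new tones Alicante client new direct summer
shop pool Almuñécar water people any Málaga art summer photos""".split()

WORDS_2020 = """culture Málaga night comment all Seville Mezquita message felt
Sunday available discover colours summer Córdoba buy Cathedral thanks account
Alhambra order culture Granada new tones city client new direct heritage
shop second autumn water people any museums art summer photos""".split()

# Assumption: case is ignored when comparing terms.
terms_2019 = {w.lower() for w in WORDS_2019}
terms_2020 = {w.lower() for w in WORDS_2020}

only_2019 = terms_2019 - terms_2020   # terms that drop out in 2020
only_2020 = terms_2020 - terms_2019   # terms that are new in 2020
shared = terms_2019 & terms_2020      # terms present in both years

if __name__ == "__main__":
    print("Only 2019:", sorted(only_2019))
    print("Only 2020:", sorted(only_2020))
```

Read this way, the terms unique to 2019 are mostly beach and coastal ones (for example "beach", "sand", "pool"), whereas those unique to 2020 are mostly cultural and urban ones (for example "culture", "heritage", "museums", "Alhambra").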
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Flores-Ruiz, D.; Elizondo-Salto, A.; Barroso-González, M.d.l.O. Using Social Media in Tourist Sentiment Analysis: A Case Study of Andalusia during the Covid-19 Pandemic. Sustainability 2021, 13, 3836. https://doi.org/10.3390/su13073836

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
