Submitted: 06 August 2023
Posted: 08 August 2023
Abstract
Keywords:
1. Introduction
2. Tourism in Irkutsk Oblast
3. The concept of a tourism monitoring service
- Real-time information on socio-ecological-economic indicators of regional tourism (a wider set of indicators than in official statistics can be collected in real time, unlike data collection through administrative management):
  - Development of the tourism profile of the territory (region, district, locality):
    - types of tourism (recreational, event, ecological, ethnographic, active, business, rural, children’s, water and cruise, social, amateur, health and wellness, extreme, pilgrimage, cultural-cognitive, gastronomic, etc.),
    - brands of the territory, attractions (museums, religious sites, objects of natural and wildlife reserves, hunting and fishing sites, rural, industrial, business, military-patriotic tourism objects, ski resorts),
    - sites for nature observation,
    - tourist events,
    - tourist routes,
    - ecological tourist paths,
    - excursions,
    - tourist information centers,
    - accommodation,
    - transport services, transport accessibility,
    - dining facilities,
    - climate.
  - Operational monitoring (including informal tourism):
    - types of tourist services (products),
    - tourist flows, their direction and density,
    - tourist routes,
    - reviews, to identify their sentiment and issues in the tourism industry (environmental, transportation, quality of services, safety),
    - recreational (anthropogenic) load on the territory, determining the environmental risk,
    - population involved in tourism (number of jobs, education),
    - rating of the region (district, locality),
    - impact of extreme events (pandemics; forest fires, including the presence of smoke and debris).
  - Monitoring:
    - identification of informal recreational areas based on satellite imagery (tent placement),
    - identification of tourist accommodation locations not registered in the official registry based on satellite imagery,
    - analysis of the popularity of landmarks.
  - Zoning:
    - specialization of tourism types,
    - density of tourist flows,
    - recreational load.
- Decision support for small and medium-sized businesses:
  - determination of business territories,
  - selection, justification, and development of tourism products for business.
- Decision support for tourists:
  - selection of tourist products.
4. Information sources
5. Ontology
6. Methods for data collection and processing
6.1. Data collection methods
- Online surveys;
- Queries to databases;
- API (Application Programming Interface), an interface for exchanging data between applications;
- Web scraping, performed in the following steps:
  - Extracting the markup of the web source page. This is done by sending HTTP requests to the resource and saving the retrieved data in a variable or in HTML files of the webpage;
  - Extracting information from the HTML code structure. This is based on searching for specified paths to the markup elements: tag names, attributes, and their values;
  - Saving the data in a structured format for further processing;
  - Optionally, repeating the actions.

Within the scope of the present research, data collection amounts to gathering links to the required objects and then following each link to extract detailed information and record it in a table of objects. The implementation has two notable features: the algorithm is specific to each data source, and it has to be revised over time, since the website markup may be changed by its developers.
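The scraping steps described above (request the markup, locate elements by tag name and attribute, save rows in a structured form, then repeat for each collected link) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the study. The CSS class names `object-link`, `title`, and `price`, the column names, and the helper names are all hypothetical; a real source would need its own selectors, and the simple parser here does not handle nested elements of the same tag.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Step 1: request the page over HTTP and return its markup as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


class ElementCollector(HTMLParser):
    """Step 2: collect (attributes, text) of elements with a given tag and class."""

    def __init__(self, tag: str, css_class: str):
        super().__init__()
        self.tag, self.css_class = tag, css_class
        self.matches = []   # list of (attrs dict, inner text)
        self._open = None   # attributes of the element currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._open, self._text = attrs, []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self._open is not None:
            self.matches.append((self._open, "".join(self._text).strip()))
            self._open = None


def select(html: str, tag: str, css_class: str):
    parser = ElementCollector(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.matches


def scrape(list_url: str, fetch_page=fetch):
    """Gather links to objects, then visit each link and build one row per object."""
    rows = []
    for attrs, _ in select(fetch_page(list_url), "a", "object-link"):
        url = urljoin(list_url, attrs["href"])   # resolve relative links
        detail = fetch_page(url)
        name = select(detail, "h1", "title")     # hypothetical selectors
        price = select(detail, "span", "price")
        rows.append({
            "url": url,
            "name": name[0][1] if name else "",
            "price": price[0][1] if price else "",
        })
    return rows


def to_csv(rows) -> str:
    """Step 3: save the rows in a structured (CSV) format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["url", "name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the page loader as the `fetch_page` argument keeps the source-specific selectors in one place and lets the extraction logic be checked against saved HTML files, which helps when the markup of a website changes.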
6.2. Methods for data cleaning and integration
6.3. Methods for text analysis (reviews)
- Data preparation for analysis: excluding foreign-language texts or translating them, for example using the Translator module [26];
- Text tokenization: splitting the text into individual words and discarding all other elements (punctuation marks, emojis, and other symbols). This is performed using regular expressions and specialized methods;
- Text lemmatization (reducing words to their base form) or stemming (extracting the word stem). Lemmatization is carried out using the Pymorphy2 library, which also performs morphological analysis (part-of-speech tagging); this allows prepositions to be excluded from further analysis;
- Removing stop words (commonly used words that do not carry significant information);
- Creating a dictionary specific to the task at hand, which allows for semantic analysis of the text to identify entities, actions, descriptions, and their connections to locations. Together these form a knowledge base about flora and fauna, infrastructure objects, and other territorial characteristics;
- Sentiment analysis of the text: classifying texts into negative and positive classes, which additionally helps to identify issues and preferences related to locations. This is done by converting tokens into numerical vectors (embeddings), followed by classification with a neural network.
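The preparation steps listed above (tokenization, lemmatization, stop-word removal, and matching against a task-specific dictionary) can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions rather than the code used in the study: the stop-word list, the dictionary entries, and the function names are invented for the example, and the lemmatizer is passed in as a plain function. With Pymorphy2 installed, a function returning `MorphAnalyzer().parse(word)[0].normal_form` could be supplied in its place. Translation and sentiment classification are not covered here.

```python
import re
from collections import Counter
from typing import Callable, Dict, List

# Runs of letters only: punctuation, digits, emojis and other symbols are dropped.
TOKEN_RE = re.compile(r"[^\W\d_]+")

# Hypothetical stop-word list; a real one would be far longer.
STOP_WORDS = {"the", "a", "and", "was", "were", "is", "in", "on", "of", "to", "but"}

# Hypothetical task-specific dictionary mapping lemmas to knowledge-base categories.
DOMAIN_DICTIONARY: Dict[str, str] = {
    "seal": "fauna",
    "cedar": "flora",
    "hotel": "infrastructure",
    "road": "infrastructure",
}


def tokenize(text: str) -> List[str]:
    """Split the text into lowercase word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def preprocess(text: str, lemmatize: Callable[[str], str] = lambda w: w) -> List[str]:
    """Tokenize, lemmatize, and remove stop words."""
    lemmas = (lemmatize(token) for token in tokenize(text))
    return [lemma for lemma in lemmas if lemma not in STOP_WORDS]


def tag_entities(lemmas: List[str]) -> Counter:
    """Count how many lemmas fall into each dictionary category."""
    return Counter(DOMAIN_DICTIONARY[l] for l in lemmas if l in DOMAIN_DICTIONARY)


if __name__ == "__main__":
    # Toy lemmatizer standing in for a morphological analyzer.
    toy_lemmas = {"seals": "seal", "roads": "road"}
    lemmas = preprocess("The seals were amazing, but the road was BAD!!!",
                        lemmatize=lambda w: toy_lemmas.get(w, w))
    print(lemmas)
    print(tag_entities(lemmas))
```

Keeping the lemmatizer as a parameter means the same pipeline can be reused for reviews in different languages by swapping the analyzer and the stop-word list.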
6.4. Data visualization methods
7. Implementation of the monitoring service
8. The results of data collection and analysis
- for accommodation facilities and their services, which can be filtered by rating (from poor to excellent), settlement/district, and service category (Figure 9):
  - displaying the number of accommodation facilities and the average cost by settlement/district in the form of a combined chart,
  - displaying accommodation facilities on a proportional symbol map scaled by room stock (number of rooms),
  - displaying the number of accommodation facilities by category/subcategory of services in the form of a bar chart and a heat map;
- for catering establishments, which can be filtered by district and cuisine type (Figure 10):
  - displaying catering establishments on a proportional symbol map based on the average check and cuisine type,
  - displaying aggregated quantitative indicators by settlement/district to describe areas according to a specified measure (expensive-cheap, number of catering establishments) in a combined chart format,
  - displaying a heat map based on the number of catering establishments by district and cuisine type;
- for landmarks and popular recreational areas:
  - displaying landmarks on a map with the option to obtain descriptive information about them,
  - displaying the density of landmark points on a map,
  - displaying popular recreational areas on a map;
- for tourist routes and excursions, which can be filtered by route name:
  - displaying settlements on a map where tourist routes or excursions are organized,
  - displaying route and excursion diagrams on a map.
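As an illustration of the aggregation behind the combined chart for accommodation facilities (number of facilities and average cost by settlement, with a rating filter), a minimal sketch is given below. The record fields `settlement`, `price`, and `rating`, the function name, and the sample values are assumptions made for the example; they do not reproduce the data or schema of the service.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_settlement(facilities, min_rating=0.0):
    """Return the facility count and average price per settlement.

    Only facilities whose rating is at least `min_rating` are included,
    mirroring the rating filter of the dashboard.
    """
    prices = defaultdict(list)
    for facility in facilities:
        if facility["rating"] >= min_rating:
            prices[facility["settlement"]].append(facility["price"])
    return {
        settlement: {"count": len(values), "avg_price": mean(values)}
        for settlement, values in sorted(prices.items())
    }


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"settlement": "Listvyanka", "price": 3000, "rating": 4.5},
        {"settlement": "Listvyanka", "price": 5000, "rating": 3.0},
        {"settlement": "Irkutsk", "price": 4000, "rating": 4.8},
    ]
    for name, stats in summarize_by_settlement(sample, min_rating=4.0).items():
        print(name, stats["count"], stats["avg_price"])
```

The resulting dictionary has one entry per settlement, so the counts can be drawn as bars and the average prices as a line on the same axis.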
9. Discussion and conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- A United Nations Specialized Agency of World Tourism Organization. Available online: https://www.unwto.org/sustainable-development (accessed on 22 March 2023).
- International Network of Sustainable Tourism Observatories of World Tourism Organization. Available online: https://www.unwto.org/insto/observatories/ (accessed on 22 March 2023).
- SHAPETOURISM OBSERVATORY. Available online: https://www.quantitas.it/data/shapetourism/build/index.php (accessed on 12 January 2023).
- Destination NSW. Available online: https://www.destinationnsw.com.au/about-us (accessed on 15 January 2023).
- Lobao, F.; Aparicio, M.; Neto, M. SMART TOURISM - CITY TOURISM RADAR: A Tourism Monitoring Tool at the City of Lisbon. In Proceedings of the 19th Conference of the Portuguese Association of Information Systems, CAPSI ’2019, Lisbon, Portugal, 2019; pp. 1–21.
- Li, H.; Hu, M.; Li, G. Forecasting tourism demand with multisource big data. Ann. Tour. Res. 2020, 83, 102912.
- Soualah-Alila, F.; Coustaty, M.; Rempulski, N.; Doucet, A. DataTourism: Designing an Architecture to Process Tourism Data. In Information and Communication Technologies in Tourism; Inversini, A., Schegg, R., Eds.; Springer: Cham, Switzerland, 2016.
- Rubtsova, N.V. Formation of the System for Monitoring the Efficiency of Regional Tourist-Recreational Services. World Econ. Manag. 2019, 19, 101–110.
- Li, J.; Xu, L.; Tang, L.; Wang, S.; Li, L. Big data in tourism research: A literature review. Tour. Manag. 2018, 68, 301–323.
- Wikipedia. Irkutsk Oblast. Available online: https://en.wikipedia.org/wiki/Irkutsk_Oblast (accessed on 22 March 2023).
- Wikipedia. Lake Baikal. Available online: https://en.wikipedia.org/wiki/Lake_Baikal (accessed on 22 March 2023).
- The Irkutsk region took 15th place in the National Tourist Rating. Available online: http://www.irk.ru/news/20220113/rating/ (accessed on 15 May 2023).
- Tourism. Federal State Statistics Service. Available online: https://rosstat.gov.ru/statistics/turizm (accessed on 10 May 2023).
- AWS. Available online: https://aws.amazon.com (accessed on 10 December 2022).
- Google Cloud. Available online: https://cloud.google.com (accessed on 10 December 2022).
- Enterprise Cloud from VK. Available online: https://mcs.mail.ru/cloud-platform (accessed on 10 December 2022).
- Yandex Cloud. Available online: https://cloud.yandex.ru (accessed on 10 December 2022).
- Kotelnikov, D.A. Formation of a system of indicators for monitoring the sustainable development of tourist areas. In Competition of scientific innovations: prospects for the development of science in the modern world: Collection of articles based on the materials of the All-Russian research competition; Limited Liability Company “Scientific Publishing Center “Herald of Science”: Ufa, Russia, 2020; pp. 41–50.
- Lebedeva, Y.A. Organization of monitoring of the quality of tourist services at the municipal level: monograph; Publishing House “Sreda”: Cheboksary, Russia, 2020.
- Garrido, P.; Barrachina, J.; Martinez, F.J.; Seron, F.J. Smart Tourist Information Points by Combining Agents, Semantics and AI Techniques. Comput. Sci. Inf. Syst. 2017, 14, 1–23.
- Mendoza-Moreno, J.F.; Santamaria-Granados, L.; Fraga, V.A.; Ramirez-Gonzalez, G. OntoTouTra: Tourist Traceability Ontology Based on Big Data Analytics. Appl. Sci. 2021, 11, 11061.
- Pai, M.-Y.; Wang, D.-C.; Hsu, T.-H.; Lin, G.-Y.; Chen, C.-C. On Ontology-Based Tourist Knowledge Representation and Recommendation. Appl. Sci. 2019, 9, 5097.
- UNESCO Thesaurus. Available online: https://vocabularies.unesco.org/browser/thesaurus/ru/ (accessed on 20 May 2023).
- Mitchell, R. Web Scraping with Python: Collecting Data from the Modern Web, 2nd ed.; O’Reilly Media, 2018.
- Moskalenko, A.A.; Laponina, O.R.; Sukhomlin, V.A. System for managing access to web application resources based on user behavior analysis. Int. J. Open Inf. Technol. 2020, 8, 30–35.
- Github.com. Available online: https://github.com/UlionTse/translators (accessed on 01 September 2022).
- Stanza. Available online: https://stanfordnlp.github.io/stanza/ (accessed on 01 September 2022).
- Qi, P.; Zhang, Y.; Zhang, Y.; Bolton, J.; Manning, C.D. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. Available online: https://arxiv.org/pdf/2003.07082.pdf (accessed on 11 July 2023).
- Yandex Cloud. Cloud Functions comparison with other Yandex Cloud services. Available online: https://cloud.yandex.com/en/docs/functions/service-comparison (accessed on 11 July 2023).
- Yandex Cloud. Message queues. Available online: https://cloud.yandex.com/en/docs/message-queue/concepts/queue (accessed on 11 July 2023).
- Yandex Cloud. Resource relationships in API Gateway. Available online: https://cloud.yandex.com/en/docs/api-gateway/concepts/ (accessed on 11 July 2023).
- Pavlov, A.I.; Stolbov, A.B.; Lempert, A.A. Towards extensibility features of knowledge-based systems development platform. In 4th Scientific-Practical Workshop Information Technologies: Algorithms, Models, Systems; Bychkov, I.V., Karastoyanov, D., Eds.; CEUR Workshop Proceedings, 2021; Volume 2984, pp. 87–94.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).