1.1. Background
Urban parks play a vital role in cities and their significance has continuously evolved in the lives of city dwellers. The benefits of urban parks include environmental benefits such as biodiversity and local cooling, economic benefits such as energy savings and property value, and social and psychological benefits such as physical activity and reduced obesity [
1,
2]. One important topic in park-related research is the events and programs held in parks. Many studies have shown how park events can become a deciding force in shifting a park's functionality [
3,
4,
5,
6]. In a report investigating London’s urban parks, Smith and Vodicka [
3] reported, based on accounts from Friends groups, that events are seen as promoting a park's inclusivity by bringing more people into the park, which contributes to community cohesion. A similar study by Neal et al. [
6] on parks also credits urban park events as an opportunity for inclusivity, as organized events draw a more ethnically diverse population than regular park use does. Citroni and Karrholm [
7] discussed the relationship between events and civility, and how events facilitate the visibility of everyday life and forge a pattern of urban civility. Studying events in urban parks provides insight into how these parks can actively contribute to a city and its community, and helps us move toward a more sustainable city with a high quality of life.
There is a significant gap between existing work and efficient analysis of park events. First, most past studies of park events have focused on the intensity of park use, the demographics of park users, the periods during which parks are used, and the level of physical activity. Few studies, however, have focused on the categorization of park events and programs. Secondly, in terms of data sources, the majority of current studies analyzing the categories of human activities and planned events in parks have relied on mass questionnaires and interviews [
8,
9,
10,
11], which are time-consuming and site-restricted. Recent technological methods introduce big data into detailed park use analysis, such as GPS data and public participation geographic information systems data. However, GPS-based mobile phone tracking is not informative for the categorization of events and recreational park use [
12], and public participation geographic information systems (PPGIS) cannot guarantee data sufficiency [
13]. Social media data and other publicly available online imagery are a good source of information on the recreational use of parks. Thirdly, in terms of methodology, the methods of existing studies are either inefficient or not specifically targeted at park events. Recent studies that use publicly available online imagery still involve tedious manual classification [
12]. This calls for an updated, more accessible, and more cost-effective methodology for analyzing urban park event categories.
Using the New York City Parks Events Listing [
14], a set of publicly available, tagged image data, this study proposes an algorithm based on deep learning that identifies events and programming in urban parks more efficiently by analyzing publicly available images of these parks and classifying them by park event. The aim is to help urban researchers and planners better understand the impacts of park events on the community and incorporate them into the decision-making process.
1.2. Related Works
Although a significant number of studies have been conducted to determine the use of urban parks, the majority of these studies focused quantitatively on the frequency or intensity of use [
15,
16,
17,
18,
19,
20]. Some emerging studies deploy crowdsourced surveys to collect public opinions (emotions and perceptions) on urban parks and public spaces [
21,
22,
23]. Some studies also investigated the demographics of park users [
24,
25,
26], and the periods of time parks are used [
24].
Regarding park activities, although a considerable number of studies have investigated the level of physical activity in parks [
27,
25], they identified only simple activity levels such as sedentary, walking, or vigorous. Some studies went beyond this simple categorization and covered a wider range of park activities [
28,
29]. However, there remains room for a more fine-grained categorization of activities, and for studying activities driven by organized events as opposed to day-to-day activities such as walking or jogging.
Lastly, it is also worth noting that many past studies on the use of urban parks focused on quantitatively examining the relationship between certain variables and the intensity of use. The independent variables examined include park proximity [
15,
16], park facilities [
15], park quality [
30], entrance fees [
17], and social demographic characteristics of the neighborhood [
15,
17].
In terms of data sources and methodology, traditional studies rely heavily on questionnaires and personal interviews. For instance, Schipperijn et al. [
8] conducted 14,566 face-to-face interviews with randomly sampled Danish individuals and asked them to complete follow-up questionnaires. Peschardt et al. [
31] distributed 686 on-site questionnaires at nine small public urban green spaces to determine how these spaces were used by citizens. Nielsen and Hansen [
16] mailed questionnaires to a sample of 2000 adult Danes. Other studies were conducted through direct observations in the parks. For example, many studies, such as the ones by Marquet et al. [
20] and Veitch et al. [
32], employed the System for Observing Play and Recreation in Communities (SOPARC) [
33] to directly observe residents’ activities in parks. Similarly, Floyd et al. [
25] measured physical activities in parks using a modified version of the System for Observing Play and Leisure Activity in Youth (SOPLAY). Brown et al. [
28] used participatory GIS to investigate physical activities in urban parks. Overall, applying traditional methods to understand park usage and park events is highly time-consuming and restricted to smaller areas because of site-specificity [
18].
Recent studies have been incorporating technologies to better understand the use of parks, both through utilizing novel online data sources and more efficient categorization. Commonly used novel data sources include social media data, geo-tracking data from mobile phones, and PPGIS data. For instance, Li et al. [
18] retrieved geo-tagged social media check-in records for park visits to examine the frequency of visits. A bivariate correlation analysis was conducted to support the association between the Weibo check-in data and official visitor statistics, although the strength of the correlation varies from city to city. Larson et al. [
19] used geo-tracking data from cell phones to document changes in park visits during the COVID-19 pandemic. Heikinheimo et al. [
12] compared four types of data (social media, sports tracking, mobile phone operator and PPGIS data) in a case study of Helsinki, Finland, and examined the ability of these user-generated datasets to provide information on the use of urban parks.
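The bivariate correlation check described above for check-in data against official visitor statistics can be illustrated with a minimal sketch. It assumes Pearson's r as the correlation measure, which the text does not specify, and the numbers are hypothetical toy values rather than data from any cited study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-park counts (illustrative only, not real data).
checkins = [120, 340, 560, 210, 480]              # social media check-ins
official_visits = [1500, 3900, 6100, 2600, 5200]  # official visitor counts

r = pearson_r(checkins, official_visits)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would suggest that check-in counts track official visitor counts well for that city; repeating the calculation per city is one way to see the city-to-city variation mentioned above.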
In comparison, social media data is highly informative about the leisure-time activities conducted in urban parks [
12], but is limited by biases in age groups and the choice to share content publicly [
34]; mobile phone data highlights movements [
12], but only best represents populations in countries where mobile phones are widely used [
35]; PPGIS allows the researcher to ask in-depth questions on park use and preferences [
12], but the response rate and its fairness are not guaranteed [
13].
Regarding categorization methods, the content analysis of social media data in Heikinheimo et al.'s study was done through manual classification of 15,312 Instagram photos and 1,843 Flickr photos. This is again time-consuming and inefficient, and calls for a more automatic method of analyzing social media content on park activities. To compare the best-known commercial image recognition service providers on this task, Ghermandi et al. [
29] performed a test using Google Cloud Vision [
36], Clarifai [
37], and Microsoft Azure Computer Vision [
38] to identify human-nature interactions (outdoor recreational activities, biophysical environments, and feelings) in parks. All of these models surpass traditional methods in the efficiency of categorization. However, owing to the generic nature of the image recognition services, the tags identified with regard to recreational activities are relatively limited and lack sufficient specificity for park-related, event-driven activities. For example, all three services identified people posing for a photograph as the most frequent activity captured in social media imagery. Another precedent for this study is Matasov et al.'s study on COVID-19's impact on the recreational use of Moscow parks, which applied the YOLOv5x neural network to conduct object detection on geo-tagged social media photos.
In summary, the existing research leaves three gaps:
- (1)
Current studies focus more on the intensity of park usage and level of physical activities (sedentary, walking, vigorous), leaving a gap for more fine-grained studies in the categorization of park events;
- (2)
In terms of methodology, traditional studies rely heavily on questionnaires and personal interviews, which are time-consuming and site-restricted;
- (3)
In recent studies that incorporate technologies, the categorization methods are either inefficient or not specific to park events.
To fill these research gaps, this study contributes to the literature in the following ways:
- (1)
By focusing the analysis on the categorization of park events;
- (2)
By incorporating the use of publicly available imagery to increase the efficiency of analysis;
- (3)
By proposing transfer learning on pre-trained Convolutional Neural Networks (CNNs) to calibrate the model towards the park event identification task, achieving an accuracy of 0.876 and a mean average precision of 0.620.