1. Introduction
Information technology services have profoundly transformed the economy and the business world by creating new industries and fields of production and services that were unimaginable even 30 years ago. E-commerce, in particular, has changed the way companies operate, the way customers behave and what they need, and the way business transactions occur [1]. According to a recent report by Lipsman, worldwide e-commerce sales of retail goods will reach $6.5 trillion by 2023, reflecting an increase in online shopping trips. The market share of e-commerce in the total retail market is expected to grow from 10.4% to 22% (with sales of approximately $2.4 trillion) from 2017 to 2023. This growth has a large impact on transportation management and, from a logistics perspective, affects delivery time and cost for customers [2]. The main challenge remains the choice between online and offline shopping trips. For example, in Korea, large retail companies such as Shinsegae Group (i.e., E-mart, Shinsegae Department Store) and Lotte Group (i.e., Lotte Mart, Lotte Department Store), which dominate the retail industry, compete intensely in online shopping while also operating offline stores. Although Lotte Mart (offline store) and Lotte Mart Mall (online retail) belong to the same product sales category, they compete strongly in terms of product sales [3]. Online buyers often require fast delivery, so their orders should be processed as soon as they are registered. All the operations needed to deliver orders to customers (order processing and pickup, long-distance shipping and delivery) are compressed into a short time span. As a result, new orders usually enter the distribution and dispatching system after the distribution of earlier orders has already started, and they must be integrated into the existing delivery plan [1]. In online shopping, orders are placed from a wide range of customer locations and fulfilled through the logistics system. The most important issue is the delivery and travel time of the ordered products to the most distant customers, where possible delays or failure to arrive by the promised time sometimes cause customer dissatisfaction.
This issue makes the choice between online and offline shopping trips challenging. Broadly, the advantages of online shopping from the customers' perspective include saving time, reducing travel costs, using product discounts, buying at any hour of the day, avoiding queues, escaping crowds in stores, having easy access to the list of desired products with their specifications, and making informed purchases. On the other hand, online shopping has disadvantages such as delayed delivery, the inability to physically handle the product, delivery cost, and the long process of returning a product, which lead many buyers to prefer offline shopping at the nearest store. Each of these two types of shopping (online and offline) requires the evaluation of shopping trips. It is therefore necessary to predict the type of shopping trip, which is made either by the customer (offline trips) or by the companies that deliver products ordered over the internet (online trips), by measuring the factors that influence the customer's choice between online and offline shopping. This shows the importance of this research and the need for methods based on artificial intelligence and machine learning.
An important point that highlights the practical need for this research is that many retailers have both online and offline sales channels. Products offered through different channels show different demand patterns, which makes it challenging to predict the purchase amount in both cases. Because inventory replenishment cycle times for products typically range from one week to one month, the retailer needs weekly and monthly forecasts to prepare for both offline visits (sufficient inventory) and online orders (a ready transportation system). The travel decision-making process is of great importance in transportation planning, and its application based on effective information and detailed analysis can be used as a predictive indicator for future development. The importance of trip generation in transportation demand management has led to extensive studies for different travel purposes. Trips that arise from the daily activities of citizens are divided into different purposes such as work, shopping, education and recreation. According to studies conducted in Tehran, approximately 15% to 18% of trips are for shopping purposes [4], which makes this the most important type of trip after work trips. By identifying and prioritizing the factors that affect the creation of online and offline shopping trips, we can play a significant role in reducing transportation costs, pollutant emissions and urban traffic, and in improving user satisfaction and sustainable development. According to surveys conducted in 24 countries, including India, China, the USA, Germany and Japan, on average only 10% of trips are made for online shopping and 90% for offline shopping in one day [4].
Tehran is the most populous city and the capital of Iran, with a population of over 9 million people. According to the 2018 estimate of the United Nations, it is the 34th most populous city in the world and the most populous city in West Asia. Tehran metropolis is also the second most populated metropolis in the Middle East. Due to the specific style of modern and traditional life in Tehran, the types of shopping trips in this city are diverse [5].
Table 1 shows the number of online and offline shopping trips in one day in Tehran [4].
The type of shopping trip is an important factor that varies depending on whether the shopping is online or offline. For online shopping, the logistics system plays a crucial role in the product supply chain of Internet companies. The logistics system involves the transfer, movement, processing and access to logistics information for the integration of transportation, ordering and manufacturing processes, order changes, production scheduling, logistics plans and warehousing operations. For offline shopping, the type of travel and the transportation system used by customers have economic and environmental implications. Therefore, estimating the type of travel is the main topic of this paper. Due to the large volume of data related to shopping trips in both online and offline shopping, it is necessary to adopt new methods based on artificial intelligence technologies and machine learning algorithms. The main objective of this paper is to use a machine learning approach, specifically deep learning, to provide a travel prediction model for online and offline shopping in Tehran, after identifying the factors that influence the creation of trips.
This paper is structured as follows:
Section 2 provides a review of the literature and previous research on estimating the type of shopping trip.
Section 3 describes the research method, which consists of a deep learning approach and its steps, and introduces the dataset used in this study.
Section 4 presents the results obtained by applying the proposed method and compares them with other methods.
Section 5 concludes the paper.
2. Literature Review
In recent years, various studies have examined customer behavior to determine the type of shopping trip, especially in light of the growing demand for e-commerce and online shopping. At the same time, environmental concerns have increased due to car emissions, which partly result from trips for offline shopping.
Shao et al. [6] assessed the effects of physical and virtual accessibility on e-commerce based on the geographic location of buyers. They used a spatial autoregressive model (SAC) to examine how physical and virtual accessibility influence the spatial distribution of online shopping trips in 276 provincial-level cities in China. The results indicate that both physical access (measured by the relative number of shopping centers and the public transportation system) and virtual access (measured by the percentage of broadband subscribers and the relative number of delivery points) increase online shopping trips.
Dong et al. [7] applied a machine learning approach to estimate customer behavior for a large multipurpose online store between October and November 2019. They found that the pipeline and random forest algorithms had the highest performance, with 96% accuracy. They also showed that the indicators of busyness and product price comparison had the greatest impact on increasing the online shopping trip intention.
Xiong [8] examined consumer behavior in online shopping in the context of artificial intelligence and the digital economy. The study focused on the factors that affect the online shopping trip intention within a day. Based on data collected by questionnaire, it found that online shopping trips were prevalent among all age groups in China, with young people being the majority.
Xiahou and Harada [9] explored online and offline shopping behavior using machine learning techniques and longitudinal and multidimensional data variables. They proposed a churn user prediction model based on the combination of k-means customer segmentation and support vector machine (SVM). The results indicated that the online shopping trip intention was higher than the offline one. They also found that the SVM method had higher accuracy than the logistic regression method.
Lee et al. [10] applied and compared different machine learning algorithms to predict online shopping trip conversion using 374,749 online consumer behavior records from the Google product store. They found that the ensemble model of the incremental gradient method was the most suitable for predicting online shopping trip conversion, and that oversampling was the best way to reduce the bias caused by data imbalance.
Espinoza et al. [11] examined consumer behavior in online and offline shopping trips in the context of the coronavirus pandemic. They used primary data from a structured questionnaire and an online survey covering 200 heterogeneous types of products, and they investigated the factors that influenced people's purchase choices. Among various technological factors, the respondents' skill level in using the Internet had a significant effect on their preferred mode of shopping trip. Factors such as quick product information, wider product selection, better prices and discounts led customers to choose online shopping trips, while faster delivery time and the reliability and accuracy of product quality led consumers to choose offline shopping trips.
Chawla et al. [12] used artificial neural networks to predict offline shopping trip demand for an American retail company. They developed a comparative forecasting mechanism based on ANN and ANFIS techniques to handle the trip demand forecasting problem under fuzzy conditions. They evaluated the results and showed that the ANFIS method was more effective than the ANN structure in producing more reliable forecasts for their case study.
Shi et al. [13] proposed an approach to improve support service decision-making by predicting offline shopping trip interactions and intentions in real time using historical time series data. They analyzed real-time consumer behavior data of offline customers. They confirmed that context-aware interaction could greatly enhance consumers' shopping experience in the offline scenario. A summary of the literature review is shown in Table 2.
Based on the literature review, despite the importance of the problem and the research already done in this field, most studies have focused on customer behavior and demand estimation, and some aspects of the problem have not received enough attention. Moreover, the choice between online and offline shopping does not appear to have been treated as a multi-criteria decision-making problem; the two modes have only been investigated and analyzed separately. Therefore, the contributions of this paper can be stated as follows:
3. Methodology
The analysis in this research is based on machine learning methods. Feature selection is done using a supervised machine learning algorithm, namely a deep neural network. The general procedure of the proposed method is shown as a flowchart in Figure 1.
In the first step, we collected 500 questionnaires on online shopping trips by sending text messages and 500 questionnaires on offline shopping trips by placing them in shopping centers in areas 2 and 5 of Tehran metropolis, according to the data frequency and calculations based on Cochran's formula. The statistical population of this research consisted of 1,000 active e-commerce users living in areas 2 and 5 of Tehran who had successful orders in online and offline services in the last 20 days of 2021, and we used purposive sampling. In all these questionnaires, we used age, gender, marital status, car ownership, delivery cost, delivery time, product price, income, employment status and level of education as the factors affecting the shopping trip. Since the deep learning method requires numerical data, we converted the values obtained from the questionnaires into quantitative values using Table 3, so that they could be used as input to the deep network.
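For readers unfamiliar with Cochran's formula, the following is a minimal sketch of the sample-size calculation it refers to. The confidence level (z = 1.96), the assumed proportion (p = 0.5), the margin of error (e = 0.05) and the population size used in the usage note are illustrative assumptions; the values actually used in this study are not stated in the text.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Cochran's sample size, optionally with the finite-population correction.

    z, p and e are assumed defaults (95% confidence, maximum variability,
    5% margin of error); they are not taken from the paper.
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is None:
        return math.ceil(n0)
    # Finite-population correction for a population of known size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)
```

Under these assumed defaults, `cochran_sample_size()` gives 385 respondents for a very large population, and `cochran_sample_size(population=10000)` gives 370 for a hypothetical population of 10,000.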
After converting the qualitative data into quantitative data, we prepared the data sets for input to the deep neural network. Then, we performed data preprocessing and determined the deep network architecture. We prepared the data for the training and testing stages and presented the results. We sorted and labeled the data; the data used are summarized in Table 4.
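The conversion of questionnaire answers into numbers can be sketched as a simple lookup. The field names, answer categories and numeric codes below are hypothetical placeholders; the codes actually used are those defined in Table 3.

```python
# Hypothetical code book; the real codes are the ones given in Table 3.
FIELDS = ("gender", "marital_status", "car_ownership", "education")
CODEBOOK = {
    "gender": {"female": 0, "male": 1},
    "marital_status": {"single": 0, "married": 1},
    "car_ownership": {"no": 0, "yes": 1},
    "education": {"diploma": 1, "bachelor": 2, "master": 3, "phd": 4},
}


def encode_response(response, fields=FIELDS, codebook=CODEBOOK):
    """Map one questionnaire response (dict of strings) to a list of numbers."""
    encoded = []
    for field in fields:
        answer = response[field].strip().lower()
        if answer not in codebook[field]:
            raise ValueError(f"unknown answer {answer!r} for field {field!r}")
        encoded.append(codebook[field][answer])
    return encoded


example = {
    "gender": "Male",
    "marital_status": "single",
    "car_ownership": "No",
    "education": "Bachelor",
}
print(encode_response(example))  # [1, 0, 0, 2]
```

Raising an error on an unknown answer, rather than silently assigning a default code, keeps mistyped questionnaire entries out of the network input.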
4. Results
The first step is to present the descriptive statistics of the statistical population.
Table 5 shows this information.
Next, we discuss the estimation of the shopping-oriented trip mode using deep learning. We use a convolutional neural network (CNN) as the main algorithm in this research. From the fully connected layer, we obtain the feature vector (using the activations function in MATLAB) and use it as a deep feature. We use the stochastic gradient descent (SGD) algorithm to train the CNN.
Table 6 shows the parameters used for the SGD algorithm. We also set the number of epochs to 40 in the network training.
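To illustrate the kind of update that SGD performs, the sketch below trains a toy one-variable linear model with mini-batch SGD and momentum. It is not the CNN used in this study: the model, the learning rate, the momentum and the batch size are illustrative assumptions, and only the 40 epochs are taken from the text.

```python
import random


def sgd_train(samples, lr=0.05, momentum=0.9, epochs=40, batch_size=10, seed=0):
    """Fit y = w*x + b by mini-batch SGD with momentum; returns (w, b).

    lr, momentum and batch_size are placeholder values, not those of Table 6.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0      # model parameters
    vw, vb = 0.0, 0.0    # momentum (velocity) terms
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # new sample order in every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over the mini-batch.
            gw = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            vw = momentum * vw - lr * gw
            vb = momentum * vb - lr * gb
            w += vw
            b += vb
    return w, b


# Toy data generated from y = 3x + 1 (purely illustrative).
toy = [(i / 50, 3 * (i / 50) + 1) for i in range(-50, 50)]
w, b = sgd_train(toy)
```

With 100 samples and a batch size of 10, the 40 epochs correspond to 400 parameter updates, which shows how the number of iterations follows from the number of epochs, the training-set size and the batch size.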
The performance of the neural network depends on the training and testing data. In preprocessing, we used 70% of the data for training and 30% for testing. Because the system randomly selects the training and testing data in each run, the results may differ slightly between runs, but the difference is insignificant. The reported results are the best outcomes after 15 runs of the neural network and are directly related to the data selection and the system implementation. The CNN model reached an average RMSE of 0.98 after 600 iterations for estimating the shopping trip mode, and the sMAPE was 0.2409.
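The random 70/30 split and the two error measures can be sketched as follows. The function names are illustrative, and the sMAPE variant shown (mean of |a - p| divided by the average of |a| and |p|, expressed as a fraction) is one common definition; the exact variant used in this study is not stated.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=None):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def smape(actual, predicted):
    """Symmetric MAPE as a fraction; pairs with a zero denominator count as 0."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        total += abs(a - p) / denom if denom else 0.0
    return total / len(actual)


train, test = train_test_split(range(1000), seed=1)
print(len(train), len(test))  # 700 300
```

Passing a different `seed` (or none) on each run reproduces the situation described above, in which repeated runs see different training and testing subsets and therefore give slightly different errors.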
Figure 2 shows the correlation plot of the proposed CNN based on the distribution of training and testing data. As seen, the correlation of the results was R = 0.91934.
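Assuming that R denotes the Pearson correlation coefficient between the target values and the network outputs (the usual meaning in such regression plots, although the text does not define it), it can be computed as in this minimal sketch.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)
```

A value close to 1 means that the outputs increase almost linearly with the targets; it does not by itself indicate that the outputs are unbiased, which is why error measures such as RMSE are reported alongside it.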
In the next step, we compared the results obtained by the deep learning algorithm with those obtained by other models, such as K-nearest neighbor (KNN), decision tree (DT), and multi-layer perceptron (MLP) neural network.
Table 7 shows the result of this comparison.
The results in Table 7 show that the deep network algorithm performs better than the MLP neural network, decision tree and KNN algorithms for estimating online and offline shopping trips.
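As an illustration of how a baseline such as KNN is scored in this kind of comparison, the sketch below implements a plain k-nearest-neighbor classifier and an accuracy measure. The toy feature vectors, the labels and the choice k = 3 are hypothetical and are not taken from the study's data or settings.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label of query by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical two-feature toy data, not the questionnaire data.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["offline", "offline", "offline", "online", "online", "online"]
test_x = [(0.2, 0.1), (5.5, 5.5)]
test_y = ["offline", "online"]
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
print(accuracy(test_y, predictions))  # 1.0
```

The same `accuracy` function can score the predictions of any of the compared models, so that all methods are evaluated on the same test set with the same measure.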
5. Conclusion
In today's economic world, accurate and timely information helps owners, investors, creditors and other interested groups make financial decisions. With the development of technology, simple models for predicting customer behavior and shopping trip mode have become available to all industries and manufacturing companies. Simple and powerful tools for predicting shopping trips can help owners prevent bankruptcy and take the necessary measures to improve the company's condition based on whether customers purchase or not. Predicting customer behavior and shopping trip mode is one of the most important issues in industrial decision-making, given its effects and consequences at the micro and macro levels of society. Various tools and models exist, which differ in their method or predictor variables. For any type of shopping trip, whether online or offline, a logistics system must be adopted (a private car or public transport for offline shopping, and a logistics fleet that delivers the product for online shopping). The shopping trip mode is therefore very important for both types of shopping. The logistics system, which includes the transfer, movement, processing and access to logistics information for the integration of transportation, ordering and manufacturing processes, order changes, production scheduling, logistics plans and warehousing operations, is the most important part of the supply chain of companies. In offline shopping trips, the type of travel and the transportation systems used by customers involve environmental issues in addition to economic ones. For these reasons, estimating the type of travel was adopted as the main topic of this paper.
Due to the large amount of data on shopping-oriented trips in both online and offline shopping, it is necessary to adopt new methods based on artificial intelligence and computing technologies. In this research, we therefore used machine learning techniques, specifically deep learning, to evaluate the data. Considering the data frequency in areas 2 and 5 of Tehran metropolis and calculations based on Cochran's formula, we distributed 1,500 questionnaires to the people of these areas. We collected 1,000 questionnaires from 1,000 active e-commerce users living in areas 2 and 5 of Tehran who had successful orders in online and offline services in the last 20 days of 2021. The descriptive statistics of the respondents showed that the largest share of the statistical population were single men aged 18-35 years who did not own a car, held a bachelor's degree and had an income level of 10-15. Most of these people were full-time employees. Based on a review of the literature and consultation with experts, we used age, gender, marital status, car ownership, delivery cost, delivery time, product price, income, employment status and education level as indicators affecting the type of shopping trip. After determining the optimal architecture of the deep network, we evaluated the results and estimated the travel mode. To compare the proposed method with other methods, we used the MLP neural network, decision tree and KNN algorithms. The results showed that the deep model had the best performance, with an accuracy of 95.63%, followed by the MLP neural network with 90.12%, the decision tree with 86.49% and the KNN model with 80.16%.
Author Contributions
Conceptualization, MH.D., A.N., and T.A.; methodology, MH.D., A.N., and T.A.; software, MH.D.; validation, MH.D., A.N., and T.A.; formal analysis, MH.D.; investigation, MH.D.; resources, MH.D., A.N., and T.A.; data curation, MH.D.; writing—original draft preparation, MH.D.; writing—review and editing, MH.D.; visualization, MH.D.; supervision, A.N.; project administration, MH.D.; funding acquisition, A.N. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data available on request from the authors.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Archetti, C.; Bertazzi, L. Recent challenges in Routing and Inventory Routing: E-commerce and last-mile delivery. Networks, 2021, 77(2), 255-268.
- Global ecommerce 2019. Available online: https://www.emarketer.com/content/global-ecommerce-2019 (June 27, 2019).
- Stocchi, L.; Michaelidou, N.; Pourazad, N.; Micevski, M. The rules of engagement: How to motivate consumers to engage with branded mobile apps. Journal of Marketing Management, 2018, 34(13-14), 1196-1226.
- Periodic report of Urban Traffic and Transportation Organization 2020. https://www.ictte.ir/.
- World Urbanization Prospects. United Nations. New York. 2019. Archived (PDF) from the original on 11 February 2020. Retrieved 14 April 2020. https://population.un.org/wup/publications/Files/WUP2018-Highlights.pdf.
- Shao, R.; Derudder, B.; Witlox, F. The geography of e-shopping in China: On the role of physical and virtual accessibility. Journal of Retailing and Consumer Services, 2022, 64, 102753.
- Dong, Y.; Tang, J.; Zhang, Z. Integrated Machine Learning Approaches for E-commerce Customer Behavior Prediction. In 2022 7th International Conference on Financial Innovation and Economic Development (ICFIED 2022) (pp. 1008-1015). Atlantis Press.
- Xiong, Y. The Impact of Artificial Intelligence and Digital Economy Consumer Online Shopping Behavior on Market Changes. Discrete Dynamics in Nature and Society, 2022.
- Xiahou, X.; Harada, Y. B2C E-Commerce Customer Churn Prediction Based on K-Means and SVM. Journal of Theoretical and Applied Electronic Commerce Research, 2022, 17(2), 458-475.
- Lee, R. J.; Sener, I. N.; Mokhtarian, P. L.; Handy, S. L. Relationships between the online and in-store shopping frequency of Davis, California residents. Transportation Research Part A: Policy and Practice, 2017, 100, 40-52.
- Espinoza, M. C.; Ganatra, V.; Prasanth, K.; Sinha, R.; Montañez, C. E. O.; Sunil, K. M.; Kaakandikar, R. Consumer behavior analysis on online and offline shopping during pandemic situation. International Journal of Accounting & Finance in Asia Pacific (IJAFAP), 2021, 4(3), 75-87.
- Chawla, A.; Singh, A.; Lamba, A.; Gangwani, N.; Soni, U. Demand forecasting using artificial neural networks—a case study of American Retail Corporation. In Applications of artificial intelligence techniques in engineering, 2019, 79-89. Springer, Singapore.
- Shi, F.; Guegan, C. G. Adapted Decision Support Service Based on the Prediction of Offline Consumers' Real-Time Intention and Devices Interactions. 42nd Annual Computer Software and Applications Conference (COMPSAC), 2018, (Vol. 2, pp. 266-271). IEEE.
- Jiang, H.; He, M.; Xi, Y.; Zeng, J. Machine-Learning-Based User Position Prediction and Behavior Analysis for Location Services. Information, 2021, 12(5), 180.
- Zubaidi, S. L.; Al-Bugharbee, H.; Ortega-Martorell, S.; Gharghan, S. K.; Olier, I.; Hashim, K. S.; Kot, P. A novel methodology for prediction urban water demand by wavelet denoising and adaptive neuro-fuzzy inference system approach. Water, 2020, 12(6), 1628.
- Punia, S.; Nikolopoulos, K.; Singh, S. P.; Madaan, J. K.; Litsiou, K. Deep learning with long short-term memory networks and random forests for demand forecasting in multi-channel retail. International Journal of Production Research, 2020, 58(16), 4964-4979.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).