Version 1: Received: 17 June 2024 / Approved: 18 June 2024 / Online: 19 June 2024 (02:36:15 CEST)
How to cite:
Muktar, B.; Fono, V.; Zongo, M. Predicting Crime Categories in Montreal: A Comparative Analysis of Machine Learning Algorithms. Preprints 2024, 2024061251. https://doi.org/10.20944/preprints202406.1251.v1
APA Style
Muktar, B., Fono, V., & Zongo, M. (2024). Predicting Crime Categories in Montreal: A Comparative Analysis of Machine Learning Algorithms. Preprints. https://doi.org/10.20944/preprints202406.1251.v1
Chicago/Turabian Style
Muktar, B., Vincent Fono, and Meyo Zongo. 2024. "Predicting Crime Categories in Montreal: A Comparative Analysis of Machine Learning Algorithms." Preprints. https://doi.org/10.20944/preprints202406.1251.v1
Abstract
Every year, the Montreal police respond to a large number of crimes, which affect residents' quality of life and impose a socio-economic burden on the city. In this study, we conduct a comparative analysis of several machine learning algorithms to develop a model that predicts the crime category in Montreal. We analyzed the performance of eXtreme Gradient Boosting (XGBoost), Decision Trees (DT), and Random Forest (RF) using precision, accuracy, recall, and F1-score. The analysis was based on Montreal crime data from 2015 to 2023, which is strongly imbalanced across crime categories. To address this imbalance, we adopted a data balancing approach based on the SMOTE-ENN algorithm. In the exploratory data analysis phase, temporal trends by crime category were highlighted. The results showed that XGBoost outperformed the other two algorithms: it achieved an accuracy of 92%, while DT and RF achieved 86% and 84%, respectively. As a result, XGBoost was deployed via a web application built with Flask and Swagger UI. This study provides the Montreal police with an effective tool to better utilize their resources in fighting crime. In addition, policymakers in the city of Montreal can use this tool to identify high-risk areas and give them more attention.
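The four metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch for a multiclass problem. It is not the authors' evaluation code: the function name `classification_metrics` and the crime labels in the usage example are hypothetical, chosen only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, per_class), where per_class maps each label to
    its precision, recall, and F1-score.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # A metric with a zero denominator is reported as 0.0 by convention.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, per_class


if __name__ == "__main__":
    # Hypothetical labels, deliberately imbalanced.
    y_true = ["theft", "theft", "theft", "mischief", "mischief", "burglary"]
    y_pred = ["theft", "theft", "mischief", "mischief", "mischief", "theft"]
    accuracy, per_class = classification_metrics(y_true, y_pred)
    print(f"accuracy = {accuracy:.3f}")
    for label, scores in per_class.items():
        print(label, {name: round(value, 3) for name, value in scores.items()})
```

In this toy example the rare class is never predicted, so its recall is zero even though overall accuracy looks moderate. This is why per-class precision, recall, and F1-score are usually reported alongside accuracy on imbalanced data.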
Keywords
crime prediction; machine learning; XGBoost; Decision Trees; Random Forest; data imbalance; Montreal
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.