Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Predicting Crime Categories in Montreal: A Comparative Analysis of Machine Learning Algorithms

Version 1 : Received: 17 June 2024 / Approved: 18 June 2024 / Online: 19 June 2024 (02:36:15 CEST)

How to cite: Muktar, B.; Fono, V.; Zongo, M. Predicting Crime Categories in Montreal: A Comparative Analysis of Machine Learning Algorithms. Preprints 2024, 2024061251. https://doi.org/10.20944/preprints202406.1251.v1

Abstract

Every year, the Montreal police respond to a large number of crimes. These crimes affect residents' quality of life and impose a socio-economic burden on the city. In this study, we compare several machine learning algorithms to develop a model that predicts the crime category in Montreal. We analyze the performance of eXtreme Gradient Boosting (XGBoost), Decision Trees (DT), and Random Forest (RF) using precision, accuracy, recall, and F1-score. The analysis is based on Montreal crime data from 2015 to 2023, which are strongly imbalanced across crime categories. To address this imbalance, we adopt a data balancing approach based on the SMOTE-ENN algorithm. In the exploratory data analysis phase, we highlight temporal trends by crime category. The results show that XGBoost outperforms the other two algorithms: it achieves an accuracy of 92%, while DT and RF achieve 86% and 84%, respectively. XGBoost was therefore deployed as a web application built with Flask and Swagger UI. This study provides the Montreal police with an effective tool to make better use of their resources in fighting crime. Policymakers in the city of Montreal can also use this tool to identify high-risk areas and give them more attention.
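
As a reading aid for the four metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a multi-class problem. It is not the authors' evaluation code: the function name `classification_metrics`, the macro-averaging choice, and the English category labels in the usage example are assumptions made purely for illustration, and the abstract does not state which averaging scheme the study used.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration; macro averaging is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    per_class = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(v[0] for v in per_class.values()) / n,
        "macro_recall": sum(v[1] for v in per_class.values()) / n,
        "macro_f1": sum(v[2] for v in per_class.values()) / n,
        "per_class": per_class,
    }


if __name__ == "__main__":
    # Made-up labels, not taken from the Montreal dataset.
    y_true = ["theft", "theft", "theft", "theft", "burglary", "robbery"]
    y_pred = ["theft", "theft", "theft", "burglary", "burglary", "theft"]
    for name, value in classification_metrics(y_true, y_pred).items():
        print(name, value)
```

On imbalanced data such as that described above, accuracy alone can look high even when a rare category is never predicted correctly, which is one reason to report precision, recall, and F1-score alongside it.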

Keywords

crime prediction; machine learning; XGBoost; Decision Trees; Random Forest; data imbalance; Montreal

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
