Version 1
Received: 8 February 2020 / Approved: 9 February 2020 / Online: 9 February 2020 (16:02:03 CET)
How to cite:
Yin, J.; Afa Michael, I.; Afa, I. J. Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data. Preprints 2020, 2020020108. https://doi.org/10.20944/preprints202002.0108.v1
APA Style
Yin, J., Afa Michael, I., & Afa, I. J. (2020). Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data. Preprints. https://doi.org/10.20944/preprints202002.0108.v1
Chicago/Turabian Style
Yin, J., Inikuro Afa Michael, and Iduabo John Afa. 2020. "Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data." Preprints. https://doi.org/10.20944/preprints202002.0108.v1
Abstract
Machine learning plays a key role in present-day crime detection, analysis, and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We first visualize and analyze crime statistics from recent years in the city of Boston. We then carry out a comparative study of two supervised learning algorithms, decision tree and random forest, evaluating the accuracy and processing time of models that make predictions from geographical and temporal information, with the data split into training and test sets. The results show that random forest, as expected, achieves 1.54% higher accuracy than decision tree, although at a cost of at least 4.37 times the processing time. The study opens the door to applying similar supervised methods in crime data analytics and other fields of data science.
Keywords
machine learning; decision tree; random forest; crime data analytics
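The comparison described in the abstract (train a decision tree and a random forest on temporal and geographical features, then compare test accuracy and training time) can be illustrated with a minimal from-scratch sketch. This is not the authors' code: the synthetic data, the feature layout `(hour, day, lat_cell, lon_cell)`, the labelling rule, and all function names are assumptions made for illustration, and the toy implementation uses only the Python standard library.

```python
import random
import time
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most common label in the list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, depth, rng=None, n_features=None):
    """Grow a tree. Internal nodes are (feature, threshold, left, right); leaves are labels."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    features = list(range(len(rows[0])))
    if n_features is not None:  # forest mode: random feature subset at each split
        features = rng.sample(features, n_features)
    best = None  # (weighted impurity, feature, threshold)
    for f in features:
        # Skipping the smallest value keeps both sides of every split non-empty.
        for t in sorted({r[f] for r in rows})[1:]:
            left = [y for r, y in zip(rows, labels) if r[f] < t]
            right = [y for r, y in zip(rows, labels) if r[f] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # the candidate features are constant on these rows
        return majority(labels)
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] < t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] >= t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], depth - 1, rng, n_features),
            build_tree([r for r, _ in hi], [y for _, y in hi], depth - 1, rng, n_features))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] < t else right
    return tree

def build_forest(rows, labels, n_trees, depth, rng):
    """Bagging: each tree is trained on a bootstrap sample with random feature subsets."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 depth, rng, n_features=2))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees."""
    return majority([predict_tree(tree, row) for tree in forest])

def accuracy(predict, model, rows, labels):
    return sum(predict(model, r) == y for r, y in zip(rows, labels)) / len(labels)

def make_data(n, rng):
    """Synthetic incidents (invented rule): late-night incidents in northern cells are 'severe'."""
    rows, labels = [], []
    for _ in range(n):
        hour, day = rng.randrange(24), rng.randrange(7)
        lat_cell, lon_cell = rng.randrange(10), rng.randrange(10)
        severe = (hour >= 20 or hour < 4) and lat_cell >= 5
        if rng.random() < 0.1:  # 10% label noise
            severe = not severe
        rows.append((hour, day, lat_cell, lon_cell))
        labels.append("severe" if severe else "minor")
    return rows, labels

rng = random.Random(0)
rows, labels = make_data(400, rng)
train_x, train_y = rows[:300], labels[:300]
test_x, test_y = rows[300:], labels[300:]

start = time.perf_counter()
tree = build_tree(train_x, train_y, depth=4)
tree_time = time.perf_counter() - start

start = time.perf_counter()
forest = build_forest(train_x, train_y, n_trees=15, depth=4, rng=rng)
forest_time = time.perf_counter() - start

tree_acc = accuracy(predict_tree, tree, test_x, test_y)
forest_acc = accuracy(predict_forest, forest, test_x, test_y)
print(f"decision tree: accuracy={tree_acc:.3f}, train time={tree_time:.3f}s")
print(f"random forest: accuracy={forest_acc:.3f}, train time={forest_time:.3f}s")
```

The forest trains many trees, so its training time grows roughly with the number of trees. The accuracy gap and time ratio printed here depend entirely on the synthetic data and should not be read as reproducing the figures reported in the abstract.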
Subject
Computer Science and Mathematics, Information Systems
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received:
16 April 2020
The commenter has declared there is no conflict of interests.
Comment:
Hello,
I am a student in forensic science and I hope it is not too late, but I may have some useful information about your visualisations.
You could perhaps normalise the data so that the years 2015 and 2018 can be compared with the others.
Your bar graphs in Figure 3 could be improved by generating one bar for the yearly mean and twelve other bars for the monthly means. Since you only have 4 years, it should not take too much space. You could apply the same type of transformation to part b. I think it would be more explicit.
Figure 6 would be easier to analyse if each type of crime were put on a separate figure and labelled as Part 1, 2 or 3 of the UCR category. You could also use density maps.
For Figure 8, you should be able to get more useful information by weighting the types of offences.
I hope this is as useful as I meant it to be.
Regards
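The commenter's first suggestion, normalising years so that 2015 and 2018 can be compared with the others, can be sketched as a mean count per recorded month. The sketch below assumes that some years cover only part of the calendar; the `monthly_counts` values are invented for illustration and are not taken from the Boston data.

```python
from collections import defaultdict

# Hypothetical incident counts keyed by (year, month). Here 2015 covers only
# July to December, while 2016 covers all twelve months.
monthly_counts = {(2015, m): 100 for m in range(7, 13)}
monthly_counts.update({(2016, m): 110 for m in range(1, 13)})

def yearly_summary(counts):
    """Return total, months covered, and mean per recorded month for each year."""
    by_year = defaultdict(list)
    for (year, _month), count in counts.items():
        by_year[year].append(count)
    return {
        year: {
            "total": sum(values),
            "months": len(values),
            "mean_per_month": sum(values) / len(values),
        }
        for year, values in sorted(by_year.items())
    }

summary = yearly_summary(monthly_counts)
for year, stats in summary.items():
    print(year, stats)
```

With these invented numbers the raw totals (600 versus 1320) suggest a large difference between the two years, while the per-month means (100 versus 110) show that most of it comes from the shorter coverage in 2015.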