Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques
Version 1: Received: 9 May 2023 / Approved: 10 May 2023 / Online: 10 May 2023 (09:34:36 CEST)
A peer-reviewed article of this Preprint also exists.
Eze, P.U.; Geard, N.; Mueller, I.; Chades, I. Anomaly Detection in Endemic Disease Surveillance Data Using Machine Learning Techniques. Healthcare 2023, 11, 1896.
Abstract
Disease surveillance is critical for monitoring ongoing control activities, detecting outbreaks early, and informing intervention priorities and policies. Unfortunately, most disease surveillance data remain under-utilised for supporting decision-making in real time. Using malaria surveillance data from the Brazilian Amazon as a case study, we explore unsupervised machine learning techniques for anomaly detection to analyse the data and discover potential anomalies. We found that our models are able to detect early outbreaks, outbreak peaks, and change points in the proportion of positive malaria cases. Specifically, the sustained rise in malaria in the Brazilian Amazon in 2016 was flagged by several models. We also found that no single model detects all the anomalies across all health regions. In maximising the number of anomalies detected in local health regions, the approach using the clustering-based local outlier algorithm ranked first, ahead of principal component analysis and stochastic outlier selection. Because no single model suffices, we also provide the minimum number of machine learning models (top-k models) needed to maximise the number of anomalies detected across different health regions. We discovered that the top-3 models that maximise the coverage of the number and types of anomalies detected across the 13 health regions are principal component analysis, stochastic outlier selection, and minimum covariance determinant. Anomaly detection approaches offer a promising way to discover patterns of epidemiological importance in large volumes of data across space and time. Our exploratory approach can be replicated for other diseases and locations to inform timely interventions and actions toward endemic disease control.
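The idea of choosing a small set of top-k models that together cover as many detected anomalies as possible can be illustrated with a short sketch. The abstract does not say how the top-k models were selected, so the greedy maximum-coverage heuristic below is an assumption, not the authors' procedure. The function name `greedy_top_k`, the model names, and the (region, month) pairs are all hypothetical toy values, not results from the study.

```python
from typing import Dict, List, Set, Tuple

# An anomaly is identified here by a (health region, month) pair.
# This representation is an assumption made for illustration only.
Anomaly = Tuple[str, str]


def greedy_top_k(detections: Dict[str, Set[Anomaly]], k: int) -> List[str]:
    """Greedily pick up to k models that together cover the most anomalies.

    At each step, add the model that flags the largest number of anomalies
    not yet covered by the models already chosen. Stop early once no
    remaining model adds anything new.
    """
    covered: Set[Anomaly] = set()
    chosen: List[str] = []
    remaining = dict(detections)
    for _ in range(k):
        if not remaining:
            break
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda m: len(remaining[m] - covered))
        if not remaining[best] - covered:
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical toy detections, not data from the paper.
detections = {
    "model_a": {("region_1", "2016-03"), ("region_1", "2016-04"), ("region_2", "2016-05")},
    "model_b": {("region_1", "2016-03"), ("region_3", "2016-07")},
    "model_c": {("region_2", "2016-05")},
}

print(greedy_top_k(detections, 2))
```

In this toy case, `model_c` adds nothing beyond what `model_a` already flags, so asking for three models still returns only two. Greedy selection is a common approximation for coverage problems of this kind; an exhaustive search over model subsets would also be feasible when only a handful of models are compared.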
Keywords
Anomaly detection; Malaria data; Machine learning; Big data; Epidemic
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.