Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Enhanced Particle Classification in Water Cherenkov Detectors Using Machine Learning: Modeling and Validation With Monte Carlo Simulation Datasets

Version 1 : Received: 24 June 2024 / Approved: 25 June 2024 / Online: 27 June 2024 (06:06:17 CEST)

How to cite: Torres Peralta, T.; Molina, M. G.; Asorey, H.; Sidelnik, I.; Rubio-Montero, A. J.; Dasso, S.; Mayo-García, R.; Taboada, A.; Otiniano, L.; Collaboration, L. Enhanced Particle Classification in Water Cherenkov Detectors Using Machine Learning: Modeling and Validation With Monte Carlo Simulation Datasets. Preprints 2024, 2024061870. https://doi.org/10.20944/preprints202406.1870.v1

Abstract

The Latin American Giant Observatory (LAGO) is an extended, ground-based cosmic-ray observatory designed to study transient astrophysical events, the role of the atmosphere in the formation of secondary particles, and space-weather-related phenomena. Using a network of Water Cherenkov Detectors (WCDs), LAGO measures the flux of secondary particles produced when astroparticles impinging on the Earth interact with the atmosphere. This flux can be grouped into three basic constituents: the electromagnetic, muonic, and hadronic components. When a particle enters a WCD, it generates a measurable signal whose features correlate with the particle’s type and the detector’s specific response. The charge histograms built from these signals provide valuable insights into the flux of primary astroparticles and their key characteristics. However, these data are insufficient to effectively distinguish between the contributions of different secondary particles.

This work extends our previous research, in which we implemented OPTICS, a density-based clustering algorithm, to identify patterns in data collected by a single WCD. We have further refined our approach by implementing a method that categorizes and differentiates particle groups using advanced unsupervised machine learning techniques. This methodology differentiates among particle types and uses the detector’s nuanced response to each particle type to pinpoint the principal contributors within each group. Here we apply OPTICS to simulated data, further validating the proposed method, using detailed simulations of the expected atmospheric response to the primary flux and the corresponding response of our WCDs to atmospheric radiation. We created the simulated dataset by combining the outputs of the ARTI and MEIGA frameworks. The dataset reproduces the expected WCD signals produced by the flux of secondary particles during one day at the LAGO site in Bariloche, Argentina, situated at 865 meters above sea level. To achieve this, real-time magnetospheric and local atmospheric conditions for February and March of 2012 were analyzed, and the resulting atmospheric secondary-particle flux was fed into a specific MEIGA application featuring a comprehensive Geant4 model of the WCD at this LAGO location. The output from MEIGA was then modified for effective integration into our machine learning pipeline.

Our analysis demonstrates that our enhanced, OPTICS-based methodology can accurately identify the originating particle with a high degree of confidence on a single-pulse basis, highlighting its precision and reliability. These promising results suggest the feasibility of implementing machine-learning-based models throughout the LAGO distributed detection network and other astroparticle observatories for semi-automated, onboard, and real-time data analysis.
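
As a rough illustration of the approach summarized in the abstract, the sketch below clusters per-pulse features with OPTICS and then uses simulation truth labels to see which particle group dominates each cluster. It is only a minimal sketch under stated assumptions, not the authors' pipeline: the two features (integrated charge and peak amplitude), the synthetic Gaussian groups, the OPTICS parameters, and the helper name `dominant_truth` are all hypothetical, and no ARTI or MEIGA output is used. It assumes NumPy and scikit-learn are installed.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(0)

# Hypothetical per-pulse features in arbitrary units: (integrated charge, peak amplitude).
# The three synthetic groups merely stand in for the electromagnetic, muonic, and
# hadronic components; they are NOT simulation output from ARTI or MEIGA.
centers = {
    "electromagnetic": (1.0, 1.0),
    "muonic": (5.0, 4.0),
    "hadronic": (9.0, 8.0),
}
n_per_group = 200

features = np.vstack(
    [rng.normal(loc=c, scale=0.3, size=(n_per_group, 2)) for c in centers.values()]
)
# Truth labels are available only because the data are simulated.
truth = np.repeat(list(centers.keys()), n_per_group)

# Unsupervised step: OPTICS never sees the truth labels.
# Parameter values are illustrative assumptions, not tuned settings from the paper.
model = OPTICS(min_samples=20, xi=0.05, min_cluster_size=0.1)
labels = model.fit_predict(features)  # label -1 marks points treated as noise


def dominant_truth(labels, truth):
    """Map each cluster id to (most common true particle group, its fraction)."""
    result = {}
    for cluster_id in np.unique(labels):
        if cluster_id == -1:  # skip noise points
            continue
        members = truth[labels == cluster_id]
        name, count = Counter(members.tolist()).most_common(1)[0]
        result[int(cluster_id)] = (name, count / len(members))
    return result


summary = dominant_truth(labels, truth)
for cluster_id, (name, purity) in summary.items():
    print(f"cluster {cluster_id}: mostly {name} (fraction {purity:.2f})")
```

The validation step is possible only because simulated pulses carry their originating particle type; with real detector data the clustering step would run unchanged, but the per-cluster composition would have to be inferred from such simulations.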

Keywords

Machine Learning; clustering; OPTICS; Water Cherenkov Detector; Astroparticle Detectors; Cosmic Rays; Astroparticles

Subject

Environmental and Earth Sciences, Atmospheric Science and Meteorology
