Version 1
Received: 15 July 2024 / Approved: 16 July 2024 / Online: 16 July 2024 (08:29:08 CEST)
How to cite:
Capel, M. I.; Salguero-Hidalgo, A.; Holgado-Terriza, J. A. Parallel PSO for the Efficient Training of Neural Networks Using the GPGPU and Apache Spark in an Edge Computing Environment. Preprints 2024, 2024071300. https://doi.org/10.20944/preprints202407.1300.v1
APA Style
Capel, M. I., Salguero-Hidalgo, A., & Holgado-Terriza, J. A. (2024). Parallel PSO for the Efficient Training of Neural Networks Using the GPGPU and Apache Spark in an Edge Computing Environment. Preprints. https://doi.org/10.20944/preprints202407.1300.v1
Chicago/Turabian Style
Capel, M. I., Alberto Salguero-Hidalgo and Juan Antonio Holgado-Terriza. 2024. "Parallel PSO for the Efficient Training of Neural Networks Using the GPGPU and Apache Spark in an Edge Computing Environment." Preprints. https://doi.org/10.20944/preprints202407.1300.v1
Abstract
Deep learning neural networks (DLNNs) demand an immense amount of computation, especially during the training phase, when networks with multiple layers of intermediate neurons must be built. In this paper, we focus on the particle swarm optimization (PSO) algorithm with the aim of significantly accelerating the DLNN training phase by exploiting the GPGPU architecture and the Apache Spark analytics engine for large-scale data processing tasks. PSO is a bio-inspired stochastic optimization method that iteratively improves a candidate solution to a (usually complex) problem by attempting to approximate a given objective. Parallelizing PSO efficiently, however, is not straightforward, owing to the complexity of the computations performed on the swarm of particles and to the iterative execution of the algorithm until a solution close to the objective, with minimal error, is reached. In the present work, two parallelizations of the PSO algorithm have been implemented, both designed for a distributed execution environment. The synchronous parallel PSO implementation ensures consistency at the cost of potential idle time caused by global synchronization, while the asynchronous parallel PSO approach improves execution time by reducing the need for global synchronization, making it better suited to large datasets and distributed environments such as Apache Spark. Both variants distribute the algorithm's computational load, dominated by the costly fitness evaluation and the updating of particle positions, across the Spark cluster's executor nodes, effectively achieving coarse-grained parallelism and yielding a significant performance increase over current sequential variants of PSO.
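The synchronous scheme the abstract describes can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' implementation: the per-iteration fitness map is the step that would be distributed across Spark executors (e.g. via `sc.parallelize(positions).map(fitness).collect()` in PySpark); here the built-in `map` stands in for it, and the sphere function stands in for the DLNN loss. All names and parameter values are illustrative assumptions.

```python
import random

def sphere(x):
    # Stand-in fitness function (minimum 0 at the origin); in the paper's
    # setting this would be the costly DLNN training-loss evaluation.
    return sum(v * v for v in x)

def synchronous_pso(fitness, dim=3, n_particles=20, iters=100, seed=0):
    rng = random.Random(seed)
    w, c1, c2 = 0.72, 1.49, 1.49  # common inertia / acceleration weights
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        # Coarse-grained parallel step: each particle's fitness is independent,
        # so on Spark this map would run on the executor nodes. Plain map here.
        fits = list(map(fitness, pos))
        # Synchronous global barrier: gather all results, then update bests.
        for i, f in enumerate(fits):
            if f < pbest_fit[i]:
                pbest_fit[i], pbest[i] = f, pos[i][:]
                if f < gbest_fit:
                    gbest_fit, gbest = f, pos[i][:]
        # Every particle sees the same global best for this iteration.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
    return gbest, gbest_fit
```

The asynchronous variant relaxes the barrier after the fitness map: particles update their velocities as soon as their own evaluation returns, using whatever global best is currently known, which trades strict per-iteration consistency for reduced idle time.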
Subject: Computer Science and Mathematics; Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.