Version 1: Received: 13 June 2024 / Approved: 13 June 2024 / Online: 13 June 2024 (23:49:00 CEST)
How to cite:
Okada, H. Evolutionary Reinforcement Learning of Binary Neural Network Controllers for Pendulum Task—Part2: Genetic Algorithm. Preprints 2024, 2024060933. https://doi.org/10.20944/preprints202406.0933.v1
Abstract
Evolutionary algorithms and swarm intelligence algorithms are applicable to reinforcement learning of neural networks because they do not rely on gradients. However, since many algorithmic variants exist, the algorithm must be chosen carefully for training to succeed. In Part 1 of this comparative series, the author reported experimental evaluations of Evolution Strategy (ES) for reinforcement learning of binary neural networks on the Pendulum control task. This article, Part 2 of the series, adopts the Genetic Algorithm (GA) as another evolutionary algorithm. A Wilcoxon signed-rank test revealed no statistically significant difference between the fitness scores obtained with GA and those obtained with ES, although the p-value of 0.11 suggests a tendency for GA to perform better on this training task. As binary weight values, {-1, 1} was significantly superior to {0, 1} (p < .01). The motion of the pendulum controlled by the trained binary MLP showed that the network successfully swung the pendulum swiftly into the inverted position and kept it stable there.
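For illustration, the following is a minimal sketch of how a binary-weight MLP controller of the kind described above might map a Pendulum observation to a torque. The layer sizes, tanh activations, and variable names are assumptions for this sketch, not the paper's exact architecture; the key point is that every weight is restricted to {-1, 1}, so a GA individual is simply a flat string of binary genes.

```python
import math
import random

def binary_mlp_action(obs, W1, W2):
    """Forward pass of a two-layer perceptron with binary weights.

    obs: Pendulum observation [cos(theta), sin(theta), angular velocity].
    W1, W2: weight matrices whose entries are restricted to {-1, 1},
            the encoding evolved by the GA.
    Returns a scalar torque scaled to Pendulum's action range [-2, 2].
    """
    # Hidden layer: weighted sums passed through tanh (an assumed activation).
    hidden = [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in W1]
    # Output layer: single tanh unit, scaled to the torque range.
    out = math.tanh(sum(w * h for w, h in zip(W2[0], hidden)))
    return 2.0 * out

# A GA individual is a flat string of {-1, 1} genes reshaped into
# W1 (hidden x inputs) and W2 (outputs x hidden).
random.seed(0)
n_in, n_hidden = 3, 8  # hypothetical sizes
W1 = [[random.choice([-1, 1]) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[random.choice([-1, 1]) for _ in range(n_hidden)]]
torque = binary_mlp_action([1.0, 0.0, 0.5], W1, W2)
```

Fitness of an individual would then be the cumulative reward obtained by running this controller for one or more Pendulum episodes.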
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.