Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

M-Learning: A Computationally Efficient Heuristic for Reinforcement Learning with Delayed Rewards

Version 1 : Received: 25 July 2024 / Approved: 29 July 2024 / Online: 29 July 2024 (08:23:17 CEST)

How to cite: Mora Cortes, M. S.; Perdomo Chary, C. A.; Perdomo, O. J. M-Learning: A Computationally Efficient Heuristic for Reinforcement Learning with Delayed Rewards. Preprints 2024, 2024072253. https://doi.org/10.20944/preprints202407.2253.v1

Abstract

The current design of reinforcement learning methods demands extensive computation. Algorithms such as Deep Q-Network have achieved outstanding results in the development of the area; however, the need for thousands of parameters and training episodes remains a problem. Thus, this document presents a comparative analysis of the Q-Learning algorithm (the foundation of Deep Q-Learning) and our proposed method, termed M-Learning. The comparison uses Markov decision processes with delayed rewards as a general testbench framework. Firstly, we give a full description of the main difficulties in implementing Q-Learning, chiefly its multiple parameters. Then, the foundations of our proposed heuristic, its formulation, and the complete algorithm are reported in detail. Finally, both algorithms are trained and compared in the Frozen Lake environment. The experimental results and an analysis of the best solutions found by our proposed algorithm highlight the differences in the number of episodes required and their standard deviations. The code will be available in a GitHub repository once the paper is published.
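For context, the tabular Q-Learning baseline that M-Learning is compared against can be sketched on a toy chain MDP with a single delayed reward at the far end, mirroring the sparse-reward structure of Frozen Lake. This is an illustrative sketch only, not the paper's implementation: the environment, function name, and hyperparameter values below are our assumptions, and M-Learning itself is defined in the body of the paper.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-Learning on a toy chain MDP with one delayed reward.

    States 0..n_states-1; action 0 moves left (reflecting at 0),
    action 1 moves right. A reward of 1 is given only on reaching the
    rightmost state, so the signal is delayed, as in Frozen Lake.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
            r = 1.0 if s2 == goal else 0.0
            # Standard Q-Learning temporal-difference update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy derived from the table (take the action with the larger Q-value in each state) moves right toward the delayed reward; the many interacting parameters visible even in this small example (alpha, gamma, epsilon, episode count) are the kind of tuning burden the abstract refers to.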

Keywords

reinforcement learning; agents; Q-Learning; Frozen Lake; heuristic

Subject

Engineering, Control and Systems Engineering

