Version 1
Received: 29 October 2024 / Approved: 30 October 2024 / Online: 30 October 2024 (12:04:46 CET)
How to cite:
Mei, L.; Xu, P. Path Planning for Robot Combined with Zero-Shot and Hierarchical Reinforcement Learning in Inexperienced Environments. Preprints 2024, 2024102427. https://doi.org/10.20944/preprints202410.2427.v1
APA Style
Mei, L., & Xu, P. (2024). Path Planning for Robot Combined with Zero-Shot and Hierarchical Reinforcement Learning in Inexperienced Environments. Preprints. https://doi.org/10.20944/preprints202410.2427.v1
Chicago/Turabian Style
Mei, L., and Pengjie Xu. 2024. "Path Planning for Robot Combined with Zero-Shot and Hierarchical Reinforcement Learning in Inexperienced Environments." Preprints. https://doi.org/10.20944/preprints202410.2427.v1
Abstract
Reinforcement-learning-based path planning for robots struggles to integrate semantic information about the environment into the training process. In unseen or complex environments, agents often perform sub-optimally and require more training time. To address these challenges, this manuscript introduces a framework that combines zero-shot learning with hierarchical reinforcement learning to improve agent decision-making in complex environments. Zero-shot learning enables agents to infer correct actions for previously unseen objects or situations based on learned semantic associations. The path planning component then uses hierarchical reinforcement learning with an adaptive replay buffer, guided by the insights gained from zero-shot learning, to make decisions effectively. The two parts are trained separately, so the zero-shot learning component can be applied in different and unseen environments. Simulation experiments show that this structure can make full use of environmental information to generalize across unseen environments and plan collision-free paths.
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.