MDPI and ACS Style
Goldenits, G.; Neubauer, T.; Raubitzek, S.; Mallinger, K.; Weippl, E. Tabular Reinforcement Learning for Reward Robust, Explainable Crop Rotation Policies Matching Deep Reinforcement Learning Performance. Preprints 2024, 2024102391. https://doi.org/10.20944/preprints202410.2391.v1
APA Style
Goldenits, G., Neubauer, T., Raubitzek, S., Mallinger, K., & Weippl, E. (2024). Tabular Reinforcement Learning for Reward Robust, Explainable Crop Rotation Policies Matching Deep Reinforcement Learning Performance. Preprints. https://doi.org/10.20944/preprints202410.2391.v1
Chicago/Turabian Style
Goldenits, G., T. Neubauer, S. Raubitzek, K. Mallinger, and E. Weippl. 2024. "Tabular Reinforcement Learning for Reward Robust, Explainable Crop Rotation Policies Matching Deep Reinforcement Learning Performance." Preprints. https://doi.org/10.20944/preprints202410.2391.v1
Abstract
Digital Twins are often intertwined with machine learning and, more recently, deep reinforcement learning methods in their architecture to process data and predict future outcomes based on input data. However, concerns about the trustworthiness of the output from deep learning models persist because neural networks are generally regarded as black-box models. In our work, we developed crop rotation policies using explainable tabular reinforcement learning techniques. We compared these policies to those generated by a deep Q-learning approach by generating five-step rotations, i.e., series of five consecutive crops. The aim of the rotations is to maximise crop yields while maintaining a healthy nitrogen level in the soil and adhering to established planting rules. Because crop yields may vary due to external factors such as weather patterns or changes in market prices, perturbations were added to the reward signal to account for those influences. When starting from any randomly chosen crop, the deployed explainable tabular reinforcement learning methods collect, on average, at least as much reward over 100 crop rotation plans as the deep learning model. In the perturbed case, robust tabular reinforcement learning methods collect similar amounts of reward across 100 crop rotation plans as in the non-perturbed setting, whereas the deep reinforcement learning agent collects even less reward than when learning on non-perturbed rewards. By consulting with farmers and crop rotation experts, we demonstrate that the derived policies are reasonable to use and more resilient to external perturbations. Furthermore, the use of interpretable and explainable reinforcement learning techniques increases confidence in the resulting policies, thereby increasing the likelihood that farmers will adopt them.
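To make the setup described in the abstract concrete, the sketch below shows plain tabular Q-learning over a toy crop rotation problem with an optionally perturbed reward. It is a minimal illustration only, not the authors' implementation: the crop list, reward table, noise model, and hyperparameters are assumptions chosen for readability.

```python
"""Minimal sketch: tabular Q-learning for five-step crop rotations with
perturbed rewards. Illustrative assumptions only; not the paper's code."""
import numpy as np

rng = np.random.default_rng(0)

CROPS = ["wheat", "maize", "soybean", "rapeseed", "barley"]  # hypothetical crop set
N = len(CROPS)
HORIZON = 5                      # five-step rotation, as in the paper
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Hypothetical reward table: reward[prev, nxt] stands in for expected yield
# plus penalties for violating planting rules or depleting soil nitrogen.
reward = rng.uniform(0.0, 1.0, size=(N, N))
np.fill_diagonal(reward, -1.0)   # discourage planting the same crop twice in a row

def step_reward(prev, nxt, perturb=False):
    """Base reward, optionally perturbed to mimic weather and price fluctuations."""
    r = reward[prev, nxt]
    if perturb:
        r += rng.normal(0.0, 0.2)   # assumed noise model, not from the paper
    return r

# State = previously planted crop; action = next crop to plant.
Q = np.zeros((N, N))

for episode in range(20_000):
    state = rng.integers(N)                     # random starting crop
    for t in range(HORIZON):
        if rng.random() < EPS:                  # epsilon-greedy exploration
            action = rng.integers(N)
        else:
            action = int(np.argmax(Q[state]))
        r = step_reward(state, action, perturb=True)
        # Standard tabular Q-learning update
        Q[state, action] += ALPHA * (r + GAMMA * np.max(Q[action]) - Q[state, action])
        state = action

# The greedy policy is directly readable from the Q-table, which is what makes
# the resulting rotation plans easy to inspect and explain to practitioners.
start = 0
plan, s = [CROPS[start]], start
for _ in range(HORIZON - 1):
    s = int(np.argmax(Q[s]))
    plan.append(CROPS[s])
print(" -> ".join(plan))
```

Because the learned policy is just an argmax over a small table, a domain expert can audit every recommended transition, which is the interpretability advantage the abstract contrasts with the deep Q-learning baseline.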
Subject: Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.