Preprint Article, Version 1. This version is not peer-reviewed.

Tabular Reinforcement Learning for Reward Robust, Explainable Crop Rotation Policies Matching Deep Reinforcement Learning Performance

Version 1: Received: 29 October 2024 / Approved: 30 October 2024 / Online: 30 October 2024 (11:37:51 CET)

How to cite: Goldenits, G.; Neubauer, T.; Raubitzek, S.; Mallinger, K.; Weippl, E. Tabular Reinforcement Learning for Reward Robust, Explainable Crop Rotation Policies Matching Deep Reinforcement Learning Performance. Preprints 2024, 2024102391. https://doi.org/10.20944/preprints202410.2391.v1

Abstract

Digital Twins often incorporate machine learning and, more recently, deep reinforcement learning methods in their architecture to process data and predict future outcomes based on input data. However, concerns about the trustworthiness of the output from deep learning models persist because neural networks are generally regarded as a "black box" model. In our work, we developed crop rotation policies using explainable tabular reinforcement learning techniques. We compared these policies to those generated by a deep Q-learning approach by generating five-step rotations, i.e. sequences of five consecutive crops. The aim of the rotations is to maximise crop yields while maintaining a healthy nitrogen level in the soil and adhering to established planting rules. Crop yields may vary due to external factors such as weather patterns or changes in market prices, so perturbations were added to the reward signal to account for those influences. When the rewards are not perturbed, the deployed explainable tabular reinforcement learning methods collect, on average over 100 crop rotation plans that each start from a randomly chosen crop, at least as much reward as the deep learning model. In the perturbed case, the robust tabular reinforcement learning methods collect amounts of reward across 100 crop rotation plans similar to those in the unperturbed setting while keeping their policies human-interpretable, whereas the deep reinforcement learning agent collects less reward than it does when learning on unperturbed rewards. By consulting with farmers and crop rotation experts, we demonstrate that the derived policies are reasonable to use and more resilient towards external perturbations. Furthermore, the use of interpretable and explainable reinforcement learning techniques increases confidence in the resulting policies, thereby increasing the likelihood that farmers will adopt the suggested policies.
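To make the setting described in the abstract concrete, the following is a minimal sketch of plain tabular Q-learning on a toy five-step crop rotation problem with an optionally perturbed reward. It is not the authors' implementation and does not reproduce their robust tabular methods. The crop list, base yields, the "no crop twice in a row" rule, the nitrogen bonus, and the multiplicative noise model are all hypothetical assumptions chosen only for illustration.

```python
import random

# Hypothetical toy setup: crops, yields, and rules are illustrative
# assumptions, not values taken from the paper.
CROPS = ["wheat", "maize", "clover", "potato"]
BASE_YIELD = {"wheat": 5.0, "maize": 6.0, "clover": 2.0, "potato": 7.0}
NITROGEN_FIXERS = {"clover"}
ROTATION_LENGTH = 5  # five-step rotations, as in the abstract


def reward(prev_crop, next_crop, rng, noise=0.0):
    """Yield-like reward for planting next_crop after prev_crop."""
    r = BASE_YIELD[next_crop]
    if next_crop == prev_crop:
        r -= 10.0  # assumed planting rule: do not repeat a crop
    if prev_crop in NITROGEN_FIXERS:
        r += 3.0  # assumed benefit of nitrogen left in the soil
    if noise > 0.0:
        # Assumed perturbation standing in for weather or price effects.
        r *= 1.0 + rng.uniform(-noise, noise)
    return r


def train_q_table(episodes=20000, alpha=0.1, gamma=0.9,
                  epsilon=0.2, noise=0.0, seed=0):
    """Epsilon-greedy tabular Q-learning; state = (step, previous crop)."""
    rng = random.Random(seed)
    q = {(t, s, a): 0.0
         for t in range(ROTATION_LENGTH) for s in CROPS for a in CROPS}
    for _ in range(episodes):
        state = rng.choice(CROPS)  # random starting crop
        for step in range(ROTATION_LENGTH):
            if rng.random() < epsilon:
                action = rng.choice(CROPS)
            else:
                action = max(CROPS, key=lambda a: q[(step, state, a)])
            r = reward(state, action, rng, noise)
            if step == ROTATION_LENGTH - 1:
                target = r  # last crop of the rotation, no bootstrap
            else:
                target = r + gamma * max(
                    q[(step + 1, action, a)] for a in CROPS)
            q[(step, state, action)] += alpha * (
                target - q[(step, state, action)])
            state = action
    return q


def greedy_rotation(q, start):
    """Read a five-crop rotation directly off the learned table."""
    rotation, state = [], start
    for step in range(ROTATION_LENGTH):
        state = max(CROPS, key=lambda a: q[(step, state, a)])
        rotation.append(state)
    return rotation


if __name__ == "__main__":
    q_clean = train_q_table(noise=0.0)
    q_noisy = train_q_table(noise=0.2)
    for start in CROPS:
        print(start, greedy_rotation(q_clean, start),
              greedy_rotation(q_noisy, start))
```

Because the learned policy is a lookup table indexed by step, previous crop, and next crop, each recommendation can be inspected entry by entry, which is the kind of interpretability the abstract contrasts with a deep Q-network.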

Keywords

Crop Rotation Planning; Digital Twin; Explainable AI; Reinforcement Learning; Sustainable Agriculture

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
