Preprint Article Version 1 This version is not peer-reviewed

Molecular Generation Strategy and Optimization Based on PPO Reinforcement Learning in De Novo Drug Design

Version 1 : Received: 7 November 2024 / Approved: 7 November 2024 / Online: 8 November 2024 (08:20:21 CET)

How to cite: Xuelan, Y.; Zain, J. M.; Tao, H.; Setyawan, G. E.; Kurnianingtyas, D.; Jamari, A. M. H. Molecular Generation Strategy and Optimization Based on PPO Reinforcement Learning in De Novo Drug Design. Preprints 2024, 2024110571. https://doi.org/10.20944/preprints202411.0571.v1

Abstract

Drug discovery tends to be grueling, lengthy, and expensive, with cost estimates approximating $2.6 billion and timelines exceeding 10 years. These drawbacks have focused attention on reducing costs and accelerating development. The emergence of Deep Reinforcement Learning (DRL) within cheminformatics and bioinformatics has broadened the horizons of de novo drug design. Realizing the full potential of DRL in molecular generation requires selecting an appropriate reinforcement learning (RL) algorithm. In this work, we address this problem by proposing a new method, termed PSQ, that uses the Proximal Policy Optimization (PPO) algorithm within the DRL framework to generate new chemical compounds with desired properties. This methodology has demonstrated significant potential in exploring and generating specific molecules by optimizing for targeted characteristics. The PPO algorithm's superior performance in exploring the chemical space and generating compounds with diverse pharmacophore features, functional groups, and biological activities underscores its potential in drug discovery and chemical synthesis. By systematically comparing the outputs of PPO and REINFORCE, we highlight the robustness and efficiency of PPO in optimizing molecular properties for targeted therapeutic applications.

Keywords

drug design; deep reinforcement learning; proximal policy optimization; molecular generation; chemical space exploration; cheminformatics; bioinformatics

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
