Preprint Article Version 1 This version is not peer-reviewed

QSAR Regression Models for Predicting HMG-CoA Reductase Inhibition Based on MACCS Molecular Fingerprints and Virtual Screening of Natural Products

Version 1 : Received: 7 October 2024 / Approved: 7 October 2024 / Online: 8 October 2024 (11:16:56 CEST)

How to cite: Ancuceanu, R.; Popovici, P. C.; Drăgănescu, D.; Busnatu, Ș.; Lascu, B. E.; Dinu, M. QSAR Regression Models for Predicting HMG-CoA Reductase Inhibition Based on MACCS Molecular Fingerprints and Virtual Screening of Natural Products. Preprints 2024, 2024100529. https://doi.org/10.20944/preprints202410.0529.v1

Abstract

HMG-CoA reductase is an enzyme that regulates the initial stage of cholesterol synthesis, and its inhibitors are widely used in the treatment of cardiovascular diseases. Methods: We built a set of quantitative structure-activity relationship (QSAR) models for human HMG-CoA reductase inhibitors, using nested cross-validation as the primary validation method. To develop the QSAR models, we employed various machine learning regression algorithms, feature selection methods, and fingerprint or descriptor datasets. Results: We built and evaluated a total of 300 models and selected 21 that demonstrated good performance (coefficient of determination, R² ≥ 0.70, or concordance correlation coefficient, CCC ≥ 0.85). Six of these top-performing models met both performance criteria and were used to construct five ensemble models. For each of the six best-performing models, we identified the descriptors most important in explaining HMG-CoA reductase inhibition. We used the top models to screen over 220,000 chemical compounds from a large database (ZINC 15) for potential new inhibitors. Only a small fraction (237 of approximately 220,000 compounds) had reliable predictions with mean pIC50 values ≥ 8 (IC50 values ≤ 10 nM). Our SVM-based ensemble model predicted IC50 values < 10 nM for roughly 0.08% of the screened compounds. We also illustrate how these QSAR models can be applied to understand the cholesterol-lowering activities of herbal extracts, such as those reported for an extract prepared from the Iris × germanica rhizome. Conclusions: Our QSAR models can accurately predict the activity of human HMG-CoA reductase inhibitors, have the potential to accelerate the discovery of novel cholesterol-lowering agents, and may also help explain the mechanisms underlying the reported cholesterol-lowering activities of herbal extracts.
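
The thresholds quoted in the abstract (pIC50 ≥ 8 corresponding to IC50 ≤ 10 nM, R² ≥ 0.70, CCC ≥ 0.85) can be made concrete with a minimal sketch. It is not the authors' code: the function names are hypothetical, R² is assumed to be the usual 1 − SS_res/SS_tot form, and CCC is assumed to be Lin's concordance correlation coefficient computed with population (1/n) variances. The abstract does not state which variants were used.

```python
import math


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert pIC50 (-log10 of IC50 in mol/L) to IC50 in nanomolar."""
    # 1 mol/L = 1e9 nM, so IC50[nM] = 10 ** (9 - pIC50).
    return 10 ** (9 - pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert IC50 in nanomolar to pIC50."""
    return 9 - math.log10(ic50_nm)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    # Penalizes both scatter (via cov) and systematic shift (via mean gap).
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def meets_both_criteria(y_true, y_pred, r2_min=0.70, ccc_min=0.85):
    """True if predictions pass both thresholds quoted in the abstract."""
    return (r_squared(y_true, y_pred) >= r2_min
            and concordance_cc(y_true, y_pred) >= ccc_min)
```

For example, `pic50_to_ic50_nm(8.0)` gives 10 nM, which is why a mean pIC50 of at least 8 is equivalent to an IC50 of at most 10 nM. Unlike R², CCC also drops when predictions are systematically shifted from the observed values, so the two criteria are not redundant.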

Keywords

HMG-CoA reductase; QSAR; statins; nested cross-validation; virtual screening; Iris germanica; machine learning; feature selection; mlr3; MACCS fingerprints; molecular descriptors

Subject

Medicine and Pharmacology, Pharmacology and Toxicology
