2020 journal article
Cheminformatics Analysis and Modeling with MacrolactoneDB
SCIENTIFIC REPORTS, 10(1).
AbstractMacrolactones, macrocyclic lactones with at least twelve atoms within the core ring, include diverse natural products such as macrolides with potent bioactivities (e.g. antibiotics) and useful drug-like characteristics. We have developed MacrolactoneDB, which integrates nearly 14,000 existing macrolactones and their bioactivity information from different public databases, and new molecular descriptors to better characterize macrolide structures. The chemical distribution of MacrolactoneDB was analyzed in terms of important molecular properties and we have utilized three targets of interest (Plasmodium falciparum, Hepatitis C virus and T-cells) to demonstrate the value of compiling this data. Regression machine learning models were generated to predict biological endpoints using seven molecular descriptor sets and eight machine learning algorithms. Our results show that merging descriptors yields the best predictive power with Random Forest models, often boosted by consensus or hybrid modeling approaches. Our study provides cheminformatics insights into this privileged, underexplored structural class of compounds with high therapeutic potential.