Enhancing Quantitative Structure–Activity Relationship Predictive Power and Explainability: Meta-Modeling and Shapley Additive Explanations Feature Importance Analysis for Drug Discovery

Authors: Sanjaya, Ardo; Ratnawati, Hana; Mianto, Nathanael A.; Camillo, Keyshia V.; Tedjo, Aryo
Information
Journal: Tropical Journal of Natural Product Research
Publisher: Faculty of Pharmacy, University of Benin
Volume & Issue: Vol. 9, Issue 8
Pages: 3784 - 3793
Publication Year: 2025
ISSN: 2616-0684
Source Type: Scopus
Abstract
Quantitative structure-activity relationship (QSAR) modeling plays a crucial role in drug discovery by predicting biological endpoints from molecular structure. Existing studies lack consensus on the optimal molecular fingerprint, and many QSAR models function as black boxes with limited explainability. This study addresses these gaps by integrating multiple fingerprints through meta-modeling and applying Shapley Additive exPlanations (SHAP) analysis to enhance prediction accuracy and interpretability. The performance of several fingerprints in predicting pIC50 was evaluated across ten protein targets. Concordance analysis based on the Concordance Correlation Coefficient (CCC) was used to evaluate prediction agreement and reproducibility. SHAP analysis was used to assess the importance of molecular substructures in the base models and the importance of each fingerprint in the meta-model. A Streamlit web application was developed to demonstrate prediction and feature importance visualization. No fingerprint was superior to the others. Concordance analysis showed high agreement, with CCC values above 0.98, reflecting high prediction reproducibility. Meta-models combining Morgan6 with other fingerprints outperformed individual models for 7 of the 10 protein targets. SHAP analysis revealed that fingerprint importance depends on the target protein. The web application demonstrated the value of identifying critical substructures in a case study on donepezil. Although the fingerprints capture different aspects of the molecule, they performed similarly and showed high predictive reproducibility and agreement. This study demonstrates that while individual molecular fingerprints offer comparable predictive performance, integrating them through meta-modeling enhances prediction accuracy and interpretability. Combined with SHAP-based fingerprint importance analysis, the study provides a reproducible and explainable QSAR framework that advances data-driven drug discovery.
© 2025 Sanjaya et al.
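The abstract reports prediction agreement as CCC values but does not give the formula. The sketch below assumes the standard definition by Lin, CCC = 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²) with population (1/n) moments, which may differ in detail from what the authors computed. The function name and the pIC50 values are hypothetical and serve only as illustration.

```python
from statistics import fmean


def concordance_correlation_coefficient(x, y):
    """Lin's CCC between two equally long sequences of predictions.

    Assumption: population (1/n) variance and covariance, as in Lin's
    original definition; this is not taken from the paper itself.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])

    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for two identical constant inputs")
    return 2 * cov_xy / denominator


# Hypothetical pIC50 predictions from two fingerprint-based models.
model_a = [5.1, 6.3, 7.8, 4.9, 6.6]
model_b = [5.0, 6.4, 7.7, 5.1, 6.5]
ccc = concordance_correlation_coefficient(model_a, model_b)
```

Unlike the Pearson correlation, the CCC also penalizes a systematic offset between the two sets of predictions: shifting every value in one set by a constant leaves the correlation unchanged but lowers the CCC.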