Interpretable Machine Learning Framework for Predicting Treatment Resistance in Psychiatric Disorders Using Synthetic Pharmacogenomic and Clinical Data
This study presents an interpretable machine learning pipeline for predicting treatment resistance in psychiatric disorders using synthetically generated, multimodal data. The simulated dataset integrates gene expression profiles with clinically relevant features, such as age of onset, BMI, comorbidities, treatment response scores, and disease progression markers, thereby mimicking real-world pharmacogenomic and clinical complexity. The primary objective is to assess the feasibility of using artificial intelligence for individualized treatment stratification in psychiatry, with an emphasis on transparency and clinical relevance. Three classifiers were trained and evaluated on a balanced dataset of 2000 synthetic patients: Random Forest, Gradient Boosting, and a calibrated Support Vector Machine (SVM). Models were validated using ROC-AUC, F1-score, and balanced accuracy. Random Forest achieved the highest ROC-AUC (0.80) and balanced accuracy (0.72), followed closely by Gradient Boosting. The calibrated SVM performed worse but added methodological diversity. Feature importance was assessed using both traditional methods and permutation-based analysis, which consistently highlighted GENE_1, GENE_6, Rapid_Cycling, and Lithium_Response as dominant predictors. Local explainability was added with LIME, which provided individualized insight into each prediction. A full visual clinical report was generated for a sample patient, including gene expression summaries, predicted drug responses, and actionable recommendations. The pipeline demonstrates a reproducible and explainable framework for AI-driven clinical decision support in psychiatry. By linking genetic markers, treatment outcomes, and machine learning interpretability, this study offers a template for precision psychiatry using realistic simulation models. The findings reinforce the value of integrating interpretability into ML models to promote trust and applicability in clinical practice.
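The workflow summarized in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy. It is not the authors' implementation: only the cohort size (2000), the balanced classes, the three model families, the three metrics, and the four named predictors come from the abstract. The remaining feature names, all distributions, the label-generating rule, the 80/20 split, and every hyperparameter are hypothetical choices made for illustration. LIME is left out so that the sketch depends only on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 2000  # cohort size stated in the abstract

# Hypothetical feature set: GENE_1, GENE_6, Rapid_Cycling and Lithium_Response
# are named in the abstract; everything else here is an assumption.
feature_names = [f"GENE_{i}" for i in range(1, 11)] + [
    "Age_Onset", "BMI", "Rapid_Cycling", "Lithium_Response"
]
genes = rng.normal(size=(n, 10))        # assumed standardized expression
age_onset = rng.normal(25, 8, size=n)   # assumed distribution
bmi = rng.normal(26, 4, size=n)         # assumed distribution
rapid = rng.integers(0, 2, size=n)      # assumed binary flag
lithium = rng.uniform(0, 1, size=n)     # assumed response score in [0, 1]
X = np.column_stack([genes, age_onset, bmi, rapid, lithium])

# Assumed label rule: a noisy linear score driven by the four named predictors,
# split at the median so that both classes have exactly n / 2 patients.
score = (1.2 * genes[:, 0] - 1.0 * genes[:, 5] + 1.0 * rapid
         - 2.0 * lithium + rng.normal(size=n))
y = (score > np.median(score)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    # Calibration wraps the SVM so that it exposes predict_proba.
    "Calibrated SVM": make_pipeline(
        StandardScaler(), CalibratedClassifierCV(SVC(), cv=5)
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    pred = (proba >= 0.5).astype(int)
    results[name] = {
        "roc_auc": roc_auc_score(y_te, proba),
        "f1": f1_score(y_te, pred),
        "balanced_accuracy": balanced_accuracy_score(y_te, pred),
    }

# Permutation importance on held-out data: the drop in ROC-AUC when one
# feature column is shuffled, averaged over repeats.
perm = permutation_importance(
    models["Random Forest"], X_te, y_te,
    scoring="roc_auc", n_repeats=5, random_state=0,
)
ranking = [feature_names[i] for i in np.argsort(perm.importances_mean)[::-1]]

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("Top features by permutation importance:", ranking[:4])
```

Because the labels are produced by an assumed rule, the printed scores will not match the values reported in the abstract. The sketch only shows how the three metrics and a held-out permutation ranking can be computed side by side for the three model families.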
Cite this paper
Filippis, R. D. and Foysal, A. A. (2025). Interpretable Machine Learning Framework for Predicting Treatment Resistance in Psychiatric Disorders Using Synthetic Pharmacogenomic and Clinical Data. Open Access Library Journal, 12, e13870. doi: http://dx.doi.org/10.4236/oalib.1113870.