Major Depressive Disorder (MDD) is a prevalent psychiatric condition requiring long-term pharmacological management, with escitalopram often prescribed as a first-line treatment. However, optimizing antidepressant dosing remains challenging due to heterogeneous patient responses, complex symptom trajectories, and variable tolerance to side effects. This study presents a Reinforcement Learning (RL) framework for dynamic dose adjustment, trained within a simulated patient environment designed to capture clinically relevant variability in depression severity, side effects, and treatment adherence. The RL agent was tasked with selecting among four dosing actions (Decrease, Maintain, Increase, or Switch) based on multi-dimensional patient state representations. An ε-greedy exploration strategy with a decaying exploration probability facilitated policy convergence over 30 training episodes. To ensure transparency and clinical trust, the framework integrated two explainability techniques: Local Interpretable Model-Agnostic Explanations (LIME) for case-specific decision rationale and attention-weight analysis for global feature importance. Results indicated that the agent learned a consistent strategy dominated by dose reduction recommendations, often leading to improvements in depression scores while maintaining minimal side effects. Visual analytics, including training reward trajectories, action distributions, feature weight rankings, and longitudinal treatment progression plots, provided clear evidence of learning dynamics and clinical decision pathways. Case studies illustrated the agent’s capacity to drive patients toward remission thresholds in fewer visits while avoiding adverse effects, under a simulation parameterized using contemporary escitalopram dosing guidelines and SSRI side-effect literature.
LIME analysis revealed that variables such as high normalized BMI and shorter treatment duration significantly influenced the “Decrease” action, while age and depression severity modulated decision probabilities. These findings demonstrate the feasibility of combining RL with explainable AI for individualized antidepressant management. Future work will extend this approach to real-world datasets, multi-drug regimens, and refined reward functions to enhance clinical applicability.
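The ε-greedy exploration with a decaying exploration probability described above can be illustrated with a minimal Python sketch. The abstract does not state the decay schedule, the starting or minimum ε, or the underlying value-learning algorithm, so the exponential schedule, the constants `eps_start`, `eps_min`, and `decay`, and the placeholder `q_values` below are all assumptions chosen for illustration. Only the four action names and the 30-episode horizon come from the abstract.

```python
import random

# The four dosing actions named in the abstract.
ACTIONS = ["Decrease", "Maintain", "Increase", "Switch"]


def epsilon_schedule(episode, eps_start=1.0, eps_min=0.05, decay=0.9):
    """Exploration probability for a given episode.

    Assumption: exponential decay with a floor; the constants are
    hypothetical and not taken from the paper.
    """
    return max(eps_min, eps_start * decay ** episode)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over a list of action-value estimates.

    With probability epsilon a uniformly random action index is returned
    (exploration); otherwise the index of the highest estimate is returned
    (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder action-value estimates for one patient state (hypothetical).
    q_values = [0.4, 0.1, -0.2, 0.0]
    for episode in range(30):  # 30 training episodes, as in the abstract
        eps = epsilon_schedule(episode)
        action = ACTIONS[select_action(q_values, eps, rng)]
        print(f"episode {episode:2d}  epsilon={eps:.3f}  action={action}")
```

Early episodes pick actions almost uniformly at random, while later episodes mostly select the action with the highest current estimate. In a full agent the estimates would be updated from simulated patient outcomes after each step, which this sketch leaves out.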
Cite this paper
Filippis, R. D. and Foysal, A. A. (2025). Reinforcement Learning for Antidepressant Dose Adjustment: An Explainable Agent Approach. Open Access Library Journal, 12, e14449. doi: http://dx.doi.org/10.4236/oalib.1114449.