%0 Journal Article %@ 2561-326X %I JMIR Publications %V 6 %N 9 %P e37838 %T Optimizing Health Coaching for Patients With Type 2 Diabetes Using Machine Learning: Model Development and Validation Study %A Di,Shuang %A Petch,Jeremy %A Gerstein,Hertzel C %A Zhu,Ruoqing %A Sherifali,Diana %+ School of Nursing, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4K1, Canada, 1 9055259140 ext 21435, dsherif@mcmaster.ca %K diabetes health coaching %K artificial intelligence %K reinforcement learning %K health coaching %K patient outcome %K diabetes %K community health %K digital intervention %K health outcome %D 2022 %7 13.9.2022 %9 Original Paper %J JMIR Form Res %G English %X Background: Health coaching is an emerging intervention that has been shown to improve clinical and patient-relevant outcomes for type 2 diabetes. Advances in artificial intelligence may provide an avenue for developing a more personalized, adaptive, and cost-effective approach to diabetes health coaching. Objective: We aim to apply Q-learning, a widely used reinforcement learning algorithm, to a diabetes health-coaching data set to develop a model for recommending an optimal coaching intervention at each decision point that is tailored to a patient’s accumulated history. Methods: In this pilot study, we fit a two-stage reinforcement learning model on 177 patients from the intervention arm of a community-based randomized controlled trial conducted in Canada. The policy produced by the reinforcement learning model can recommend a coaching intervention at each decision point that is tailored to a patient’s accumulated history and is expected to maximize the composite clinical outcome of hemoglobin A1c reduction and quality of life improvement (normalized to [0, 1], with a higher score being better). Our data, models, and source code are publicly available.
Results: Among the 177 patients, the coaching intervention recommended by our policy mirrored the observed diabetes health coach’s interventions in 17.5% (n=31) of the patients in stage 1 and 14.1% (n=25) of the patients in stage 2. Where there was agreement in both stages, the average cumulative composite outcome (0.839, 95% CI 0.460-1.220) was better than those for whom the optimal policy agreed with the diabetes health coach in only one stage (0.791, 95% CI 0.747-0.836) or differed in both stages (0.755, 95% CI 0.728-0.781). Additionally, the average cumulative composite outcome predicted for the policy’s recommendations was significantly better than that of the observed diabetes health coach’s recommendations (t_{n-1}=10.040; P<.001). Conclusions: Applying reinforcement learning to diabetes health coaching could allow for both the automation of health coaching and an improvement in health outcomes produced by this type of intervention. %M 36099006 %R 10.2196/37838 %U https://formative.jmir.org/2022/9/e37838 %U https://doi.org/10.2196/37838 %U http://www.ncbi.nlm.nih.gov/pubmed/36099006