TY  - JOUR
AU  - Di, Shuang
AU  - Petch, Jeremy
AU  - Gerstein, Hertzel C
AU  - Zhu, Ruoqing
AU  - Sherifali, Diana
PY  - 2022
DA  - 2022/09/13
TI  - Optimizing Health Coaching for Patients With Type 2 Diabetes Using Machine Learning: Model Development and Validation Study
JO  - JMIR Form Res
SP  - e37838
VL  - 6
IS  - 9
KW  - diabetes health coaching
KW  - artificial intelligence
KW  - reinforcement learning
KW  - health coaching
KW  - patient outcome
KW  - diabetes
KW  - community health
KW  - digital intervention
KW  - health outcome
AB  - Background: Health coaching is an emerging intervention that has been shown to improve clinical and patient-relevant outcomes for type 2 diabetes. Advances in artificial intelligence may provide an avenue for developing a more personalized, adaptive, and cost-effective approach to diabetes health coaching. Objective: We aim to apply Q-learning, a widely used reinforcement learning algorithm, to a diabetes health-coaching data set to develop a model for recommending an optimal coaching intervention at each decision point that is tailored to a patient’s accumulated history. Methods: In this pilot study, we fit a two-stage reinforcement learning model on 177 patients from the intervention arm of a community-based randomized controlled trial conducted in Canada. The policy produced by the reinforcement learning model can recommend a coaching intervention at each decision point that is tailored to a patient’s accumulated history and is expected to maximize the composite clinical outcome of hemoglobin A1c reduction and quality of life improvement (normalized to [0, 1], with a higher score being better). Our data, models, and source code are publicly available. Results: Among the 177 patients, the coaching intervention recommended by our policy mirrored the observed diabetes health coach’s interventions in 17.5% (n=31) of the patients in stage 1 and 14.1% (n=25) of the patients in stage 2. Where there was agreement in both stages, the average cumulative composite outcome (0.839, 95% CI 0.460-1.220) was better than that of patients for whom the optimal policy agreed with the diabetes health coach in only one stage (0.791, 95% CI 0.747-0.836) or differed in both stages (0.755, 95% CI 0.728-0.781). Additionally, the average cumulative composite outcome predicted for the policy’s recommendations was significantly better than that of the observed diabetes health coach’s recommendations (t_{n-1}=10.040; P<.001). Conclusions: Applying reinforcement learning to diabetes health coaching could allow for both the automation of health coaching and an improvement in health outcomes produced by this type of intervention.
SN  - 2561-326X
UR  - https://formative.jmir.org/2022/9/e37838
UR  - https://doi.org/10.2196/37838
UR  - http://www.ncbi.nlm.nih.gov/pubmed/36099006
DO  - 10.2196/37838
ID  - info:doi/10.2196/37838
ER  -