TY  - JOUR
AU  - Sanjeewa, Ruvini
AU  - Iyer, Ravi
AU  - Apputhurai, Pragalathan
AU  - Wickramasinghe, Nilmini
AU  - Meyer, Denny
PY  - 2025
DA  - 2025/04/16
TI  - Machine Learning Approach to Identifying Empathy Using the Vocals of Mental Health Helpline Counselors: Algorithm Development and Validation
JO  - JMIR Form Res
SP  - e67835
VL  - 9
KW  - vocal features
KW  - voice characteristics
KW  - empathy
KW  - mental health care
KW  - crisis helpline service
AB  - Background: This research study aimed to detect the vocal features immersed in empathic counselor speech using samples of calls to a mental health helpline service. Objective: This study aimed to produce an algorithm for the identification of empathy from these features, which could act as a training guide for counselors and conversational agents who need to transmit empathy in their vocals. Methods: Two annotators with a psychology background and English heritage provided empathy ratings for 57 calls involving female counselors, as well as multiple short call segments within each of these calls. These ratings were found to be well correlated between the 2 raters in a sample of 6 common calls. Using vocal feature extraction from call segments and statistical variable selection methods, such as L1-penalized LASSO (Least Absolute Shrinkage and Selection Operator) and forward selection, a total of 14 significant vocal features were found to be associated with empathic speech. Generalized additive mixed models (GAMM), binary logistic regression with splines, and random forest models were used to obtain an algorithm that differentiated between high- and low-empathy call segments. Results: The binary logistic regression model achieved higher predictive accuracy for empathy (area under the curve [AUC]=0.617, 95% CI 0.613‐0.622) compared to the GAMM (AUC=0.605, 95% CI 0.601‐0.609) and the random forest model (AUC=0.600, 95% CI 0.595‐0.604). This difference was statistically significant, as evidenced by the nonoverlapping 95% CIs obtained for AUC. The DeLong test further validated these results, showing a significant difference in the binary logistic model compared to the random forest (D=6.443, df=186283, P<.001) and GAMM (Z=5.846, P<.001). These findings confirm that the binary logistic regression model outperforms the other 2 models concerning predictive accuracy for empathy classification. Conclusions: This study suggests that the identification of empathy from vocal features alone is challenging, and further research involving multimodal models (eg, models incorporating facial expression, words used, and vocal features) is encouraged for detecting empathy in the future. This study has several limitations, including a relatively small sample of calls and only 2 empathy raters. Future research should focus on accommodating multiple raters with varied backgrounds to explore these effects on perceptions of empathy. Additionally, considering counselor vocals from larger, more heterogeneous populations, including mixed-gender samples, will allow an exploration of the factors influencing the level of empathy projected in counselor voices more generally.
SN  - 2561-326X
UR  - https://formative.jmir.org/2025/1/e67835
UR  - https://doi.org/10.2196/67835
DO  - 10.2196/67835
ID  - info:doi/10.2196/67835
ER  - 