Accepted for/Published in: JMIR Formative Research
- Ruvini S, Ravi I, Pragalathan A, Nilmini W, Denny M
- Machine Learning Approach to Identifying Empathy Using the Vocals of Mental Health Helpline Counselors: Algorithm Development and Validation
- JMIR Formative Research
- DOI: 10.2196/11848
- PMID: 30303485
- PMCID: 6352016
Machine Learning Approach to Identifying Empathy Using the Vocals of Mental Health Helpline Counselors: Algorithm Development and Validation
Abstract
background
This study aimed to identify the vocal features embedded in empathic counsellor speech, using samples of calls to a mental health (MH) helpline service.
objective
The study aimed to produce an algorithm that identifies empathy from these features and that could act as a training guide for counsellors and conversational agents who need to convey empathy in their vocals.
methods
Two annotators with a psychology background and English heritage provided empathy ratings for 57 calls involving female counsellors, as well as for multiple short call segments within each of these calls. The ratings of the two annotators were well correlated in a sample of six calls rated by both. Vocal features were extracted from the call segments, and statistical variable selection methods, such as L1-penalised Least Absolute Shrinkage and Selection Operator (LASSO) regression and forward selection, identified a total of 14 vocal features that were significantly associated with empathic speech. Generalised additive mixed models (GAMM), binary logistic regression with splines, and random forest models were then employed to obtain an algorithm that differentiated between high- and low-empathy call segments.
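As a rough illustration of the L1-penalised selection step described above, the following Python sketch fits a logistic regression with a LASSO-style penalty by proximal gradient descent. It is not the authors' implementation: the function names (`soft_threshold`, `fit_l1_logistic`), the penalty values, the step size, and the synthetic two-feature data are all assumptions made for illustration, and the study's splines, mixed effects, and forward selection are not reproduced here.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t; anything within [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=2000):
    """L1-penalised logistic regression via proximal gradient descent.

    X is a list of feature rows, y a list of 0/1 labels, lam the L1 penalty.
    The intercept is left unpenalised. Returns (weights, intercept).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        resid = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        b -= lr * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (hypothetical, not from the study): feature 0 determines the
# label, feature 1 is an alternating pattern unrelated to the label.
X = [[(i - 19.5) / 20.0, 0.5 if i % 2 == 0 else -0.5] for i in range(40)]
y = [1 if i >= 20 else 0 for i in range(40)]

weights, intercept = fit_l1_logistic(X, y, lam=0.01)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", weights)
print("features with non-zero weight:", selected)
```

Features whose weights are driven exactly to zero by the penalty are dropped, which is how a LASSO-type method doubles as a variable selection tool. A larger `lam` removes more features.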
results
The binary logistic regression model achieved a higher predictive accuracy for empathy (area under the curve [AUC]=0.617, 95% CI 0.613-0.622) than the GAMM (AUC=0.605, 95% CI 0.601-0.609) and the random forest model (AUC=0.600, 95% CI 0.595-0.604). This difference was statistically significant, as evidenced by the non-overlapping 95% confidence intervals obtained for the AUC. The DeLong test further supported these results, showing a significant difference between the binary logistic regression model and both the random forest model (D=6.443, df=186283, P<.001) and the GAMM (Z=5.846, P<.001). These findings indicate that the binary logistic regression model outperformed the other two models in predictive accuracy for empathy classification.
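The comparison of AUC values and confidence intervals reported above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the intervals were computed, so the percentile bootstrap used here is a stand-in, the DeLong test is not implemented, and the helper names (`auc`, `bootstrap_ci`, `intervals_overlap`) and the toy scores are hypothetical. Only the three reported intervals come from the text.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def intervals_overlap(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Intervals as reported in the abstract.
logistic_ci = (0.613, 0.622)
gamm_ci = (0.601, 0.609)
forest_ci = (0.595, 0.604)
print("logistic vs GAMM overlap:", intervals_overlap(logistic_ci, gamm_ci))
print("logistic vs random forest overlap:", intervals_overlap(logistic_ci, forest_ci))

# Toy labels and scores (hypothetical) to show the bootstrap helper in use.
toy_labels = [0] * 10 + [1] * 10
toy_scores = [0.05 * i for i in range(10)] + [0.3 + 0.05 * i for i in range(10)]
print("toy AUC:", auc(toy_labels, toy_scores))
print("toy 95% CI:", bootstrap_ci(toy_labels, toy_scores))
```

Non-overlapping intervals are a conservative indication of a difference. A paired test such as DeLong's, as used in the study, compares the two AUCs directly on the same call segments.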
conclusions
This study suggests that the identification of empathy from vocal features alone is challenging, and further research involving multi-modal models (e.g. models incorporating facial expression, words used and vocal features) is encouraged for detecting empathy in the future. This study has several limitations, including a relatively small sample of calls and only two empathy raters. Future research should include multiple raters with varied backgrounds to explore the effect of rater background on perceptions of empathy. In addition, considering counsellor vocals from larger, more heterogeneous populations, including mixed-gender samples, will allow an exploration of the factors influencing the level of empathy projected in counsellor voices more generally.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.