Background: Although behavioral interventions have been found to be efficacious and effective in randomized clinical trials for most mental illnesses, the quality and efficacy of mental health care delivery remain inadequate in real-world settings, partly owing to suboptimal treatment fidelity. This “therapist drift” is an ongoing issue that ultimately reduces the effectiveness of treatments; however, until recently, there have been limited opportunities to assess adherence beyond large randomized controlled trials.
Objective: This study explored therapists’ use of a standard component that is pertinent across most behavioral treatments—prompting clients to summarize their treatment session as a means for consolidating and augmenting their understanding of the session and the treatment plan.
Methods: The data set for this study comprised 17,607 behavioral treatment sessions administered by 322 therapists to 3519 patients in 37 behavioral health care programs across the United States. Sessions were captured by a therapy-specific artificial intelligence (AI) platform, and an automatic speech recognition system transcribed the treatment meeting and separated the data into therapist and client utterances. A search for possible session summary prompts was then conducted, with 2 psychologists validating the text that emerged.
Results: We found that despite clinical recommendations, only 54 (0.30%) sessions included a summary. Exploratory analyses indicated that session summaries mostly addressed relationships (n=27), work (n=20), change (n=6), and alcohol (n=5). Sessions with meeting summaries were also characterized by a greater number of therapist interventions and included greater use of validation, complex reflections, and proactive problem-solving techniques.
Conclusions: To the best of our knowledge, this is the first study to assess a large, diverse data set of real-world treatment practices. Our findings provide evidence that fidelity with the core components of empirically designed psychological interventions is a challenge in real-world settings. The results of this study can inform the development of machine learning and AI algorithms and offer nuanced, timely feedback to providers, thereby improving the delivery of evidence-based practices and quality of mental health care services and facilitating better clinical outcomes in real-world settings.
Mental health is a major global health concern. Mental illnesses will affect close to half of the world’s population at some point in their lives. In addition to the personal toll of these illnesses, they also cost the global economy US $1 trillion per year in lost productivity alone [ ]. Cognitive behavioral interventions have a robust evidence base for their efficacy and effectiveness in treating a variety of mental health issues. However, an ongoing challenge in the field has been how to “bridge the gap” between the laboratory or classroom and the clinic, with the data suggesting that the implementation of these methods in real-world settings is often much more challenging. One hypothesis for this is that therapists in real-world settings may not be as adherent to the protocols on which these treatments have been tested and the evidence base established.
Cognitive behavioral treatments are designed to help individuals identify and alter maladaptive cognitions, emotions, interpersonal relationships, and problematic behaviors to reduce symptoms and lead more productive and satisfying lives. At this point, numerous versions of behavioral interventions have been manualized, studied in comparative trials with robust effects versus waitlist and placebo conditions as well as medications and other forms of psychotherapy, and disseminated worldwide. The need for focused behavioral treatments has become stronger following the COVID-19 pandemic, given the global increase in mental health disorders [, ] and the resulting demand for mental health services, coupled with a dramatic shortage of clinicians [ ]. Given this need-service gap, it is imperative that behavioral treatments be delivered in accordance with the empirically based guidelines under which they were established, in order to maximize the potential of replicating the outcomes of the clinical trials with individuals “in the real world” with mental health concerns.
Therapist Adherence to Treatments as Designed
“Therapist drift,” or the tendency of clinicians to adhere only partially to established empirically supported practices, has been documented in the literature with respect to many treatment models [, ]. Therapists’ attitudes toward evidence-based practices (EBPs), their licensing status, and organization characteristics combinatorically affect their use of recommended strategies [ , ]. Further, clinical trainings or specialty workshops for EBPs are often not enough to facilitate change of practice and improved client outcomes unless some ongoing follow-up sessions are offered for supervision [ ].
Complicating matters further, therapy is often like a “black box” in which little is done to monitor or encourage clinicians to adhere to core active ingredients of empirically supported treatments. As such, it has been proposed that clinical research should utilize observations that are less subject to bias [, ]. In addition, data from real-world treatments indicate that few empirically based therapeutic techniques are in effect, even when clinicians report that they offered evidence-supported behavioral treatments [ ].
Session Summary: An Example of Therapist Adherence
Empirically supported protocols are often built on a basic set of theoretical ideas. Although certain aspects may differ among protocols, some essential concepts remain common to them all. Dryden [ ] argues that it is essential for the client to leave the session with what they view as important takeaways. As the session comes to a close, it is best practice for the therapist to invite the client to summarize the session rather than the therapist summarizing what transpired for the client during the session [ ]. Indeed, one of the key elements in almost every manualized treatment is encouraging the client to compose a summary at the end of each therapy session [ - ]. This form of feedback is described as a means of helping clients review their own understanding of the session and the rationale for the interventions provided [ ]. Further, a session summary provides clients an opportunity for feedback, which empowers clients to conceptualize their needs, assess their progress toward their goals, and make their own decisions. It also allows the therapist to ensure that key components of the sessions have been understood and highlighted. Therefore, a session summary could be viewed as a common, transdiagnostic evidence-based component that is one of the key ingredients related to the effectiveness of the session.
The importance of session summary has not been overlooked by researchers, and reports on treatment studies often mention this strategy [, ]. Perlich and Meinel [ ] even developed a tool for collaborative session summary, in which the client and the therapist review the session and their takeaways. However, when asked about their own adaptations of EBPs, 32% of therapists reported removing components of the intervention, with the session summary being the most frequently omitted part from the therapy process [ ]. Similarly, one of the most common challenges of community mental health therapists learning cognitive behavioral therapy (CBT) was their limited attempts to solicit the client’s feedback [ ], and in text-based CBT, very few therapists summarized the session [ ].
The session summary is just one example of possible therapist drift. However, it is important for reinforcing the therapeutic process and strengthening the learnings obtained during treatment, it can be delivered quite briefly, and it has high face validity across interventions. It may therefore serve as a proxy for how much the clinician is adhering to the full range of tasks that are essential for evidence-based therapies (EBTs) to be effective. Accordingly, this study explored how common session summaries are in real-world behavioral treatments.
Settings and Interventions
This study is based on the retrospective analysis of fully anonymized data from behavioral health treatments provided in 37 behavioral health care programs across the United States. All client participants received either individual, group, or couples therapy in either an outpatient or intensive outpatient program. Clients sought treatment for a range of mental health concerns, and therapists were free to provide the intervention they believed was most suitable for the client’s presenting problem and characteristics. Therapists were either psychologists, social workers, or licensed counselors. The sample comprised 17,607 treatment sessions administered by 322 therapists to 3519 clients.
The Eleos Health Platform
All sessions were processed via an artificial intelligence (AI) therapy-specific platform (Eleos Health). This platform captures the treatment conversations, provides a verbatim session transcript, and summarizes intervention insights to inform treatment planning and clinical decision-making. The platform collects key metrics from treatment sessions and integrates them with standardized evidence-based self-report measures, leveraging insights developed through machine learning (ML) and natural language processing (NLP) analysis of large treatment data sets [ ]. Eleos also uses AI methods to increase adherence to clinical standards and drive operational efficiency.
The platform is Health Insurance Portability and Accountability Act–compliant, and all participants consented to have their sessions processed through it. This study was approved by an external institutional review board, Sterling IRB (9545).
To fully capture and make sense of the speech data, we developed an algorithm to identify the specific interventions carried out in behavioral treatment sessions and, consequently, to determine whether a session summary was recorded. First, the sessions were transcribed using an automatic speech recognition (ASR) system as well as a domain-based text-cleaning algorithm. Second, since the session transcripts are unstructured data, we developed a treatment-oriented speaker diarization ML model that labels the speakers in a treatment session according to their utterances. A team of trained graduate-level clinicians tagged over 2500 therapy conversations and labeled the speakers as either “patient” or “therapist.” These data were subsequently used to train a model that analyzed the full transcribed session and assigned a speaker label to each participant. Third, we applied term frequency–inverse document frequency (TF-IDF), a commonly used feature generation method, to generate the features used to identify whether the speaker is the therapist or a client. As a classification algorithm, we used a logistic regression model with a binary cross-entropy loss and trained the model using stochastic gradient descent. On a session level, our in-house solution demonstrated 98% accuracy in differentiating between speakers in therapy sessions.
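The speaker-labeling step described above (TF-IDF features fed to a logistic regression trained with stochastic gradient descent on a binary cross-entropy loss) can be illustrated with a minimal, dependency-free sketch. This is not the platform's implementation: the toy utterances, the whitespace tokenizer, the smoothed IDF formula, and all function names and hyperparameters (`epochs`, `lr`) are assumptions made only for illustration.

```python
import math
import random
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every token seen in docs."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}

def tfidf(doc, idf):
    """Sparse, L2-normalized TF-IDF vector; unseen tokens are ignored."""
    counts = Counter(tok for tok in tokenize(doc) if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(docs, labels, epochs=200, lr=0.5, seed=0):
    """Logistic regression fitted by per-example SGD on binary cross-entropy."""
    idf = fit_idf(docs)
    vectors = [tfidf(doc, idf) for doc in docs]
    weights, bias = defaultdict(float), 0.0
    rng = random.Random(seed)
    order = list(range(len(docs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = vectors[i], labels[i]
            p = sigmoid(bias + sum(weights[t] * v for t, v in x.items()))
            grad = p - y  # derivative of the cross-entropy loss w.r.t. the logit
            for t, v in x.items():
                weights[t] -= lr * grad * v
            bias -= lr * grad
    return idf, weights, bias

def predict_proba(text, model):
    """Estimated probability that the utterance was spoken by the therapist."""
    idf, weights, bias = model
    x = tfidf(text, idf)
    return sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in x.items()))

# Hypothetical labeled utterances: 1 = therapist, 0 = client.
DOCS = [
    "how did that make you feel",
    "what would you like to work on today",
    "let's review your homework from last week",
    "what did you learn here today",
    "i felt anxious at work all week",
    "my sister and i argued again",
    "i could not sleep last night",
    "i did not do my homework",
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

model = train(DOCS, LABELS)
print(predict_proba("what would you like to review today", model))
print(predict_proba("i felt anxious last night", model))
```

In this sketch each utterance is scored independently; a session-level label, as reported in the text, would require aggregating utterance scores per speaker, which is left out here.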
Eleos’s NLP-based engine extracted potential session summaries from therapists’ utterances, identifying the frequencies of lexical terms that were said during the latter 20% of the session, and retrieving phrases such as the following: “[I] just want to review what we talked about”; “So what did you learn here today?”; “If it was something that you were going to take away from today's session, what would it be?”; “What's your take home message from today?”; “Alright, let's do our summary for the day”; “kind of [your] two main takeaways”; and so on. For quality assurance, 2 psychologists (SSS and TK) reviewed 682 sessions that included language associated with a session summary and indicated whether the algorithmically identified text did in fact reflect a meeting review prompt. The data analysis approach used in this study is outlined in the accompanying figure. Finally, we compared sessions with and those without a summary on the following variables: most commonly discussed topics, therapist-to-client listening ratio, the most commonly used intervention techniques, and content of the progress note that the therapist had generated for this session in the program’s electronic health record (EHR). Further, to assign the sentiment expressed during these specific sessions, we applied Valence Aware Dictionary and sEntiment Reasoner (VADER) [ ] to the therapist’s and patient’s texts, independently. VADER is a lexicon and rule-based sentiment analysis tool that uses a sentiment lexicon and a list of lexical features (eg, words) that are labeled in accordance with their semantic orientation as either positive or negative [ ]. VADER not only classifies the data as either positive or negative, but also provides a score to indicate the strength of the sentiment detected [ ].
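The search for summary prompts in the latter 20% of a session can be sketched as a simple pattern match over therapist utterances. The sketch below is illustrative only: the regular expressions are hand-written from the example phrases quoted above, the "latter 20%" is assumed to be measured in utterance count (the text does not say whether it is by time or by utterance), and the function name and data layout are hypothetical.

```python
import math
import re

# Assumption: patterns derived from the example phrases quoted in the text.
SUMMARY_PROMPT_PATTERNS = [
    r"\breview what we talked about\b",
    r"\bwhat did you learn\b",
    r"\btake away from today",
    r"\btake[- ]home message\b",
    r"\bour summary for the day\b",
    r"\bmain takeaways?\b",
]
PROMPT_RE = re.compile("|".join(SUMMARY_PROMPT_PATTERNS), re.IGNORECASE)

def find_summary_prompts(utterances, tail_fraction=0.2):
    """Return (index, text) for therapist utterances in the final
    `tail_fraction` of the session that match a summary-prompt pattern.

    `utterances` is a list of (speaker, text) tuples in session order.
    """
    n = len(utterances)
    start = n - math.ceil(n * tail_fraction)
    return [
        (i, text)
        for i, (speaker, text) in enumerate(utterances)
        if i >= start and speaker == "therapist" and PROMPT_RE.search(text)
    ]

# Hypothetical 10-utterance session.
session = [
    ("therapist", "Welcome back. How was your week?"),
    ("therapist", "What did you learn last week?"),
    ("client", "That I avoid conflict at work."),
    ("therapist", "Tell me more about that."),
    ("client", "I stay quiet in meetings."),
    ("therapist", "What might you try instead?"),
    ("client", "Maybe speak up once per meeting."),
    ("therapist", "That sounds like a good experiment."),
    ("therapist", "So what did you learn here today?"),
    ("client", "My main takeaways are to speak up more."),
]
candidates = find_summary_prompts(session)
print(candidates)
```

Matches found this way are only candidates; in the study, flagged text was reviewed by 2 psychologists before being counted as a session summary prompt.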
Provision of a Session Summary
Our analysis found that only 54 of the 17,607 (0.30%) behavioral treatment sessions included a session summary. Of these 54 treatment meetings, session summaries most commonly addressed interpersonal relationships with family and friends (n=27), issues related to work (n=20), the word “change” (n=6), and alcohol (n=5).
Characteristics of Sessions Including Summaries
Data were further analyzed to review descriptive differences between therapy sessions with and those without a summary. Sessions that included a prompt to help the client summarize the meeting had a higher therapist listening ratio (49% vs 33%), indicating that therapists were more verbally active throughout the meetings that included feedback. The number of therapeutic interventions detected was greater in sessions that included a session summary; on average, therapists used 17% more interventions in meetings with a summary. Sessions with meeting summaries also included greater therapist use of validation, complex reflections, and proactive problem-solving techniques. Moreover, therapists who prompted their client to summarize their session were 83.3% more likely to assign treatment homework and report it on their EHR progress note. Therapists who encouraged a summary also had a 12% greater likelihood of completing their progress note within 48 hours of the session’s date: 69% of those who had asked for a summary also completed their progress note within this time frame, compared to 61% of therapists who did not prompt a summary. The first table below provides an overview of the differences between sessions including and not including summaries.
Applying the VADER algorithm in sessions that included summaries revealed that the clients expressed, on average, slightly more positive and negative emotions (0.6% and 4%, respectively) than those with no summary. However, the therapists tended to express less emotion in the sessions including a summary, expressing 9% fewer positive statements and 7% fewer negative statements. The second table below provides an overview of the differences between sessions including and those not including summaries around statements’ sentiments.
|Sessions|Listening ratio, %|Progress note completion rate, %|Type of interventions detected|
|Without a summary|33|61|1.04|
|With a summary|49|69|1.26|

|Without a summary|7.70|4.50|8.90|4.20|
|With a summary|7.40|4.50|8.20|3.90|
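The relative differences quoted in the Results (for example, a 12% greater likelihood of timely progress note completion and 17% more interventions) depend on which value is used as the base, which the text does not state. The sketch below computes the relative difference under both conventions from the tabulated values; the function name is hypothetical, and the assumption that the "Type of interventions detected" column is the basis of the 17% figure is ours.

```python
def relative_difference(with_summary, without_summary, base="with"):
    """Relative difference between the two groups, expressed as a fraction
    of either the with-summary value or the without-summary value."""
    denominator = with_summary if base == "with" else without_summary
    return (with_summary - without_summary) / denominator

# Values taken from the first table: (with a summary, without a summary).
reported = {
    "progress note completion rate, %": (69, 61),
    "type of interventions detected": (1.26, 1.04),
}

for name, (w, wo) in reported.items():
    pct_with_base = round(100 * relative_difference(w, wo, "with"), 1)
    pct_without_base = round(100 * relative_difference(w, wo, "without"), 1)
    print(f"{name}: {pct_with_base}% (base = with summary), "
          f"{pct_without_base}% (base = without summary)")
```

Under these assumptions, the percentages quoted in the text lie closer to the values obtained with the with-summary group as the base (about 11.6% and 17.5%) than to those obtained with the without-summary group as the base (about 13.1% and 21.2%).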
While the evidence base is strong and robust for behavioral interventions, their efficacy is tied to maintaining a structure and including certain key components in each session. Therapist drift from the key active ingredients of validated treatment protocols could compromise the efficacy and effectiveness of the treatments, thus limiting the impact of treatment on the individual. This study examined the practice guidelines versus practice in real-world behavioral health care settings as they pertain to a key component found in most behavioral interventions: encouraging clients to review and summarize their treatment session [ ]. Session summaries are important because they allow the client an opportunity to reflect back on the treatment meeting, their developing understanding of the maladaptive processes underlying their symptoms, as well as some effective coping strategies they could employ. They also allow the therapist to ensure that key components of the sessions have been understood and highlighted. In controlled and case-series studies reported in the literature, the technique of requesting feedback is stated explicitly [ ]. This study found that very few therapists prompt their clients to summarize the session. Our findings suggest that providers who encouraged their clients to reflect on their treatment demonstrated a more active therapy style: their sessions were characterized by more back-and-forth exchanges between the therapist and the client, they provided more interventions during the meeting, and they even tended to complete their progress note faster. These results suggest that therapist adherence to at least one of the key components of most empirically supported behavioral treatments was absent in most of the real-world sessions we reviewed.
Comparison With Prior Work
The findings of this study indicate that in contrast to guidance in treatment protocols, therapists delivering behavioral treatments in real-world settings rarely encourage their clients to reflect on the session during their meeting. To the best of our knowledge, this study is the first to evaluate a large and diverse data set of actual therapy sessions. These findings extend the results of previous studies that have exclusively relied on practitioner self-report and provide insight on how therapists practice in real-world treatment settings. Therapists may overestimate their adherence to practice guidelines, as 32% reported not providing all parts of treatments [ ], while this study suggests that adherence rates are much lower. Session summary, or feedback, can be perceived as a method for prompting clients to form implementation intentions, thereby likely facilitating greater treatment impact; however, prior research has found that therapists do not often explicitly discuss with their clients how to plan actions as a result of the treatment session [ ]. Further, higher-caseload therapists reported feeling that learning about new EBTs would be time-consuming, which consequently could serve as a barrier to implementing these techniques [ ]. In light of this research, it may not be surprising that therapists do not adhere to EBT recommendations despite realizing their potential benefits to service users. Of note, it has been proposed in the literature that treatment protocols are difficult to administer in the field as originally designed in controlled studies, and that “flexibility within fidelity” should be practiced in order to maximize the effects of these programs [ ]. Hence, a systematic understanding of the context affecting variations from prescribed practice and omissions of specific techniques is warranted.
This study utilized data from 17,607 sessions taking place in behavioral health clinics across the United States. The data are likely more representative of the therapist behaviors occurring in real-world settings than are the findings of controlled studies. Nonetheless, this study has limitations. The anonymized database did not include demographic and clinical information of the clients and therapists, which could have enriched our analysis. Future studies should also collect explicit data on the treatment that was provided and how it maps onto the client’s treatment plan. Further, the low number of sessions with summary statements limited our ability to utilize the sentiment and content analyses. Additionally, the analysis did not include outcome data such as symptom reduction or client satisfaction, which are important to assess in the context of the treatment process. From a theoretical and practical standpoint, interviewing therapists about their considerations in using these strategies will help better define the underlying processes affecting behavioral treatment implementation.
Given the importance of following treatment protocols as initially intended, there is much potential in automating timely feedback for therapists. This study is the first to our knowledge that provides real-time, observational data on clinical practice in real-world settings. As such, it provides a new perspective on how clinicians provide therapy that can enrich the data captured by therapist self-reports. Empirically supported ML and AI algorithms can offer clinicians, trainers, supervisors, and stakeholders nuanced observations on treatment adherence, thereby improving the quality of implementation, dissemination, and ultimately, effectiveness of mental health treatments.
Data Sharing Statement
The data that support the findings of this study are not publicly available owing to privacy and ethical restrictions.
Conflicts of Interest
SSS, SJ, GP, and TK are employees of Eleos Health Inc, which created the platform providing the data for this study. SAR is an unpaid advisor to Eleos Health Inc.
References
- Trautmann S, Rehm J, Wittchen H. The economic costs of mental disorders: Do our societies react appropriately to the burden of mental disorders? EMBO Rep 2016 Sep;17(9):1245-1249.
- Chisholm D, Sweeny K, Sheehan P, Rasmussen B, Smit F, Cuijpers P, et al. Scaling-up treatment of depression and anxiety: a global return on investment analysis. Lancet Psychiatry 2016 May;3(5):415-424.
- Santomauro D, Mantilla Herrera AM, Shadid J, Zheng P, Ashbaugh C, Pigott D, et al. Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic. The Lancet 2021 Nov;398(10312):1700-1712.
- OECD. Tackling the mental health impact of the COVID-19 crisis: An integrated, whole-of-society response. OECD Policy Responses to Coronavirus (COVID-19). OECD Publishing, Paris 2021.
- Waller G. Evidence-based treatment and therapist drift. Behav Res Ther 2009 Feb;47(2):119-127.
- Waller G, Turner H. Therapist drift redux: Why well-meaning clinicians fail to deliver evidence-based therapy, and how to get back on track. Behav Res Ther 2016 Feb;77:129-137.
- Creed T, Crane M, Calloway A, Olino T, Kendall P, Stirman SW. Changes in Community Clinicians' Attitudes and Competence following a Transdiagnostic Cognitive Behavioral Therapy Training. Implement Res Pract 2021 Jan 01;2:26334895211030220.
- Lau A, Lind T, Motamedi M, Lui J, Kuckertz M, Innes-Gomberg D, et al. Prospective predictors of sustainment of multiple EBPs in a system-driven implementation context: Examining sustained delivery based on administrative claims. Implementation Research and Practice 2021 Dec 23;2:263348952110578.
- Beidas R, Kendall P. Training Therapists in Evidence-Based Practice: A Critical Review of Studies From a Systems-Contextual Perspective. Clin Psychol (New York) 2010 Mar;17(1):1-30.
- Bond G, Drake R. Assessing the Fidelity of Evidence-Based Practices: History and Current Status of a Standardized Measurement Methodology. Adm Policy Ment Health 2020 Nov;47(6):874-884.
- Lewis C, Marti C, Marriott B, Scott K, Ayer D. Patterns of practice in community mental health treatment of adult depression. Psychother Res 2019 Jan;29(1):70-77.
- Waller G, Stringer H, Meyer C. What cognitive behavioral techniques do therapists report using when delivering cognitive behavioral therapy for the eating disorders? J Consult Clin Psychol 2012 Feb;80(1):171-175.
- Kendall P, Frank H. Implementing evidence-based treatment protocols: Flexibility within fidelity. Clin Psychol (New York) 2018 Dec;25(4).
- Dryden W. Single‐Session One‐At‐A‐Time Therapy: A Personal Approach. Aust N Z J Fam Ther 2020 Oct 08;41(3):283-301.
- Beck JS. Cognitive Behavior Therapy (Third Edition): Basics and Beyond. New York, NY: Guilford Press; 2020.
- Dobson DJ, Dobson KS. In-Session Structure and Collaborative Empiricism. Cogn Behav Pract 2013 Nov;20(4):410-418.
- Rego SA. Treatment Plans and Interventions for Obsessive-Compulsive Disorder. New York, NY: Guilford Press; 2016.
- Ribeiro da Silva D, Rijo D, Castilho P, Gilbert P. The Efficacy of a Compassion-Focused Therapy–Based Intervention in Reducing Psychopathic Traits and Disruptive Behavior: A Clinical Case Study With a Juvenile Detainee. Clinical Case Studies 2019 May 15;18(5):323-343.
- Sadeh-Sharvit S, Lock J. Parents with Eating Disorders: An Intervention Guide. New York, NY: Routledge; 2018.
- Waltman S, Hall B, McFarr L, Beck A, Creed T. In-Session Stuck Points and Pitfalls of Community Clinicians Learning CBT: Qualitative Investigation. Cogn Behav Pract 2017 May;24(2):256-267.
- Ginsburg G, Pella J, Pikulski P, Tein J, Drake K. School-Based Treatment for Anxiety Research Study (STARS): a Randomized Controlled Effectiveness Trial. J Abnorm Child Psychol 2020 Mar;48(3):407-417.
- Kavanagh D, Sitharthan G, Young R, Sitharthan T, Saunders J, Shockley N, et al. Addition of cue exposure to cognitive-behaviour therapy for alcohol misuse: a randomized trial with dysphoric drinkers. Addiction 2006 Aug;101(8):1106-1116.
- Perlich A, Meinel C. Automatic Treatment Session Summaries in Psychotherapy – A Step towards Therapist-Patient Cooperation. Procedia Computer Science 2015;63:276-283.
- Dyson M, Chlebowski C, Brookman-Frazee L. Therapists' Adaptations to an Intervention to Reduce Challenging Behaviors in Children with Autism Spectrum Disorder in Publicly Funded Mental Health Services. J Autism Dev Disord 2019 Mar;49(3):924-934.
- Cummins R, Ewbank M, Martin A, Tablan V, Catarino A, Blackwell A. TIM: A Tool for Gaining Insights into Psychotherapy. 2019 Presented at: WWW '19: The Web Conference; May 13-17, 2019; San Francisco, CA.
- Flaherty H. Teaching Note—Using Technology to Enhance Experiential Learning Through Simulated Role Plays. J Soc Work Educ 2022 May 05:1-7.
- Sadeh-Sharvit S, Hollon SD. Leveraging the Power of Nondisruptive Technologies to Optimize Mental Health Treatment: Case Study. JMIR Ment Health 2020 Nov 26;7(11):e20646.
- Hutto C, Gilbert E. VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. 2014 Presented at: Eighth International AAAI Conference on Weblogs and Social Media; June 1-4, 2014; Ann Arbor, MI URL: https://ojs.aaai.org/index.php/ICWSM/article/view/14550
- Elbagir S, Yang J. Sentiment Analysis on Twitter with Python’s Natural Language Toolkit and VADER Sentiment Analyzer. 2020 Presented at: International MultiConference of Engineers and Computer Scientists (IMECS 2019); March 13-15, 2019; Hong Kong URL: https://doi.org/10.1142/9789811215094_0005
- Speers A, Bhullar N, Cosh S, Wootton B. Correlates of therapist drift in psychological practice: A systematic review of therapist characteristics. Clin Psychol Rev 2022 Apr;93:102132.
- Janse P, De Jong K, Van Dijk MK, Hutschemaekers G, Verbraak MJPM. Improving the efficiency of cognitive-behavioural therapy by using formal client feedback. Psychother Res 2017 Sep;27(5):525-538.
- Blonde L, Khunti K, Harris S, Meizinger C, Skolnik N. Interpretation and Impact of Real-World Clinical Data for the Practicing Clinician. Adv Ther 2018 Nov;35(11):1763-1774.
- Gonzalez Salas Duhne P, Horan A, Ross C, Webb T, Hardy G. Assessing and promoting the use of implementation intentions in clinical practice. Soc Sci Med 2020 Nov;265:113490.
Abbreviations
AI: artificial intelligence
CBT: cognitive behavioral therapy
EBP: evidence-based practice
EBT: evidence-based therapy
EHR: electronic health record
ML: machine learning
NLP: natural language processing
VADER: Valence Aware Dictionary and sEntiment Reasoner
Edited by A Mavragani; submitted 25.05.22; peer-reviewed by Z Dai, P Kaur, H Zhang; comments to author 29.06.22; revised version received 20.07.22; accepted 20.07.22; published 16.08.22
Copyright
©Shiri Sadeh-Sharvit, Simon A Rego, Samuel Jefroykin, Gal Peretz, Tomer Kupershmidt. Originally published in JMIR Formative Research (https://formative.jmir.org), 16.08.2022.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.