This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on http://formative.jmir.org, as well as this copyright and license information must be included.
Over 100 million Americans lack affordable access to behavioral health care. Among these, military veterans are an especially vulnerable population. Military veterans require unique behavioral health services that can address military experiences and challenges transitioning to the civilian sector. Real-world programs to help veterans successfully transition to civilian life must build a sense of community, have the ability to scale, and be able to reach the many veterans who cannot or will not access care. Digitally based behavioral health initiatives have emerged within the past few years to improve this access to care. Our novel behavioral health intervention teaches mindfulness-based cognitive behavioral therapy and narrative therapy using peer support groups as guides, with human-facilitated asynchronous online discussions. Our study applies natural language processing (NLP) analytics to assess effectiveness of our online intervention in order to test whether NLP may provide insights and detect nuances of personal change and growth that are not currently captured by subjective symptom measures.
This paper aims to study the value of NLP analytics in assessing progress and outcomes among combat veterans and military sexual assault survivors participating in novel online interventions for posttraumatic growth.
IBM Watson and Linguistic Inquiry and Word Count tools were applied to the narrative writings of combat veterans and survivors of military sexual trauma who participated in novel online peer-supported group therapies for posttraumatic growth. Participants watched videos, practiced skills such as mindfulness meditation, told their stories through narrative writing, and participated in asynchronous, facilitated online discussions with peers. The writings, including online postings, by the 16 participants who completed the program were analyzed after completion of the program.
Our results suggest that NLP can provide valuable insights on shifts in personality traits, personal values, needs, and emotional tone in an evaluation of our novel online behavioral health interventions. Emotional tone analysis demonstrated significant decreases in fear and anxiety, sadness, and disgust, as well as increases in joy. Significant effects were found for personal values and needs, such as needing or desiring closeness and helping others, and for personality traits of openness, conscientiousness, extroversion, agreeableness, and neuroticism (ie, emotional range). Participants also demonstrated increases in authenticity and clout (confidence) of expression. NLP results were generally supported by qualitative observations and analysis, structured data, and course feedback.
The aggregate of results in our study suggests that our behavioral health intervention was effective and that NLP can provide valuable insights on shifts in personality traits, personal values, and needs, as well as measure changes in emotional tone. NLP’s sensitivity to changes in emotional tone, values, and personality strengths suggests that it may serve as a leading indicator of treatment progress.
The lifetime risk of acquiring a mental illness diagnosis is 50%, yet over 100 million Americans lack affordable access to effective behavioral health care [
With or without a mental health diagnosis, most returning veterans struggle with the transition to civilian life, suffering painful and sometimes debilitating symptoms even if they do not meet full criteria for a diagnosis [
Real-world programs to help veterans successfully transition to civilian life must reach the many veterans who cannot or will not access care. These programs must have the ability to scale, as there is a shortage of services available to veterans, and they must address the loneliness that veterans experience while helping them reestablish their sense of mission, purpose, and connectedness. Therefore, successful interventions must also build a sense of community.
In an effort to expand access to care, digitally based behavioral health initiatives have emerged within the past few years. Most mental illnesses, including anxiety, depression, PTSD, and addiction, are effectively treated by cognitive behavioral therapy (CBT) [
Historically, the standard of assessing efficacy of a therapeutic intervention has been through administration of subjective symptom measures at various time points (usually pretreatment and posttreatment) [
The purpose of our study was to evaluate the feasibility and utility of NLP in evaluating change in a small pilot study of 2 veteran populations who completed 2 novel online behavioral health interventions.
A total of 23 participants were recruited for 2 studies of veterans, the Next Mission (NM) program and Women Warriors (WW) program. A total of 13 participants enrolled in the 14-week NM program, and 10 enrolled in the 8-week WW program. Inclusion criteria for the study were: (1) must be a military veteran, (2) must be aged 18 or older, (3) must speak English, (4) must be able to access the internet regularly, and (5) for the WW program, must be a woman. In the NM program, 11 of the 13 (85%) participants were men, while the WW program consisted of all (10/10, 100%) female participants. In total, 16 participants completed the courses, 9 in the NM program and 7 in the WW program.
Participants were recruited via Facebook and LinkedIn ads and could sign up on a mobile, laptop, or desktop device of their choice. Recruitment was conducted over several months until enough participants were able to form cohorts for each course. The recruitment messages attempted to account for potential stigma by focusing on building resiliency skills and promoting posttraumatic growth rather than treating mental illness. Participants were informed that they would be helping others in their group while also getting help for themselves.
Participants were also incentivized by the opportunity to earn University of California college credits for their work, although only 3 of 16 (19%) participants took advantage of it. Participants understood that the program was fully compliant with the Health Insurance Portability and Accountability Act (HIPAA) and that they could choose to participate anonymously. However, they also had the option of revealing their identity to the group at any time during the course. Both programs, which we also refer to as courses, were entitled “Stress, Resiliency and Post Traumatic Growth.” The groups were generationally diverse, which allowed older veterans to connect with younger veterans.
In each program, participants watched videos teaching principles of CBT, narrative therapy, behavioral activation, and mindfulness meditation. They submitted written homework, including journal entries and thought and mood logs, and participated in asynchronous discussions, all within a HIPAA-secure environment. Each participant completed approximately 90 minutes of online class time and one hour per week of homework. The asynchronous discussions were facilitated and monitored by a licensed, doctoral-level therapist. Of note, facilitators for the WW group were female. The content for the programs was created in various commercially available proprietary applications by faculty members in the Department of Psychiatry of a US University Medical Center and then assembled into the NM and WW programs and delivered on a commercially available proprietary platform.
Linguistic Inquiry and Word Count (LIWC) is software designed to analyze word use within written text. It calculates the percentage of usage for sets of words, arranging them in 80 linguistic categories and generating output statistics for each of the categories [
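The category-percentage idea described above can be illustrated with a minimal sketch. The word lists, category names, and the function `category_percentages` below are invented for illustration only; they are not the LIWC dictionary or the LIWC software, which uses its own proprietary categories and matching rules.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: category name -> set of words.
# These lists are assumptions for illustration, NOT the LIWC dictionary.
CATEGORIES = {
    "positive_emotion": {"calm", "joy", "laughter", "better", "stronger"},
    "social": {"people", "group", "support", "network"},
}


def category_percentages(text, categories=CATEGORIES):
    """Return, per category, the percentage of words in `text` that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for word in words:
        for name, vocab in categories.items():
            if word in vocab:
                counts[name] += 1
    return {name: 100.0 * counts[name] / total for name in categories}


sample = "I feel calm and I have support from people in my group"
result = category_percentages(sample)
print(result)
```

In this toy sample, 3 of the 12 words fall in the hypothetical "social" category, so that category scores 25%. Comparing such percentages between early and late writings is the general kind of pre-post contrast the study relies on.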
IBM Watson is a computer system that uses artificial intelligence to interpret unstructured data within natural language. IBM Watson’s Personality Insights is programmed to analyze natural language input and provide outputs of personality characteristics based on 3 models [
Definitions of personal needs and values outputs.a
Characteristic | Description
Needs |
Excitement | Emphasizes importance of getting out and living life, oriented toward having fun
Harmony | Appreciation for other people, their viewpoints, or feelings
Curiosity | Seeking discovery and desire for personal growth
Ideal | Wanting perfection and seeking sense of community
Closeness | Valuing connectedness with others
Self-expression | Emphasizes the importance of expressing oneself and asserting individual identities
Liberty | Have a desire for fashion and new things, as well as the need for escape and freedom
Love | Valuing social contact, either one-to-one or one-to-many
Practicality | Having a desire to accomplish things, a desire for skill and efficiency, including physical expression and experience
Stability | Valuing sensibility, equivalence, and balance
Challenge | Having desire to succeed and take on challenges
Structure | Exhibit a grounded trait and a desire to hold things together. They need things to be well organized and under control
Values |
Helping others | Showing concern for the welfare and interests of others
Tradition | Emphasizes self-restriction, order, and resistance to change
Life pleasure | Seek pleasure and sensuous gratification for themselves
Achievement | Seek personal success for themselves
Excitement | Emphasize independent action, thought, and feeling, as well as a readiness for new experiences
aTable content was taken and aggregated from IBM Watson Personality Insights [
IBM Watson Tone Analyzer is an artificial intelligence–enabled text analysis tool that infers emotional tone from written text. Tone Analyzer is based on psycholinguistics theory and examines how day-to-day word usage correlates to manifest emotions [
Participants reported feedback from the course and filled out structured subjective symptom questionnaires. Structured data were not analyzed quantitatively in comparison with NLP due to variations of sample size and low completion rates. As such, quantitative changes in structured data were observed qualitatively. Facilitators also reported subjective qualitative observations of participant progress. Structured measures included the Positive States of Mind Scale (PSOM) [
Participants’ writing samples and online posts were analyzed using the LIWC [
In total, 16 of the 23 (70%) participants completed the courses and 15 were able to have their unstructured data analyzed. Results of all rMANOVA analyses on NLP outputs are summarized in
Results of natural language analysis for personality traits, values, emotional tone, and Linguistic Inquiry and Word Count for treatment groups of Next Mission combat veterans and Women Warriors military sexual assault survivors.a
Characteristic | NMb combat veterans (n=9) | | | | WWc military sexual assault survivors (n=6) | | |
 | F (df) | Cohen d | P value | Effect size (partial η2) | F (df) | Cohen d | P value | Effect size (partial η2)
Personality traits |
Openness | 6.372 (1,8) | 1.34 | .04 | 0.443 | 9.740 (1,5) | 1.47 | .03 | 0.661
Agreeableness | 10.305 (1,8) | 1.77 | .01 | 0.777 | 12.211 (1,5) | 1.68 | .02 | 0.709
Extroversion | 12.446 (1,8) | 1.52 | .001 | 0.609 | 7.105 (1,5) | –1.58 | .045 | 0.587
Conscientiousness | 27.923 (1,8) | –2.83 | .001 | 0.563 | 1.929 (1,5) | –0.53 | .22 | 0.278
Emotional range | 15.473 (1,8) | –1.74 | .004 | 0.659 | 1.1014 (1,5) | 0.58 | .36 | 0.169
Personal values and needs |
Curiosity | 3.227 (1,8) | 0.84 | .11 | 0.287 | 5.971 (1,5) | 1.44 | .06 | 0.544
Harmony | 7.787 (1,8) | 1.42 | .02 | 0.493 | N/Af | N/A | N/A | N/A
Structure | 1.865 (1,8) | –0.44 | .21 | 0.189 | 2.784 (1,5) | 0.87 | .16 | 0.358
Closeness | 17.672 (1,8) | 2.23 | .003 | 0.688 | 10.006 (1,5) | 1.23 | .03 | 0.667
Stability | 6.518 (1,8) | 1.49 | .03 | 0.449 | N/A | N/A | N/A | N/A
Helping others | 21.715 (1,8) | 2.06 | .002 | 0.731 | 3.077 (1,5) | 0.35 | .14 | 0.381
Excitement | 23.635 (1,8) | 2.34 | .001 | 0.747 | 0.391 (1,5) | 0.15 | .56 | 0.072
Life pleasure | 18.926 (1,8) | 1.92 | .002 | 0.703 | 8.511 (1,5) | 1.30 | .03 | 0.630
Tradition | 9.005 (1,8) | –1.54 | .02 | 0.530 | 1.238 (1,5) | 0.12 | .32 | 0.198
Achievement | 9.414 (1,8) | 1.01 | .02 | 0.541 | 7.294 (1,5) | 1.28 | .04 | 0.593
Love | N/A | N/A | N/A | N/A | 13.474 (1,5) | 1.58 | .01 | 0.729
Ideal | N/A | N/A | N/A | N/A | 3.173 (1,5) | 1.03 | .14 | 0.388
Emotional tone |
Sadness | 10.852 (1,8) | –1.01 | .01 | 0.576 | 1.164 (1,6) | –0.20 | .33 | 0.283
Disgust | 7.660 (1,8) | –1.25 | .02 | 0.489 | 1.413 (1,5) | –0.74 | .29 | 0.220
Joy | 11.017 (1,8) | 1.81 | .01 | 0.579 | 1.977 (1,5) | 0.84 | .22 | 0.283
Fear | 0.114 (1,8) | –0.17 | .74 | 0.014 | 4.365 (1,5) | –0.80 | .09 | 0.466
Anger | 0.542 (1,8) | –0.32 | .48 | 0.063 | 0.518 (1,5) | –0.40 | .50 | 0.094
LIWC |
Analytical thinking | 0.229 (1,8) | 0.20 | .65 | 0.028 | 0.094 (1,5) | –0.12 | .77 | 0.018
Authenticity | 7.326 (1,8) | 0.92 | .03 | 0.478 | 5.457 (1,5) | 0.37 | .07 | 0.522
Clout | 8.651 (1,8) | 0.90 | .02 | 0.520 | 7.920 (1,5) | 1.47 | .04 | 0.613
Emotional tone | 6.016 (1,8) | 0.92 | .04 | 0.429 | 4.597 (1,5) | 1.17 | .09 | 0.479
aN=15.
bNM: Next Mission.
cWW: Women Warriors.
d
eFor each program group, 10 identified values were outputted by IBM Watson Personality Insights based on highest density, thus differ slightly between programs.
fN/A: not applicable.
gLIWC: Linguistic Inquiry and Word Count.
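As a reading aid for the table above, the following sketch shows how a partial η2 value relates to an F statistic and its degrees of freedom, using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The article does not state how its statistics software computed the effect sizes, so treating this identity as the source of the reported values is an assumption; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom.

    Standard identity; assumed here (not stated in the article) to correspond
    to the effect sizes reported in the table.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Openness, Next Mission group: F with df (1,8) = 6.372
nm_openness = round(partial_eta_squared(6.372, 1, 8), 3)

# Closeness, Women Warriors group: F with df (1,5) = 10.006
ww_closeness = round(partial_eta_squared(10.006, 1, 5), 3)

print(nm_openness, ww_closeness)
```

Under this identity, the two example rows give 0.443 and 0.667, matching the values listed for those rows.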
A total of 9 of the 13 (69%) Next Mission participants completed the course and were able to have their unstructured data analyzed. Results of analysis on NLP data were grouped into 4 outcome variable groups: (1) personality traits, (2) personal values and needs, (3) emotional tone, and (4) LIWC. Results from rMANOVA were reported as univariate analyses. Cohen
Participants showed significant increases in openness (
Participants showed significant increases in the values and needs of helping others (
Curiosity (
Participants showed significant decreases in sadness (
Anger (
The application of LIWC tools showed significant improvement in authenticity, clout, and emotional tone. Authenticity (
Observations were derived from structured data completed by 3 of the participants (results found in
The site made me feel calm and cared for. It made me feel like people were looking out for me and that I had a network of resources and people if I needed.
I got my laughter back.
In total, 7 of the 10 (70%) Women Warriors participants completed the course. Of these, 6 WW participants produced sufficient narrative to have their unstructured data analyzed. Results of analysis of unstructured data were grouped into 4 outcome variable groups: (1) personality traits, (2) personal values and needs, (3) emotional tone, and (4) LIWC. Results from rMANOVA were reported as univariate analyses. Cohen
Participants showed significant increases in personality traits of openness (
Conscientiousness (
Participants showed significant increases in the personal values and needs of closeness (
The values and needs domains of helping others (
None of the differences of emotional tone were statistically significant. Fear (
The application of LIWC tools showed significant improvement in clout (
Qualitative observations of structured data (results found in
I believe [course facilitator] when you say WE WILL get to the place of balance and peace. I returned home and I feel better than I have in a long time. I still have challenges and I know it's a day to day journey, but I ACTUALLY feel stronger…I'm honestly still in shock. I didn't think recovery would ever be a word used to describe me, but now I'm believing it.
The good thing that came from this week's assignment was I found that I found support that I didn't realize that I had from sources I never would've thought.
[The course platform] made me feel calm and cared for. It made me feel like people were looking out for me and that I had a network of resources and people if I needed.
Averages and change scores for treatment groups of Next Mission combat veterans and Women Warriors military sexual assault survivors.
Assessment | NMa combat veterans (n=3) | | | WWb military sexual assault survivors (n=7) | |
 | Pretest, mean | Posttest, mean | Difference scorec, mean | Pretest, mean | Posttest, mean | Difference scorec, mean
PTGI | 47.3 | 58.7 | 11.4 | 66.6 | 60.6 | –6.0
PTGI-I: relating to others | 10.7 | 17.0 | 6.3 | 21.0 | 18.6 | –2.4
PTGI-II: new possibilities | 13.0 | 14.7 | 1.7 | 17.0 | 1.7 | –15.3
PTGI-III: personal strength | 11.7 | 12.7 | 1.0 | 14.6 | 10.6 | –4.0
PTGI-IV: spiritual change | 3.0 | 3.7 | 0.7 | 2.6 | 2.7 | 0.1
PTGI-V: appreciation of life | 9.0 | 10.7 | 1.7 | 8.6 | 8.7 | 0.1
SWEMWSd | N/Ae | N/A | N/A | 21.1 | 22.2 | 1.1
COPEf | N/A | N/A | N/A | 36.3 | 30.3 | –6.0
Active coping | N/A | N/A | N/A | 3.1 | 2.6 | –0.5
Positive reframing | N/A | N/A | N/A | 3.3 | 2.9 | –0.4
Plan | N/A | N/A | N/A | 3.6 | 2 | –1.6
Emotional support | N/A | N/A | N/A | 2.6 | 3 | 0.4
Self-distraction | N/A | N/A | N/A | 3.3 | 2.7 | –0.6
Vent | N/A | N/A | N/A | 2.9 | 2.1 | –0.8
Behavioral disengagement | N/A | N/A | N/A | 1.9 | 0.7 | –1.2
Acceptance | N/A | N/A | N/A | 1.9 | 3.4 | 1.5
Humor | N/A | N/A | N/A | 2.9 | 1.9 | –1.0
Religion | N/A | N/A | N/A | 3.1 | 2.3 | –0.8
Instrumental support | N/A | N/A | N/A | 2.3 | 3.6 | 1.3
Denial | N/A | N/A | N/A | 0.9 | 0.4 | –0.5
Substance use | N/A | N/A | N/A | 1.1 | 0.9 | –0.2
Self-blame | N/A | N/A | N/A | 2.0 | 1.8 | –0.2
PCL-5g | N/A | N/A | N/A | 57.6 | 60.9 | 3.3
PHQh-15 | N/A | N/A | N/A | 13.6 | 10.9 | –2.7
PHQ-9 | N/A | N/A | N/A | 13.4 | 10.7 | –2.7
GAD-7i | N/A | N/A | N/A | 11.4 | 8.3 | –3.1
PSOMSj | 9.0 | 9.0 | 0.0 | N/A | N/A | N/A
Focused attention | 1.7 | 1.7 | 0.0 | N/A | N/A | N/A
Productivity | 2.0 | 1.3 | –0.7 | N/A | N/A | N/A
Responsible caretaking | 1.0 | 2.0 | 1.0 | N/A | N/A | N/A
Restful repose | 0.7 | 1.3 | 0.6 | N/A | N/A | N/A
Sensuous pleasure | 2.0 | 1.0 | –1.0 | N/A | N/A | N/A
Sharing | 1.7 | 1.7 | 0.0 | N/A | N/A | N/A
aNM: Next Mission.
bWW: Women Warriors.
cDifferences in scores were used for qualitative observation. Differences in scores were not calculated for significance due to variation in completion and sample size.
dSWEMWS: Short Warwick-Edinburgh Mental Well-Being Scale.
eN/A: not applicable.
fCOPE: Coping Orientation to Problems Experienced.
gPCL-5: PTSD checklist for the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition.
hPHQ: Patient Health Questionnaire.
iGAD-7: generalized anxiety disorder 7-item checklist.
jPSOMS: Positive States of Mind Scale (n=10).
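The difference scores in the table above appear to follow the convention posttest mean minus pretest mean, so a negative value on a symptom scale (eg, GAD-7) indicates fewer reported symptoms after the course. A minimal sketch of that convention follows; the helper name `difference_score` is hypothetical, and rounding to 1 decimal place is an assumption made to match the table's precision.

```python
def difference_score(pretest_mean, posttest_mean):
    """Posttest minus pretest, rounded to 1 decimal place (assumed convention)."""
    return round(posttest_mean - pretest_mean, 1)


# First row of the table, Next Mission group: pretest 47.3, posttest 58.7
nm_first_row = difference_score(47.3, 58.7)

# GAD-7, Women Warriors group: pretest 11.4, posttest 8.3
ww_gad7 = difference_score(11.4, 8.3)

print(nm_first_row, ww_gad7)
```

As the table footnote states, these differences were used for qualitative observation only and were not tested for significance.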
The purpose of our study was to evaluate the feasibility and utility of NLP in evaluating change in a small pilot study of 2 veteran populations who completed 2 novel online behavioral health interventions. Our aggregate results suggest that NLP can provide valuable insights on shifts in personality traits, personal values, and needs, as well as measure changes in emotional tone, in an evaluation of our novel online behavioral health interventions. Our recruitment process lends additional support to our findings. Recruitment was conducted over several months until critical mass was obtained for a cohort, so participants waited variable amounts of time before the courses started. Regardless of waiting time, participants began the courses with similar symptom profiles, which suggests that waiting alone did not change their symptoms. However, although our effect sizes are large, our small sample size must be considered in the interpretation of our results. Additionally, although it is reasonable to conclude that the large effect sizes seen are the result of our intervention, a limitation of our study was that we were unable to account for potential confounds by comparison with a control group. Thus, we cannot say with certainty whether the effects seen were due to our intervention or to additional factors. Future studies should seek to replicate the effects found in our pilot study with a control or comparison group, accounting for potential confounds.
In general, results of NLP mirrored qualitative reports and feedback from participants and facilitators on the efficacy of these interventions. Furthermore, our study suggests that NLP may detect participant change with greater sensitivity than subjective symptom measures. We note that checklists do not tell stories; only stories do. A common clinical observation in the treatment of depression or anxiety is that an individual’s loved ones or a clinician will often notice change in the individual before the individual becomes self-aware of the change [
Findings such as decreased fear, sadness, disgust, and emotional range (neuroticism) with increased joy are consistent with posttraumatic growth and specifically tend to be associated with resilience. A common phenomenon in combat veterans is survivor’s guilt, more conventionally termed moral injury [
A noteworthy finding of our study is revealed by the consistent increases in closeness, love, and life pleasure in WW military sexual assault survivors. PTSD research indicates that these domains are severely impacted by trauma in general and sexual assault in particular [
There is tentative evidence to suggest that the personality trait of conscientiousness is related to hypervigilance in decision making [
A possible explanation for increased need for stability (eg, a consumer who consistently likes the same choice in a product, not a variety of choices) is that as veterans connect and create community with each other, there may be a reflection back to shared military values. Stability and regimen are a hallmark of military culture [
Further studies are needed and are underway to determine whether a higher rate of completion of subjective symptom measures will correspond to changes found with NLP. These studies are also needed to determine whether NLP changes will accurately predict future changes in subjective symptom measures.
We recognize that, for the attributes measured by these tools, aggregating data is not always the best way to assess whether the impact of an intervention on an individual is positive, negative, or neutral. For example, authenticity and confidence in expression are generally understood in terms of “the more the better,” whereas for attributes like conscientiousness, the desired outcome would be more for someone who is irresponsible and less for someone who is overly obsessive or rigid. Future research should focus on evaluating these online behavioral health interventions with larger samples across different populations while also comparing natural language analytics with conventional evaluation methods as measures of effectiveness.
A limitation of our study is that the generalizability is impacted by small sample sizes and the lack of a control group. Thus, we are unable to confirm whether results are due to the online intervention or another unidentified factor or set of factors. Furthermore, despite a growing number of validation studies, NLP is not widely accepted or implemented as a reliable indicator of therapeutic change. Thus, further validation studies that establish convergent and discriminant validity with therapeutic outcomes are warranted. An additional limitation of our study is that we were unable to statistically compare outputs from NLP to validated structured measures. The extant validation literature suggests that attempts at finding convergence of NLP with self-report measures of personality have produced mixed results. This could be due to the fact that NLP tends to measure both latent and explicit emotional tone and personality, whereas self-report measures are solely reliant on the perception of the individual. As a result, both methods have accrued criticism for being affected by bias [
Our interventions were impactful on attributes detected in writing, and the results of NLP provided tentative yet potentially valuable and provocative insights. By quantifying and aggregating these attributes, we have gained insights about which areas of emotional functioning are responsive to our intervention. By looking at individual analyses, we can readily see how each participant is progressing, for example by noting reduced emotional expression in someone who was emotionally dysregulated and increased emotional expression in another participant who was initially emotionally constricted. The use of natural language analytics tools opens up a completely new area of scientific inquiry. We are getting closer to entirely replicating what happens in in-person psychotherapy. We can now provide the benefit of both symptom checklists and patient narratives to expert clinicians who know how to interpret the data for clinical decision making and to researchers who can determine the impact and value of any given behavioral health intervention. We believe that using AI powered by natural language analytics will enable the creation of effective therapy bots that will assist facilitators and sustain participant engagement, as this intervention is scaled to make it accessible to everyone, anytime, anywhere. We also believe that using NLP applied to behavioral health interventions and other clinical situations creates an entirely new field of medical informatics.
AI: artificial intelligence
CBT: cognitive behavioral therapy
COPE: Coping Orientation to Problems Experienced
HIPAA: Health Insurance Portability and Accountability Act
LIWC: Linguistic Inquiry and Word Count
NLP: natural language processing
NM: Next Mission
PCL-5: PTSD checklist for the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition
PHQ: Patient Health Questionnaire
PSOM: Positive States of Mind Scale
PTGI: Posttraumatic Growth Inventory
PTSD: posttraumatic stress disorder
rMANOVA: repeated-measures multivariate analysis of variance
SWEMWS: Short Warwick-Edinburgh Mental Well-Being Scale
VA: US Department of Veterans Affairs
WW: Women Warriors
The NM and WW programs were approved by the institutional review board of the University of California, San Francisco. The NM program was funded by Bristol-Myers Squibb Foundation Inc, a grant and gifts from the McCormick and the McKesson foundations, and individual donors. The WW program was funded by gifts by the Gallo Foundation and Veterans on Wall Street.
All authors reviewed and edited the manuscript and approved of the final draft.
KN has a household interest in Tiatros Inc. KN is an advisor to Tiatros compensated by stock options. DWK is a project-based contractor with Tiatros Inc. WK is an employee of IBM. AG is an employee of IBM.