This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
Heart failure (HF) is a major cause of frequent hospitalization and death. Early detection of HF symptoms using smartphone-based monitoring may reduce adverse events in a low-cost, scalable way.
We examined the relationship of HF decompensation events with smartphone-based features derived from passively and actively acquired data.
This was a prospective cohort study in which we monitored the social and movement activities of participants with HF using a smartphone app. We followed participants for clinical events via phone and chart review, and classified each encounter as compensated or decompensated by reviewing the provider notes in detail. We extracted passive features describing motion, location, and social interaction, and collected self-reported quality of life weekly (active data) with the short Kansas City Cardiomyopathy Questionnaire (KCCQ-12) survey. We developed and validated an algorithm for classifying decompensated versus compensated clinical encounters (hospitalizations or clinic visits). We evaluated models based on a single modality as well as early and late fusion approaches combining patient-reported outcomes and passive smartphone data. We used Shapley additive explanation values to quantify the contribution and impact of each feature to the model.
We evaluated 28 participants with a mean age of 67 years (SD 8), among whom 11% (3/28) were female and 46% (13/28) were Black. We identified 62 compensated and 48 decompensated clinical events from 24 and 22 participants, respectively. The highest area under the precision-recall curve (AUCPr) for classifying decompensation was achieved by a late fusion approach combining KCCQ-12, motion, and social contact features, evaluated using leave-one-subject-out cross-validation for a 2-day prediction window. This model had an AUCPr of 0.80, an area under the receiver operator curve (AUC) of 0.83, a positive predictive value (PPV) of 0.73, a sensitivity of 0.77, and a specificity of 0.88. Similarly, the 4-day window model had an AUC of 0.82, an AUCPr of 0.69, a PPV of 0.62, a sensitivity of 0.68, and a specificity of 0.87. Passive social data provided some of the most informative features, with fewer calls of longer duration being associated with a higher probability of future HF decompensation.
Smartphone-based data that include both passive monitoring and actively collected surveys may provide important behavioral and functional health information on HF status in advance of clinical visits. This proof-of-concept study, although small, offers important insight into the social and behavioral determinants of health and the feasibility of using smartphone-based monitoring in this population. Our strong results are comparable to those of more active and expensive monitoring approaches, and underscore the need for larger studies to understand the clinical significance of this monitoring method.
Although there are numerous attempts to monitor heart failure (HF) in an outpatient setting using wearables and other point-of-care devices, compliance is often an issue and prevents monitoring for extended periods [
We defined HF decompensation status based on worsened functional symptoms or physical examination findings suggestive of lower cardiac output or increased intracardiac pressures. This includes but is not limited to fatigue, dyspnea, hypotension, and lower extremity edema [
Various studies have investigated techniques for nonintrusively monitoring patients with HF. Packer et al [
Other noninvasive approaches include patient-reported outcomes, which could be collected using clinically validated questionnaires such as the Kansas City Cardiomyopathy Questionnaire (KCCQ). The KCCQ assesses the quality of life and predicts readmissions and mortality in patients with HF [
With the advancement of technology, smartphones have become a ubiquitous part of our daily life. For long-term monitoring, using a smartphone could be advantageous over a solution requiring an additional device because it reduces the disruption to patients’ normal daily routine. Our research team and collaborators have previously developed the Automated Monitoring of Symptom Severity (AMoSS) app, which is a custom and scalable smartphone-based framework for remote monitoring [
In this study, HF decompensation events were predicted from features derived from passive and active data collected by the smartphone-based framework. Features were extracted from 3 passive data modalities (motion, location, and social interactions) and 1 active (clinical survey data: short KCCQ [KCCQ-12]). Algorithms based on using a single modality and 2 sensor fusion approaches were developed. An analysis of the feature importance in the model is also presented. Finally, a novel late-fusion model that combines the KCCQ-12, motion, and social contact data is proposed.
Earlier research with the AMoSS app [
All data were deidentified at the source (on the participants’ phones) with hashed identifiers, and random geographic offsets were added to the location data to protect the participants’ privacy. The data were stored in HIPAA (Health Insurance Portability and Accountability Act)-compliant Amazon Web Services data buckets, and the phone app uploaded data periodically (based on connectivity) every few hours. Participants with HF enrolled in the ongoing study at the Veterans Affairs Medical Center and Emory University Hospital in Atlanta, GA, USA, signed a consent form prior to the beginning of the study. The study protocol was approved by the institutional review board (#00075867) at Emory University. The clinical team provided participants with an Android-based smartphone with the app installed during the enrollment. The participant could opt to stop sharing any data type during the study, using switches provided in the app.
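The on-device deidentification described above can be illustrated with a minimal sketch. The text does not specify the app's actual scheme, so everything here is an assumption made for illustration: the salted SHA-256 hash, the function names, and the choice of one fixed random offset per participant (which keeps displacements between a participant's own locations intact, in degrees, while hiding absolute position).

```python
import hashlib
import random


def hash_identifier(raw_id: str, salt: str) -> str:
    """Return a one-way salted SHA-256 digest of an identifier (assumed scheme)."""
    return hashlib.sha256((salt + raw_id).encode("utf-8")).hexdigest()


def make_offset(rng: random.Random, max_degrees: float = 1.0):
    """Draw one random (dlat, dlon) offset, reused for all of a participant's points."""
    return (rng.uniform(-max_degrees, max_degrees),
            rng.uniform(-max_degrees, max_degrees))


def offset_location(lat: float, lon: float, offset):
    """Shift a coordinate by the participant's fixed offset."""
    dlat, dlon = offset
    return lat + dlat, lon + dlon
```

Because the same offset is applied to every point, features that depend only on relative position (such as distance from the most visited location) remain approximately computable after the shift.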
Illustration of the study timeline. Passive data collection started after the hospital discharge, and the clinical team recorded the clinical events after the enrollment. HF: heart failure.
The data from 28 participants (25 males) who contributed at least 1 clinical event were used in this research. The inclusion criteria for participants in the study were the following: a diagnosis consistent with congestive HF as noted in the electronic medical records within the Emory Health Network, an age over 18 years, the ability to consent to a clinical study, and English as their primary language. Exclusion criteria were the following: diagnosis with a terminal illness with a life expectancy of fewer than 6 months, enrollment in a hospice program, or enrollment in a clinical study that precluded them from participating in another clinical study. Finally, participants had to be willing and able to comply with the use of their smartphones as indicated in the study.
Data set description: if the metric is not available, the participant is excluded from that row.
Participant characteristics | Values (N=28) |
Age (years), mean (SD) | 67 (8) |
Male, n (%) | 25 (89) |
BMI, mean (SD) | 31 (6) |
Ejection fraction (%), mean (SD) | 35 (17) |
Employed, n (%) | 3 (11) |
Black | 13 (46) |
White | 15 (54) |
History of diabetes | 18 (64) |
Previous myocardial infarction | 2 (7) |
History of hypertension | 19 (68) |
Previous stroke | 4 (14) |
Peripheral vascular disease | 2 (7) |
History of atrial fibrillation | 8 (29) |
Other non–atrial fibrillation arrhythmia | 3 (1) |
Compensated, n | 62 |
Decompensated, n | 48 |
Compensated per person, mean (SD) | 2 (1.8) |
Decompensated per person, mean (SD) | 2 (1.7) |
Clinical events consisted of decompensated and compensated events and were collected by the clinical team when the participants visited the hospitals. In the compensated events, the participants visited the hospital for any reason, and their fluid levels were determined to be normal based on the clinician assessment, which includes a history and physical examination. For the decompensated events, the clinical team determined the participant to have functional limitations related to HF. Decompensated and compensated events were assigned to positive and negative classes, respectively.
The raw 3D accelerometer data were converted to activity counts using the Actigraphy Toolbox to reduce the required memory for storing [
Social contact data included the contact identifier (ID), directionality, and the duration of each call. Each contact was anonymized and assigned a unique ID at the source (on the phone by the app). The age demographics of our population were such that social media was not uniformly used across the population [
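As a rough illustration of how call logs of this kind can be turned into window-level features (total number of calls, sum and SD of call duration, and sum and SD of the time without calls), consider the sketch below. The input layout, the function name, and the definition of "time without calls" as the idle gap between consecutive calls are assumptions, not the study's documented implementation.

```python
from statistics import pstdev


def social_features(calls):
    """calls: list of (start_seconds, duration_seconds) tuples within one window."""
    calls = sorted(calls)
    durations = [d for _, d in calls]
    # Assumed definition: idle time between the end of one call and the next start.
    gaps = [max(0.0, calls[i + 1][0] - (calls[i][0] + calls[i][1]))
            for i in range(len(calls) - 1)]
    return {
        "total_calls": len(calls),
        "sum_duration": sum(durations),
        "sd_duration": pstdev(durations) if durations else 0.0,
        "sum_gap": sum(gaps),
        "sd_gap": pstdev(gaps) if gaps else 0.0,
    }
```

Contact IDs are ignored here; a per-contact breakdown would need the hashed ID as an extra field.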
Double plot representation of actigraphy data illustrating daily motion intensity levels for 1 participant. Darker colors indicate lower intensity movement, and the white color indicates missing data. On the top of the plot, decompensated and compensated clinical events are shown with red and orange squares, respectively. Comp: compensated; Decomp: decompensated.
Participants' social contact intensity over 300 days. Each unique contact is assigned a number as shown in the y-axis, and the circle radius is proportional to the call duration to each ID. On the top of the plot, decompensated and compensated clinical events are shown with red and orange squares, respectively. Comp: compensated; Decomp: decompensated.
Location data collected in compensated (comp.) and decompensated (decomp.) windows for a participant shown on the same map with 50 km × 50 km dimensions.
Kernel density estimate for the location data of 1 participant.
The active data type, which required user input, was the KCCQ administered through the smartphone app. The scores are lower for severe HF symptoms, and KCCQ scores ≤25 correspond to New York Heart Association class IV. In this study, we used the shorter version of the questionnaire, referred to as the KCCQ-12 [
KCCQ-12 summary score over days for a particular participant. A KCCQ-12 score ≤25 indicates a transition to severe HF. Decompensated and compensated clinical events are shown with red and orange squares above the plot, respectively. Comp: compensated; Decomp: decompensated; HF: heart failure; KCCQ-12: short Kansas City Cardiomyopathy Questionnaire.
Several features were extracted for a particular time window from the data collected through the app to construct the motion feature set. A time window of data was the N-day period before a clinical event, and the feature extraction was performed for each time window. The window size N was initially chosen to be 14 days, since this period was also selected by the developers of the KCCQ-12 to represent the participant’s recent functioning [
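The motion features listed in the abbreviations (completeness percentage, mean, mode, SD, skewness, and kurtosis of activity counts) can be sketched for a single window as follows. This is an illustrative sketch only: the use of `None` for missing epochs, the population (not sample) SD, and the non-excess kurtosis convention are assumptions rather than details confirmed by the text.

```python
from statistics import mean, mode, pstdev


def motion_features(counts):
    """counts: activity counts per epoch in one window; None marks a missing epoch."""
    valid = [c for c in counts if c is not None]
    if not valid:
        return {"completeness_pct": 0.0}
    m, sd = mean(valid), pstdev(valid)

    def std_moment(k):
        # k-th standardized moment; defined as 0 when the signal is constant.
        if sd == 0:
            return 0.0
        return sum(((c - m) / sd) ** k for c in valid) / len(valid)

    return {
        "completeness_pct": 100.0 * len(valid) / len(counts),
        "mean": m,
        "mode": mode(valid),
        "sd": sd,
        "skewness": std_moment(3),
        "kurtosis": std_moment(4),  # non-excess kurtosis (3 for a normal distribution)
    }
```

Reporting completeness alongside the moments lets a downstream model account for how much of the window was actually observed.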
Using the participant’s location data, the most frequently visited location was determined and defined as the “home” location. The number of times the participant was at the home location was calculated and used as a feature (
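A minimal sketch of the home-based location features follows. It uses the 2-km radius and Haversine distance named in the abbreviations; the rounding of coordinates to a grid to decide which fixes count as the "same" location, the grid size, and all function names are assumptions introduced for illustration.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def location_features(points, radius_km=2.0, grid_decimals=3):
    """points: non-empty list of (lat, lon) fixes in one window."""
    # Assumed rule: fixes that round to the same grid cell are the same place.
    cells = [(round(lat, grid_decimals), round(lon, grid_decimals))
             for lat, lon in points]
    home, times_home = Counter(cells).most_common(1)[0]
    dists = [haversine_km(home, p) for p in points]
    within = sum(1 for d in dists if d <= radius_km)
    return {
        "home": home,
        "times_home": times_home,
        "sum_haversine_km": sum(dists),
        "within_radius": within,
        "outside_radius": len(dists) - within,
    }
```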
From the KCCQ-12 data, 2 different sets of features were investigated. First, the summation score (
Logistic regression classifiers were trained to map the feature vector to the compensated or decompensated outcome. All models were written in Python 3 (The Python Software Foundation), and the code was based on scikit-learn [
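The leave-one-subject-out cross-validation mentioned in the abstract keeps all events from one participant out of training in each fold, so that a model is never tested on a person it has already seen. The helper below is a hypothetical, dependency-free sketch of that splitting rule; scikit-learn offers comparable behavior through `LeaveOneGroupOut`, although the text does not say which utility the authors used.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids[i] is the participant who contributed clinical event i.
    """
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx
```

With 28 participants this produces 28 folds, and fold sizes vary because participants contributed different numbers of events.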
Since the numbers of compensated and decompensated events were highly imbalanced (
Both early and late fusion approaches combined passive and active modalities (
Modality fusion techniques. Purple and red colors indicate 2 different modalities. The left side (a) shows the early fusion approach, and the right side (b) shows the late fusion of the modalities. comp: compensated; decomp: decompensated.
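The difference between the two fusion strategies can be made concrete with a small sketch. Early fusion concatenates the feature vectors of all modalities and scores them with one model; late fusion scores each modality with its own model and then combines the outputs. The combination rule used in the study is not stated in this text, so the simple average of probabilities below is an assumption, as are the function names and the hand-specified logistic weights.

```python
import math


def predict_proba(weights, bias, x):
    """Logistic model: probability of the positive (decompensated) class."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def early_fusion(modalities, weights, bias):
    """Concatenate per-modality features, then score with ONE model."""
    fused = [v for feats in modalities for v in feats]
    return predict_proba(weights, bias, fused)


def late_fusion(modalities, models):
    """Score each modality with its OWN (weights, bias) model, then average."""
    probs = [predict_proba(w, b, x) for x, (w, b) in zip(modalities, models)]
    return sum(probs) / len(probs)
```

One practical consequence of the late design is that each per-modality model can be trained on every event for which that modality is available, whereas early fusion needs events with all modalities present.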
To examine and interpret the features further, Shapley additive explanation (SHAP) values for the early fusion model were calculated [
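For a linear model such as logistic regression, SHAP values have a simple closed form when features are treated as independent: in log-odds units, the contribution of feature j is its weight times the deviation of its value from the background mean. The sketch below illustrates that identity only; it is not the study's code, and the function name and argument layout are assumptions.

```python
def linear_shap(weights, bias, x, background):
    """Return (base_value, shap_values) in log-odds units for a linear model.

    Assumes independent features; background is a list of reference rows.
    """
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(weights))]
    base_value = bias + sum(w * m for w, m in zip(weights, means))
    shap_values = [w * (xj - mj) for w, xj, mj in zip(weights, x, means)]
    return base_value, shap_values
```

The base value plus the sum of the SHAP values recovers the model's log-odds output for `x`, which is the additivity property that a SHAP summary plot relies on.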
Finally, we investigated how early the models can predict an outcome by implementing a time-to-event analysis and a window size analysis. The time-to-event methodology consisted of analyzing the performance of a model using data from only 1 day prior to the event but shifting which day is included in the analysis. The window size methodology consisted of analyzing different intervals of days prior to the event and evaluating the model performance on each window.
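Both analyses amount to choosing which calendar days before an event feed the feature extraction. A minimal sketch, assuming that a window's last day lies `shift` days before the event (the helper name and its parameters are hypothetical):

```python
from datetime import date, timedelta


def window_days(event_day, size, shift=1):
    """Days (oldest first) in a `size`-day window ending `shift` days before the event."""
    last = event_day - timedelta(days=shift)
    return [last - timedelta(days=k) for k in range(size - 1, -1, -1)]


event = date(2020, 3, 10)
# Time-to-event analysis: a single day of data, moved further from the event.
time_to_event = {k: window_days(event, size=1, shift=k) for k in (1, 2, 4)}
# Window size analysis: windows of different lengths ending the day before the event.
window_size = {n: window_days(event, size=n) for n in (2, 7, 14)}
```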
The cross-validation performance for each single-modality model (motion, location, and social contact) is shown in
Passive data model performance results presented as the mean and SD of the external folds of each experiment.
Modality | Accuracy, mean (SD) | AUCa, mean (SD) | AUCPrb, mean (SD) | PPVc, mean (SD) | TPRd, mean (SD) |
Motion | 0.66 (0.03) | 0.66 (0.03) | 0.60 (0.06) | 0.55 (0.04) | 0.61 (0.06) |
Location | 0.59 (0.07) | 0.56 (0.10) | 0.39 (0.11) | 0.34 (0.10) | 0.49 (0.17) |
Social | 0.58 (0.05) | 0.65 (0.05) | 0.56 (0.06) | 0.46 (0.06) | 0.60 (0.07) |
aAUC: area under the curve of the receiver operator curve.
bAUCPr: area under the precision-recall curve.
cPPV: positive predictive value.
dTPR: true positive rate.
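For reference, the threshold-based metrics reported in these tables follow directly from the confusion counts, with decompensated events as the positive class (1) and compensated events as the negative class (0). The helper below is an illustrative sketch; its name and the choice to return NaN for an undefined ratio are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, PPV, TPR (sensitivity), and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "ppv": ratio(tp, tp + fp),          # precision
        "tpr": ratio(tp, tp + fn),          # sensitivity / recall
        "specificity": ratio(tn, tn + fp),
    }
```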
Active data single modality model performance reported as the mean and SD of the external folds of each experiment.
Modality | Accuracy, mean (SD) | AUCa, mean (SD) | AUCPrb, mean (SD) | PPVc, mean (SD) | TPRd, mean (SD) | |
|
||||||
KCCQ-12sume | 0.64 (0.01) | 0.75 (0.01) | 0.61 (0.02) | 0.55 (0.01) | 0.66 (0.03) | |
KCCQ-12allf | 0.65 (0.02) | 0.67 (0.02) | 0.54 (0.04) | 0.57 (0.02) | 0.69 (0.04) | |
|
||||||
KCCQ-12sum | 0.69 (0.01) | 0.77 (0.01) | 0.69 (0.02) | 0.61 (0.02) | 0.71 (0.03) | |
KCCQ-12all | 0.69 (0.03) | 0.70 (0.01) | 0.61 (0.04) | 0.60 (0.02) | 0.74 (0.04) |
aAUC: area under the curve of the receiver operator curve.
bAUCPr: area under the precision-recall curve.
cPPV: positive predictive value.
dTPR: true positive rate.
eKCCQ-12sum: summation scores for all short Kansas City Cardiomyopathy Questionnaire survey domains.
fKCCQ-12all: set of features for each short Kansas City Cardiomyopathy Questionnaire survey domain separately.
For the fusion model which combines KCCQ-12 and motion data, 17 participants contributed data for both modalities, with 21 decompensated events and 26 compensated events. When 3 modalities were used (KCCQ-12, motion, and social contact), 16 participants contributed with 18 decompensated events and 21 compensated events. Finally, when all data types were merged (KCCQ-12, motion, social contact, and location), there were data available for 12 participants, with 10 decompensated events and 18 compensated events.
The results for the early fusion models are shown in
Results of early fusion models reported as the mean and SD of the external folds of each experiment.
Modality | Accuracy, mean (SD) | AUCa, mean (SD) | AUCPrb, mean (SD) | PPVc, mean (SD) | TPRd, mean (SD) |
Motion + social | 0.62 (0.04) | 0.58 (0.03) | 0.54 (0.04) | 0.53 (0.05) | 0.53 (0.06) |
KCCQ-12e + motion | 0.73 (0.02) | 0.81 (0.01) | 0.75 (0.03) | 0.69 (0.02) | 0.73 (0.05) |
KCCQ-12 + motion + social | 0.71 (0.04) | 0.72 (0.05) | 0.69 (0.06) | 0.70 (0.04) | 0.66 (0.09) |
KCCQ-12 + motion + social + location | 0.67 (0.05) | 0.64 (0.07) | 0.57 (0.11) | 0.55 (0.07) | 0.56 (0.09) |
aAUC: area under the curve of the receiver operator curve.
bAUCPr: area under the precision-recall curve.
cPPV: positive predictive value.
dTPR: true positive rate.
eKCCQ-12: the short Kansas City Cardiomyopathy Questionnaire survey.
Results of late fusion models reported as the mean and SD of the external folds of each experiment.
Modality | Accuracy, mean (SD) | AUCa, mean (SD) | AUCPrb, mean (SD) | PPVc, mean (SD) | TPRd, mean (SD) |
Motion + social | 0.64 (0.03) | 0.63 (0.04) | 0.52 (0.05) | 0.54 (0.04) | 0.56 (0.07) |
KCCQ-12e + motion | 0.67 (0.03) | 0.75 (0.02) | 0.67 (0.04) | 0.61 (0.03) | 0.72 (0.07) |
KCCQ-12 + motion + social | 0.71 (0.04) | 0.79 (0.03) | 0.77 (0.04) | 0.68 (0.04) | 0.70 (0.05) |
KCCQ-12 + motion + social + location | 0.62 (0.07) | 0.72 (0.07) | 0.60 (0.11) | 0.49 (0.07) | 0.68 (0.10) |
aAUC: area under the curve of the receiver operator curve.
bAUCPr: area under the precision-recall curve.
cPPV: positive predictive value.
dTPR: true positive rate.
eKCCQ-12: the short Kansas City Cardiomyopathy Questionnaire survey.
SHAP summary plot for the early fusion model. Features are sorted by their impact on the y-axis. Each point on the plot shows the Shapley value for 1 instance. The horizontal location shows the feature’s effect for predicting positive class (decompensated) or negative class (compensated), and color indicates the feature value. SHAP: Shapley additive explanation.
We investigated how early the algorithms can predict an outcome by shifting the days to the event and using different window sizes in days for each model in each category.
Performance changes as the days to events are shifted. The x-axis indicates the time to event in days, and the y-axis indicates the AUC and AUCPr performance. Early fusion and late fusion models combine KCCQ-12, motion, and social contact modalities. AUC: area under the curve of the receiver operator curve; AUCPr: area under the precision-recall curve; fus: fusion; KCCQ-12: the short Kansas City Cardiomyopathy Questionnaire.
Performance changes as the window size is reduced. The x-axis indicates the time to event in days, and the y-axis indicates the AUC and AUCPr performance. Early and late fusion models use KCCQ-12, motion, and social contact modalities. AUC: area under the curve of the receiver operator curve; AUCPr: area under the precision-recall curve; fus: fusion; KCCQ-12: the short Kansas City Cardiomyopathy Questionnaire; win: window.
In this proof-of-concept study that involved tracking HF status with smartphone technologies, we showed that it is feasible to collect clinically relevant information from self-reported surveys and passive monitoring for classifying compensated versus decompensated status. This study is a first of its kind to evaluate 3 passive data modalities (motion, location, and social interactions) and 1 active data modality, the KCCQ-12 survey. We tested both individual and combined active and passive metrics, and showed that each of them, individually and in combination, may be useful in helping predict HF decompensation up to 6 days in advance of the clinical encounter.
Next-day prediction algorithms were built using each modality separately. From the passive data sources, the motion data–based model achieved the highest AUCPr of 0.60. For a model based only on the responses of the KCCQ-12, using the summary of all domains and using the most recent score resulted in the best performance with an AUCPr of 0.74 (
When different time-to-event horizons were tested, a general trend of lower performance for longer future predictions was observed. This was expected since symptoms are likely to become more pronounced closer to the event. However, predictions 2 days ahead were actually better than those 1 day ahead, and the performance 4 days ahead was almost as good as that 1 day before the event. This indicates that 1-day, 2-day, and 4-day models could be run simultaneously to identify short- and medium-term risks and result in different levels of intervention. Changes in performance will be affected by the levels of missingness as the event approaches, as well as the intrinsic behaviors, which may explain the performance of the 2-day window.
Our proof-of-concept study suggests that low-burden, smartphone-based methods of monitoring in HF may offer modest incremental predictive value. The accuracy of our models was similar to earlier work that used mobile health sensors [
There are several key limitations to this study. First, when the data were missing, the app did not indicate whether this resulted from the participant closing the app voluntarily or from the smartphone battery running out. These behaviors have different etiologies, which may be related to impending decompensation in different ways. For example, closing the app may indicate being tired, whereas a battery running out of charge may indicate apathy connected with depression. If an additional label were collected for missing segments, it could be used to learn other behavioral patterns. Second, text messages and social media can provide a more complete picture of social contact. However, due to the age demographics of our population, social contact was quantified using only phone call information [
Our proposed novel smartphone-based approach for noninvasively monitoring patients with HF may help monitor health status changes through changes in movement, location, social interactions, or a combination of these. Many of these features are new discoveries and suggest important mechanisms of disease that have previously been less explored. Due to the ubiquity of smartphones and the ease of scalability of the framework, our method has the potential to facilitate low-cost monitoring of large populations. However, we note that this is a preliminary study on a relatively small population, and before it can be validated, a larger study is required. In addition, other passive monitoring devices (such as movement sensors in the house, electricity usage monitors, and home alarm systems) may provide additional useful information on the changes in behavior leading up to an intervenable event. Moreover, in future work, the feasibility of combining the proposed method with clinical interventions (such as teleconsultations and drug dose modification) will need to be investigated to measure the potential impact of the framework described in this paper.
completeness percentage of activity counts
mean of activity counts
mode of activity counts
kurtosis of activity counts
skewness of activity counts
SD of activity counts
AMoSS: Automated Monitoring of Symptom Severity
number of times the participant was at the home location
AUC: area under the curve of the receiver operator curve
AUCPr: area under the precision-recall curve
sum of Haversine distances between all locations and the home location
sum of the duration of calls
SD of the duration of calls
sum of time without any calls
SD of the time without any calls
HF: heart failure
HIPAA: Health Insurance Portability and Accountability Act
KCCQ: Kansas City Cardiomyopathy Questionnaire
KCCQ-12: short Kansas City Cardiomyopathy Questionnaire
KCCQ-12all: set of features for each KCCQ-12 survey domain separately
KCCQ-12sum: summation scores for all KCCQ-12 survey domains
NHLBI: National Heart, Lung, and Blood Institute
NIH: National Institutes of Health
total number of calls
PPV: positive predictive value
SHAP: Shapley additive explanation
TPR: true positive rate
number of times the participant was within a 2-km radius from home
number of times the participant was outside the 2-km radius from home
The authors wish to acknowledge the support of the National Science Foundation Award (#1636933); “BD Spokes: SPOKE: SOUTH: Large-Scale Medical Informatics for Patient Care Coordination and Engagement”; the National Institutes of Health (NIH)/National Heart, Lung, and Blood Institute (NHLBI; award #K23 127251); the Georgia Research Alliance; and the National Center for Advancing Translational Sciences of the National Institutes of Health (award #UL1TR002378).
None declared.