This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
Digital data on physical activity are useful for self-monitoring and preventing depression and anxiety. Although previous studies have reported machine or deep learning models that use physical activity for passive monitoring of depression and anxiety, no such models have been developed for workers. The working population has physical activity patterns that differ from those of other populations, shaped by commuting, holiday patterns, physical demands, occupation, and industry. Accounting for these working conditions may help optimize models that predict depression and anxiety. Further, recurrent neural networks can increase predictive accuracy by using previous inputs on physical activity, depression, and anxiety.
This study evaluated the performance of a deep learning model optimized for predicting depression and anxiety in workers. Psychological distress was used as an indicator of depression and anxiety.
A 2-week longitudinal study was conducted with workers in urban areas in Japan. Workers who were absent at baseline were excluded. In a daily survey, psychological distress was measured using a self-reported questionnaire. Activity time by intensity, recorded using the Google Fit application, was used as a feature. Additionally, we measured age, gender, occupation, employment status, work shift type, working hours, and whether the response date was a working or nonworking day. A deep learning model using long short-term memory was developed and validated to predict psychological distress on the next day from the features of the previous day. A 5-fold cross-validation method was used to evaluate the performance of the model. The primary indicator of performance was classification accuracy for the severity of psychological distress (light, subthreshold, and severe).
A total of 1661 days of supervised data were obtained from 236 workers, who were aged between 20 and 69 years. The overall classification accuracy for psychological distress was 76.3% (SD 0.04%). The classification accuracy for severe-, subthreshold-, and light-level psychological distress was 51.1% (SD 0.05%), 60.6% (SD 0.05%), and 81.6% (SD 0.04%), respectively. The model predicted light-level psychological distress on the next day when participants had 3 peaks of activity (in the morning, at noon, and in the evening) on the previous day. Lower activity levels were predicted as subthreshold- and severe-level psychological distress. Predictions differed by occupation and by whether the previous day was a working or nonworking day.
The developed deep learning model performed similarly to models in previous studies and, in particular, showed high accuracy for light-level psychological distress. Working conditions and long short-term memory were useful in maintaining the model performance for monitoring depression and anxiety in workers using digitally recorded physical activity. The developed model can be implemented in mobile apps and may be used by workers to self-monitor and maintain their mental health.
Physical activity is an important health-related bodily activity for treating and preventing depression and anxiety [
Recent studies have shown that physical activity measured using digital tools can serve as a digital biomarker in the passive monitoring of depression and anxiety. Furthermore, novel digital technologies and machine or deep learning models have used physical activity to predict depression and anxiety [
However, no studies have developed models that use working conditions to passively monitor depression and anxiety in workers by measuring physical activity using digital tools. A previous study [
This study aims to evaluate the predictive and classification performance of a deep learning model, optimized for workers, for analyzing depression and anxiety, that is, psychological distress. We used workers’ physical activity time (measured using a smartphone app) and their psychological distress state on the previous day as features to monitor their psychological distress on the next day. Additionally, working conditions were used as features: whether the previous day was a working day, occupation, employment status, shift type, and working hours. The long short-term memory (LSTM) model was used as the deep learning model. We hypothesized that the deep learning model developed using the abovementioned characteristics would have similar or better classification performance than models used in previous studies [
A 2-week longitudinal study with workers was conducted from November 2021 to April 2022 to measure their daily psychological distress and obtain digital data on their physical activity and working conditions. Participating workers in Japan were recruited from private companies in the Kanto region and through the social networking platform Twitter. Recruitment, collection of informed consent, and data collection, which included conducting surveys, were carried out digitally via email. The eligibility criteria for the study participants were as follows: (1) working in a public- or private-sector organization, (2) living or working in urban areas, and (3) having a personal smartphone. We excluded workers who were absent at baseline.
A total of 236 workers, aged between 20 and 69 years, participated in this study.
Initially, the participants were asked to complete a baseline survey to assess their working conditions. The first page of the baseline survey explained the terms and conditions of the study, which participants had to read and approve before proceeding to the next page. Additionally, they were asked to install the Google Fit app [
Characteristics of participants at baseline (N=236).
Characteristics | Participants, n (%)
Age (years) |
<20 | 0 (0)
20-29 | 40 (16.9)
30-39 | 72 (30.5)
40-49 | 64 (27.1)
50-59 | 53 (22.5)
60-69 | 7 (3.0)
≥70 | 0 (0)
Gender |
Male | 132 (55.9)
Female | 104 (44.1)
Others or not responded | 0 (0)
Occupation |
Managers | 45 (23.3)
Professions, engineers, or academics | 68 (27.9)
Clerks | 64 (23.3)
Sales | 27 (11.4)
Services | 14 (5.9)
Transportation | 2 (0.8)
Construction | 0 (0)
Production/Skilled | 5 (2.1)
Agriculture/Forestry/Fisheries | 0 (0)
Others | 11 (4.7)
Employment status |
Full-time | 202 (85.6)
Part-time | 12 (5.1)
Dispatched | 3 (1.3)
Contract | 10 (4.2)
Others | 9 (3.8)
Work shift type |
Day shift | 223 (94.5)
Rotation shift | 7 (2.9)
Night shift | 0 (0)
Others | 6 (2.5)
Working hours per week |
1-34 | 21 (8.9)
35-40 | 60 (25.4)
41-50 | 114 (48.3)
51-60 | 34 (14.4)
61-65 | 4 (1.7)
66-70 | 1 (0.4)
≥71 | 2 (0.8)
In the daily survey, psychological distress of the participants was measured using the Japanese version of the K6 scale [
Digital data on physical activity were obtained using the Google Fit app. Based on a previous systematic review [
As additional features, working conditions were measured in the baseline and daily surveys. In the baseline survey, we obtained data on age (<20, 20-29, 30-39, 40-49, 50-59, 60-69, and ≥70 years), gender (male, female, other, or no response), type of occupation (management; engineering or education; general office tasks; sales; services; transportation; construction; production or skilled work; and agriculture, forestry, or fisheries), employment status (full-time, part-time, dispatched, and contract), work shift type (day shift, rotation, and night shift), and working hours per week (1-34, 35-40, 41-50, 51-60, 61-65, 66-70, and ≥71 hours). In the daily survey, we recorded whether the response date was a working or nonworking day.
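As a rough illustration of how categorical working conditions such as these can be turned into numeric model inputs, the sketch below uses one-hot encoding. The encoding scheme, the function names, and the choice of only two variables are assumptions for illustration; the source does not state how the categories were encoded. The "others" category is taken from the baseline characteristics table.

```python
# Category lists follow the baseline survey description; "others" appears in
# the baseline characteristics table. One-hot encoding is an assumed scheme.
EMPLOYMENT = ["full-time", "part-time", "dispatched", "contract", "others"]
WORK_SHIFT = ["day shift", "rotation", "night shift", "others"]


def one_hot(value, categories):
    """Return a list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


def encode_worker(employment, shift, is_working_day):
    """Concatenate the one-hot blocks and a binary working-day flag."""
    return (
        one_hot(employment, EMPLOYMENT)
        + one_hot(shift, WORK_SHIFT)
        + [int(is_working_day)]
    )


# Hypothetical participant: full-time, day shift, responding on a working day.
features = encode_worker("full-time", "day shift", True)
```

The same pattern would extend to age group, occupation, and weekly working hours by adding one category list per variable.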
A deep learning model was developed to predict the log-transformed psychological distress in the participants on the day after their features were collected.
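The pairing of one day's features with the next day's log-transformed score can be sketched as follows. This is a minimal sketch under stated assumptions: the exact transform is not given in the text, so `log1p` (that is, log(1 + K6)) is assumed because K6 scores can be 0, and the record keys `"features"` and `"k6"` are hypothetical names.

```python
import math


def make_supervised_pairs(daily_records):
    """Pair each day's features with the next day's log-transformed K6 score.

    `daily_records` is a date-ordered list of dicts for one participant, with
    hypothetical keys "features" (list of numbers) and "k6" (score from 0 to 24).
    """
    pairs = []
    for today, tomorrow in zip(daily_records, daily_records[1:]):
        target = math.log1p(tomorrow["k6"])  # assumed transform: log(1 + K6)
        pairs.append((today["features"], target))
    return pairs


def to_k6_scale(log_prediction):
    """Invert the assumed log1p transform to return to the K6 scale."""
    return math.expm1(log_prediction)


# Three consecutive days yield two supervised (features, target) pairs.
records = [
    {"features": [30.0, 5.0], "k6": 2},
    {"features": [12.0, 0.0], "k6": 5},
    {"features": [45.0, 9.0], "k6": 0},
]
pairs = make_supervised_pairs(records)
```

This sketch assumes the records are consecutive days; gaps in the daily survey would need to be handled before pairing.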
A 5-fold cross-validation method was used to develop and evaluate the deep learning algorithm, using the K-Folds cross-validator in scikit-learn. The daily supervised data were randomly divided into 5 subsets; in rotation, each subset served once as the test data set while the remaining 4 subsets formed the training data set. Each trained model was evaluated on its test subset, and the overall model performance was calculated as the average of the 5 performance scores.
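The rotation described above can be sketched in plain Python. This is an illustrative reimplementation, not the study's code: the function names and the seed are assumptions. It follows the sizing rule documented for scikit-learn's `KFold`, in which the first `n_samples % n_splits` folds receive one extra sample.

```python
import random


def k_fold_indices(n_samples, n_splits=5, seed=0):
    """Shuffle sample indices and cut them into n_splits contiguous folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_test_splits(n_samples, n_splits=5, seed=0):
    """Yield (train, test) index lists, using each fold once as the test set."""
    folds = k_fold_indices(n_samples, n_splits, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 1661 supervised days, as reported in the Results.
splits = list(train_test_splits(1661))
fold_sizes = [len(test) for _, test in splits]
```

With 1661 days, this gives one test subset of 333 days and four of 332 days, which agrees with the subset sizes reported in the Results.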
As the deep learning model, the LSTM model was used, which is a recurrent neural network model that has feedback connections for handling consecutive inputs [
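To make the idea of feedback connections concrete, the following is a minimal sketch of a standard LSTM cell processing a short sequence in plain Python. It is not the architecture used in the study: the layer sizes, random weights, and all function names are assumptions chosen only to show how the hidden and cell states carry information from earlier inputs forward.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(n_in, n_hidden, seed=0):
    """Random weights and zero biases for the four LSTM gates (illustrative)."""
    rng = random.Random(seed)

    def gate():
        weights = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
            for _ in range(n_hidden)
        ]
        return weights, [0.0] * n_hidden

    return {name: gate() for name in ("forget", "input", "output", "candidate")}


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step for input list x and previous states (h, c)."""
    z = x + h_prev  # concatenation of the current input and previous hidden state

    def affine(name):
        weights, bias = params[name]
        return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]

    f = [sigmoid(a) for a in affine("forget")]       # how much old memory to keep
    i = [sigmoid(a) for a in affine("input")]        # how much new content to write
    o = [sigmoid(a) for a in affine("output")]       # how much memory to expose
    g = [math.tanh(a) for a in affine("candidate")]  # proposed new content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_sequence(xs, params, n_hidden):
    """Feed a sequence through the cell and return the final hidden state."""
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h


params = init_params(n_in=3, n_hidden=4)
sequence = [[0.2, 0.0, 1.0], [0.5, 0.1, 0.0]]  # two made-up daily feature vectors
final_h = run_sequence(sequence, params, n_hidden=4)
```

Because the final hidden state depends on every earlier input, a prediction layer placed on top of it can use the previous days' activity and distress, which is the property the text attributes to recurrent models.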
Procedure for developing, validating, and evaluating the deep learning algorithms.
As the primary indicator of performance, classification accuracy for the severity of the psychological distress was considered: light level (<5), subthreshold level (≥5 and <13), and severe level (≥13). As secondary indicators, the Pearson correlation coefficient (
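The severity thresholds above can be expressed as a small helper, together with the accuracy calculation used as the primary indicator. The function names are assumptions, and the example scores are made up for illustration.

```python
def classify_severity(k6_score):
    """Map a K6 score to the severity level defined by the study's cutoffs."""
    if k6_score < 5:
        return "light"
    if k6_score < 13:
        return "subthreshold"
    return "severe"


def classification_accuracy(predicted_scores, measured_scores):
    """Share of days on which predicted and measured severity levels agree."""
    matches = sum(
        classify_severity(p) == classify_severity(m)
        for p, m in zip(predicted_scores, measured_scores)
    )
    return matches / len(measured_scores)


# Made-up predicted (continuous) and measured K6 scores for four days.
predicted = [1.8, 6.2, 4.9, 14.0]
measured = [0, 5, 7, 13]
accuracy = classification_accuracy(predicted, measured)
```

In this example, the third day is misclassified (predicted light, measured subthreshold), so the accuracy is 3 of 4 days.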
To interpret how the deep learning model predicted and classified the scores and severity of psychological distress, the means of the activity time by intensity were depicted and stratified in accordance with the classification results (light, subthreshold, or severe level) in the training data.
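The stratified means described above amount to grouping daily activity profiles by predicted severity and averaging within each group. A minimal sketch follows; the function name, the three time bins, and the example values are assumptions, since the text does not specify the bin width.

```python
def mean_activity_by_class(records):
    """Average activity minutes per time bin within each predicted class.

    `records` is an iterable of (predicted_class, minutes_per_bin) pairs,
    where every minutes_per_bin list has the same length.
    """
    sums, counts = {}, {}
    for label, minutes in records:
        if label not in sums:
            sums[label] = [0.0] * len(minutes)
            counts[label] = 0
        sums[label] = [s + m for s, m in zip(sums[label], minutes)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in totals]
        for label, totals in sums.items()
    }


# Made-up profiles with three bins (for example morning, noon, and evening).
records = [
    ("light", [10, 20, 30]),
    ("light", [20, 40, 10]),
    ("severe", [0, 10, 0]),
]
means = mean_activity_by_class(records)
```

Plotting each class's mean profile across the day gives the kind of comparison the figures in this article describe.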
The study protocol was approved by the Kitasato University Medical Ethics Organization (B21-119). Informed consent was obtained before the baseline survey. On the web page, potential participants were asked to read and approve the terms and conditions of the study before proceeding to the baseline survey page. The terms and conditions stated that the research team would protect the privacy and confidentiality of the collected data and that the study data would be deidentified before analyses. The participants did not receive any compensation in this study.
A total of 1661 days of supervised data were obtained from 236 participants. The mean K6 score was 2.78 (SD 4.27). A total of 131 days of missing values for the K6 scores and the type of day (ie, working or nonworking) were imputed using the median. The median K6 score was 1, and the median type of day was a working day. The cross-validation method divided the data into 5 sub–data sets: 4 subsets comprised 332 days of data each and the other comprised 333 days of data. On average, 96.40 (SD 6.0) days per subset (29.02%, SD 1.8%) were nonworking days. For the learning process, 11 to 81 epochs were required to complete the process (
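Median imputation of the kind described here can be sketched as follows. The function name and the example values are assumptions, with `None` standing in for a missing response and the type of day coded as 1 (working) or 0 (nonworking).

```python
from statistics import median


def impute_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values], fill


# Made-up K6 scores with two missing days.
k6_scores = [0, 1, None, 7, 1, None, 0]
k6_imputed, k6_fill = impute_with_median(k6_scores)

# Made-up type-of-day flags (1 = working, 0 = nonworking) with one missing day.
day_types = [1, 1, 0, None, 1]
day_imputed, day_fill = impute_with_median(day_types)
```

For a binary flag, the median equals the majority value whenever one value makes up more than half of the observations, which is consistent with the report that the median type of day was a working day.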
Classification performance of the 5 test sub–data sets.
Predicted psychological distress and sub–data set | Measured severe (K6 score≥13), days, n | Measured subthreshold (K6 score≥5 and <13), days, n | Measured light (K6 score<5), days, n | Total, days, n
Predicted severe (K6 score≥13) | | | |
Sub–data set 1 | 8 | 3 | 2 | 13
Sub–data set 2 | 7 | 5 | 0 | 12
Sub–data set 3 | 11 | 9 | 2 | 22
Sub–data set 4 | 5 | 7 | 0 | 12
Sub–data set 5 | 7 | 4 | 1 | 12
Predicted subthreshold (K6 score≥5 and <13) | | | |
Sub–data set 1 | 7 | 39 | 59 | 105
Sub–data set 2 | 3 | 44 | 41 | 88
Sub–data set 3 | 5 | 32 | 43 | 80
Sub–data set 4 | 4 | 44 | 36 | 84
Sub–data set 5 | 5 | 33 | 49 | 87
Predicted light (K6 score<5) | | | |
Sub–data set 1 | 3 | 24 | 188 | 215
Sub–data set 2 | 3 | 21 | 208 | 232
Sub–data set 3 | 3 | 16 | 211 | 230
Sub–data set 4 | 2 | 13 | 221 | 236
Sub–data set 5 | 1 | 22 | 210 | 233
Total | | | |
Sub–data set 1 | 18 | 66 | 249 | 333
Sub–data set 2 | 13 | 70 | 249 | 332
Sub–data set 3 | 19 | 57 | 256 | 332
Sub–data set 4 | 11 | 64 | 257 | 332
Sub–data set 5 | 13 | 59 | 260 | 332
Cells in which the predicted severity matches the measured severity (severe-severe, subthreshold-subthreshold, and light-light) indicate accurate classification.
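As a consistency check on this table, the sketch below recomputes the accuracy figures from the per-subset counts. It takes the off-diagonal counts and row totals from the table, derives each accurately classified (diagonal) cell as the row total minus the other two cells, and then averages overall accuracy and per-level accuracy over the 5 sub–data sets. The data layout and function names are assumptions, and per-level accuracy is computed here as the share of measured days at each level that were predicted correctly.

```python
LEVELS = ("severe", "subthreshold", "light")

# Rows are predicted severity; each tuple is
# (measured severe, measured subthreshold, measured light, row total).
# None marks the diagonal cell, derived below from the row total.
TABLE2 = {
    1: {"severe": (None, 3, 2, 13), "subthreshold": (7, None, 59, 105), "light": (3, 24, None, 215)},
    2: {"severe": (None, 5, 0, 12), "subthreshold": (3, None, 41, 88), "light": (3, 21, None, 232)},
    3: {"severe": (None, 9, 2, 22), "subthreshold": (5, None, 43, 80), "light": (3, 16, None, 230)},
    4: {"severe": (None, 7, 0, 12), "subthreshold": (4, None, 36, 84), "light": (2, 13, None, 236)},
    5: {"severe": (None, 4, 1, 12), "subthreshold": (5, None, 49, 87), "light": (1, 22, None, 233)},
}


def fill_diagonal(fold):
    """Return matrix[predicted][measured] with diagonal cells derived from totals."""
    matrix = []
    for k, level in enumerate(LEVELS):
        *cells, total = fold[level]
        cells[k] = total - sum(c for c in cells if c is not None)
        matrix.append(cells)
    return matrix


def fold_metrics(matrix):
    """Overall accuracy and per-level accuracy (correct / measured days at that level)."""
    n_days = sum(sum(row) for row in matrix)
    overall = sum(matrix[k][k] for k in range(3)) / n_days
    per_level = [matrix[k][k] / sum(row[k] for row in matrix) for k in range(3)]
    return overall, per_level


matrices = {fold_id: fill_diagonal(fold) for fold_id, fold in TABLE2.items()}
metrics = [fold_metrics(m) for m in matrices.values()]
mean_overall = sum(overall for overall, _ in metrics) / len(metrics)
mean_per_level = [
    sum(per_level[k] for _, per_level in metrics) / len(metrics) for k in range(3)
]
```

Under these assumptions, the averages come to 76.3% overall and 51.1%, 60.6%, and 81.6% for the severe, subthreshold, and light levels, matching the accuracies reported in the text, and the column sums of each derived matrix reproduce the measured totals in the table.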
The mean activity time, stratified by working conditions, showed different patterns depending on the predicted severity of psychological distress. The activity time on working days showed a trend similar to the overall results. In contrast, the activity time on nonworking days showed a markedly different pattern (
Duration of light and moderate to vigorous physical activity (minutes) within a day stratified by the severity of psychological distress.
The deep learning model developed using LSTM, based on physical activity and working conditions, performed similarly to models reported in previous studies [
The prediction and classification performance of the deep learning model was similar to that in a previous study [
Interestingly, predictions differed by working conditions. Activity peaks that shifted later in the day and high levels of activity on nonworking days were associated with predictions of high psychological distress. Additionally, excessive workload peaks among managers, sales workers, and service workers were predicted as subthreshold- or severe-level psychological distress. Among managers, physical activity at night (after 8 PM) might be related to severe-level distress. Service workers had several peaks of physical activity, and higher peaks were predicted as subthreshold-level distress. In contrast, clerks and professionals did not show a similar trend: lower overall levels of physical activity were associated with subthreshold- and severe-level psychological distress. These differences might depend on how physically demanding the work is. Managerial, sales, and service jobs are qualitatively different from clerical and professional jobs and tend to be more physically demanding. Excessive physical activity could affect the psychological distress of workers if the activities at work are physically demanding. These findings may be attributed to poor sleep and circadian rhythms [
The classification performance of the developed model for severe-level psychological distress was not high. Misclassified samples included a higher proportion of data from rotation and other shift workers (12%) than the whole sample did. Workers engaged in shift work have a different rhythm from day-shift workers and may therefore need different algorithms to predict their level of psychological distress. This study included few data on shift workers. Consequently, further studies are needed to tune the model.
Several limitations affected the validity and generalizability of the study. The small sample size and limited variation in the monitored data reduced generalizability. In particular, data from severely distressed night-shift workers were lacking. The results could not be directly compared with those of previous studies because the latter used different measurements and digital biomarkers. Previous studies surveyed students and patients and mainly used the Patient Health Questionnaire as an indicator of depression. Information on sleep, an important digital biomarker used in the previous studies [
In conclusion, the developed deep learning model performed similarly to those reported in previous studies and had high accuracy in determining light-level psychological distress from physical activity and working conditions. The collected information on working conditions was useful in passively monitoring the depression and anxiety status of workers. It further contributed to determining the mental health status of workers using digital biomarkers. The developed model can be implemented in mobile apps and used by workers to self-monitor and maintain their mental health.
Mean squared error as a loss function during training in the five test subsets.
Duration of light and moderate to vigorous physical activity (minutes) within a day stratified by the severity of psychological distress for working and non-working days.
Duration of light and moderate to vigorous physical activity (minutes) within a day stratified by the severity of psychological distress for different occupations.
LSTM: long short-term memory
This study was supported by the Grant-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JP20K19671) and the Japan Agency for Medical Research and Development (JP21de0107006). The funders played no role in the study design; the collection, analysis, and interpretation of data; the writing of the report; or the decision to submit the article for publication. We would like to thank Cactus Communications Co, Tokyo, for the English language editing.
The data sets generated and analyzed in this study are available from the corresponding author upon reasonable request.
None declared.