Original Paper
Abstract
Background: The rapid increase of single-person households in South Korea is leading to an increase in the incidence of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases, due to lifestyle changes. It is necessary to analyze the complex effects of metabolic syndrome risk factors in South Korean single-person households, which differ from one household to another, considering the diversity of single-person households.
Objective: This study aimed to identify the factors affecting metabolic syndrome in single-person households using machine learning techniques and categorically characterize the risk factors through latent class analysis (LCA).
Methods: This cross-sectional study included 10-year secondary data obtained from the National Health and Nutrition Examination Survey (2009-2018). We selected 1371 participants belonging to single-person households. Data were analyzed using SPSS (version 25.0; IBM Corp), Mplus (version 8.0; Muthen & Muthen), and Python (version 3.0; Plone & Python). We applied 4 machine learning algorithms (logistic regression, decision tree, random forest, and extreme gradient boost) to identify important factors and then applied LCA to categorize the risk groups of metabolic syndromes in single-person households.
Results: Through LCA, participants were classified into 4 groups (group 1: intense physical activity in early adulthood, group 2: hypertension among middle-aged female respondents, group 3: smoking and drinking among middle-aged male respondents, and group 4: obesity and abdominal obesity among middle-aged respondents). In addition, age, BMI, obesity, subjective body shape recognition, alcohol consumption, smoking, binge drinking frequency, and job type were investigated as common factors that affect metabolic syndrome in single-person households through machine learning techniques. Group 4 was the most susceptible and at-risk group for metabolic syndrome (odds ratio 17.67, 95% CI 14.5-25.3; P<.001), and obesity and abdominal obesity were the most influential risk factors for metabolic syndrome.
Conclusions: This study identified risk groups and factors affecting metabolic syndrome in single-person households through machine learning techniques and LCA. Through these findings, customized interventions for each generational risk factor for metabolic syndrome can be implemented, leading to the prevention of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases. In conclusion, this study contributes to the prevention of metabolic syndrome in single-person households by providing new insights and priority groups for the development of customized interventions using classification.
doi:10.2196/42756
Introduction
Background
Single-person households have rapidly increased from 9% in 1990 to 29.3% in 2018, accounting for nearly one-third of all South Korean households, and are estimated to reach approximately 36.3% by 2045. This increasing trend is also evident worldwide, including in the United States (26.7%), Australia (23.9%), and Japan (32.4%).
The reasons for this rising trend include the large number of unmarried people, late marriages, changes in marital values, divorce, separation, high unemployment, and diverse and complex social factors in larger cities. On the basis of individuals’ sociodemographic characteristics and lifestyle, single-person households are more susceptible than multiperson households to high-risk health behaviors, such as smoking and alcohol consumption, as well as to depression and stress.
Adult single-person households are known to differ distinctly from multiperson households in terms of demographic characteristics and living habits. For instance, single-person households have been reported to be more susceptible to health problems than multiperson households.
These sociodemographic characteristics and lifestyles indicate that single-person households have a higher prevalence of metabolic syndrome and chronic diseases, such as hypertension, diabetes, dyslipidemia, arthritis, asthma, myocardial infarction, and stroke.
Metabolic syndrome increases the risk of cardiovascular disease and diabetes and is defined by the presence of at least 3 of the following clinical characteristics: hypertension, hyperglycemia, hypertriglyceridemia, low levels of high-density lipoprotein, and abdominal obesity. It also increases the occurrence of myocardial infarction, stroke, and dementia; therefore, it is important to decrease the incidence of metabolic syndrome to prevent chronic cardiac and cerebrovascular diseases and reduce the mortality rate.
It is also necessary to assess the morbidity associated with the disease and develop customized medications and guidelines to manage its risk factors. Previous studies have demonstrated that risk factors include age, sex, obesity, smoking, a lack of physical activity, and education. Single-person households have diverse characteristics, and the influence of these characteristics on metabolic syndrome may differ from that in multiperson households and across age groups. This necessitates a more holistic and systematic understanding of the metabolic syndrome risk factors in single-person households, as each risk factor may have a distinct or an interrelated effect on metabolic syndrome depending on individual characteristics.
Latent class analysis (LCA), a person-centered approach, examines the multidimensional characteristics of human behavior; it contrasts with the conventional variable-centered approach, which describes predictors’ relative influence on outcome variables. In addition, identifying the patient type and characteristics is advantageous in predicting the disease, and a customized intervention program can be planned according to individual risk factor vulnerabilities and diagnosis. Machine learning refers to methods that automatically extract general rules or new knowledge from given data by implementing in machines the ability to learn, one of the distinctive intelligent functions of humans. In this study, the factors affecting metabolic syndrome in South Korean single-person households were analyzed using logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boost (XGBoost). LR, DT, and RF are among the most commonly used machine learning techniques, and XGBoost is a more recently developed technique.
This study aimed to identify the factors affecting metabolic syndrome in single-person households using machine learning with large-scale health data from the National Health and Nutrition Examination Survey (NHANES). To date, few studies have applied machine learning and LCA to identify the factors affecting metabolic syndrome in single-person households. The contribution of this study lies not in finding an exact answer but in finding new variables or overlooked aspects through basic or translational research for clinical application. The core value of translational research lies in its effort to apply basic research to clinical practice with a high success rate at a low cost in a short period.
Hence, this study was designed to establish basic data to develop customized interventions by categorizing and characterizing metabolic syndrome risk factors in South Korean single-person households using machine learning techniques and LCA.
Purpose of This Study
This study used data from the NHANES spanning 10 years (2009-2018), applied machine learning techniques to identify the factors that affect the occurrence of metabolic syndrome, and applied LCA to classify single-person households. The purpose of this study was to categorize risk groups and identify risk factors for metabolic syndrome in South Korean single-person households.
Methods
Research Design
This study was a secondary data analysis that used machine learning techniques to identify the factors influencing the occurrence of metabolic syndrome in single-person households and LCA to categorize metabolic syndrome risk factors. The overall flowchart of the study is shown in the accompanying figure.
Participants of the Study
This study used raw data from the 10-year NHANES (2009-2018) conducted by the Ministry of Health and Welfare and the Korea Centers for Disease Control and Prevention for a secondary data analysis. The South Korean NHANES generated data representative of the South Korean population using stratified cluster sampling. The total number of respondents was 83,294, among whom there were 1376 (1.65%) single-person households, and 79,717 (95.71%) households with ≥2 persons. Of the 1376 single-person households, 1371 (99.64%) were finally selected as study participants, excluding 5 households because of missing data and older age.
Data Set
We used the health questionnaire from the NHANES’s fourth (2009), fifth (2010-2012), sixth (2013-2015), and seventh terms (2016-2018).
General Characteristics
We selected the following general characteristics of participants: sex (male or female), age (early adulthood, ie, 19-39 y of age, and middle adulthood, ie, 40-64 y of age), educational level (lower than high school to higher than an undergraduate [4-year] college degree), marital status (married or unmarried), income level, and economic activity status (active or inactive).
Health Behavior
We selected smoking behavior (smoking or nonsmoking); alcohol consumption (abstaining, <4 times/mo, or >2 times/wk); exercise, such as walking (<3 times/wk or >3 times/wk); subjective recognition of body type (very thin, slightly thin, normal, slightly obese, or very obese); subjective health status (very good, good, normal, bad, or very bad); and obesity status indicated by BMI (<18.5=underweight, 18.5-22.9=normal, 23-24.9=at risk, or ≥25=obese).
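The BMI grouping described above can be illustrated with a short Python sketch. The function name `bmi_category` is hypothetical, and treating the listed ranges as half-open intervals with boundaries at 18.5, 23, and 25 is an assumption made here so that values such as 22.95 fall into exactly one category; the study does not state how such values were handled.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to the 4 obesity status categories listed in the text.

    Assumption: the printed ranges (<18.5, 18.5-22.9, 23-24.9, >=25) are read
    as half-open intervals with boundaries at 18.5, 23, and 25.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal"
    if bmi < 25.0:
        return "at risk"
    return "obese"


# Example values chosen only for illustration.
for value in (17.9, 21.4, 24.2, 27.5):
    print(value, bmi_category(value))
```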
Eating Habits
We administered a questionnaire to determine how frequently respondents dined out (5 times/wk, 1-4 times/wk, and <3 times/mo) and their dietary lifestyle (“good” or “bad”).
Mental Health
We assessed respondents’ awareness of stress (recognition or nonrecognition) and diagnoses of depression (diagnosed or undiagnosed).
Use of Medical Institutions and Community Services
We classified participants according to whether they had undergone a health checkup, a cancer checkup, and an oral examination using “yes” or “no” responses and included the type of health insurance (local, employment-related, or uninsured or self-paying medical care) and subscription to private medical insurance (registered or unregistered for private medical insurance).
Metabolic Syndrome
We determined the presence of metabolic syndrome based on the National Cholesterol Education Program–Adult Treatment Panel 3 diagnostic criteria, that is, whether respondents met ≥3 of the following 5 criteria: hypertension, hyperglycemia, hypertriglyceridemia, hypo–high-density lipoproteinemia, and abdominal obesity. Waist circumference, triglyceride levels, high-density lipoprotein cholesterol levels, final systolic and diastolic blood pressures (mean of the second and third measurements), and fasting blood glucose level were used to determine the presence of metabolic syndrome.
Data Collection Method
We submitted our affiliation and purpose of using the data to the Korea Disease Control and Prevention Agency’s data portal and then used the data, which contained no personal information.
Data Preprocessing
After sampling and merging the 10-year data from the NHANES, we conducted a data-cleansing process, and the distribution of variables was checked using the missing values function of the SPSS software (version 25.0; IBM Corp) to identify both outliers and missing data.
In this study, data from a total of 83,294 individuals who participated in the NHANES during the 10 survey years (2009-2018) were extracted. After extracting cases where the number of household members (code name=cfam) was “1,” we found 3577 (4.29%) single-person households among the 83,294 respondents from 2009 to 2018. Of these 3577 individuals from single-person households, 1371 (38.33%) were finally selected after excluding older adults (aged ≥65 y) and those with missing values.
During this process, 1182 cases with defects identified in the merged files of the 10 years of NHANES data were deleted. For the analysis, age, a continuous variable, was converted into a categorical variable, and a metabolic syndrome variable was newly created to indicate the presence of at least 3 of the following: hypertension, hyperglycemia, hypo–high-density lipoproteinemia, hypertriglyceridemia, and abdominal obesity. Metabolic syndrome was defined according to the National Cholesterol Education Program–Adult Treatment Panel 3 diagnostic criteria.
The merged 10-year NHANES data contained a total of 7450 variables, of which 390 were common to the survey years from 2009 to 2018. Among them, 154 items, including the 5 diagnostic criteria for metabolic syndrome and items unrelated to the study, were excluded, and the remaining variables were analyzed as candidate factors influencing the occurrence of metabolic syndrome in single-person households, with the variable MetS (metabolic syndrome or not) as the outcome.
We applied the LR, DT, RF, and XGBoost machine learning algorithms with 10-fold cross-validation to analyze the factors influencing metabolic syndrome.
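As a minimal sketch of how the metabolic syndrome variable could be derived, the following Python code counts how many of the 5 criteria a respondent meets and flags metabolic syndrome when at least 3 are present. The class name `Criteria`, its field names, and the function `has_metabolic_syndrome` are hypothetical; the clinical thresholds behind each criterion are not restated in this section, so each criterion is taken here as an already-computed yes or no value.

```python
from dataclasses import dataclass, astuple


@dataclass
class Criteria:
    """One respondent's status on the 5 diagnostic criteria (hypothetical field names)."""
    hypertension: bool
    hyperglycemia: bool
    hypertriglyceridemia: bool
    low_hdl: bool  # hypo-high-density lipoproteinemia
    abdominal_obesity: bool


def has_metabolic_syndrome(criteria: Criteria, minimum: int = 3) -> bool:
    """Return True when at least `minimum` of the 5 criteria are met."""
    return sum(astuple(criteria)) >= minimum


# Illustrative respondents, not taken from the survey data.
respondent_a = Criteria(True, True, False, False, True)   # 3 criteria met
respondent_b = Criteria(True, False, False, False, True)  # 2 criteria met
print(has_metabolic_syndrome(respondent_a))
print(has_metabolic_syndrome(respondent_b))
```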
Statistical Analysis
Data were analyzed using SPSS, Mplus (version 8.0; Muthen & Muthen), and Python (version 3.0; Plone & Python).
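The 10-fold cross-validation mentioned in the data preprocessing can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's actual procedure: the function name `k_fold_indices`, the random seed, and the simple shuffled (nonstratified) split are all choices made here, because the text does not describe how the folds were constructed.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample appears in exactly one test fold. The split is shuffled but
    not stratified by outcome, which is a simplifying assumption.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 1371 is the number of study participants reported in the text.
splits = list(k_fold_indices(1371, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(fold_number, len(train), len(test))
```

With 1371 participants, each test fold holds 137 or 138 respondents, and every model is trained on the remaining 9 folds.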
Ethical Considerations
We performed data analysis after obtaining approval from Keimyung University’s ethics committee for an exemption from deliberation (institutional review board number 40525-202008-HR-043-01) because we used existing data or published documents instead of directly engaging with participants.
Results
Respondents’ General Characteristics
Of the 1371 respondents, 681 (49.67%) were male, 893 (65.13%) were middle-aged adults, and 807 (58.86%) had a high school education or less. Further, 990 (72.21%) were economically active, and 384 (28.01%) had a lower intermediate income level. In addition, 705 (51.42%) were married, 932 (67.98%) were nonsmokers, and 749 (54.63%) consumed alcohol <4 times a month. Moreover, 938 (68.42%) walked >3 times a week, 518 (37.78%) had a normal weight according to BMI, 668 (48.72%) rated their health as normal, and 541 (39.46%) perceived their body shape as normal.
In addition, 602 (43.91%) and 778 (56.75%) of the 1371 respondents indicated that their father and mother, respectively, had an elementary school education. Regarding mental health, 930 (67.83%) respondents were not aware of stress, and 1247 (90.96%) had not been diagnosed with depression.
Variable | Participant, n (%) | |
Sex | ||
Male | 681 (49.67) | |
Female | 690 (50.33) | |
Age (years) | ||
Early adulthood (19-39) | 478 (34.87) | |
Middle adulthood (40-64) | 893 (65.13) | |
Educational level | ||
≤High school | 807 (58.86) | |
≥College | 564 (41.14) | |
Economic activity status | ||
Active | 990 (72.21) | |
Inactive | 381 (27.79) | |
Income level | ||
Low | 340 (24.8) | |
Lower intermediate | 384 (28.01) | |
Upper intermediate | 326 (23.78) | |
Advanced | 321 (23.41) | |
Marital status | ||
Married | 705 (51.42) | |
Single | 666 (48.58) | |
Smoking | ||
Smoker | 439 (32.02) | |
Nonsmoker | 932 (67.98) | |
Frequency of drinking | ||
None | 260 (18.96) | |
<4 times/mo | 749 (54.63) | |
>2 times/wk | 362 (26.4) | |
Days of walking | ||
<3 times/wk | 433 (31.58) | |
>3 times/wk | 938 (68.42) | |
Obesity status | ||
Underweight | 58 (4.23) | |
Normal | 518 (37.78) | |
Overweight | 339 (24.73) | |
Obese | 456 (33.26) | |
Subjective health status | ||
Very healthy | 59 (4.3) | |
Healthy | 285 (20.79) | |
Normal | 668 (48.72) | |
Unhealthy | 286 (20.86) | |
Very unhealthy | 73 (5.32) | |
Subjective body shape recognition | ||
Very thin | 41 (2.99) | |
Little thin | 162 (11.82) | |
Normal | 541 (39.46) | |
Little obese | 478 (34.87) | |
Very obese | 149 (10.87) | |
Recognition status of stress | ||
No recognition | 930 (67.83) | |
Recognition | 441 (32.17) | |
Depression diagnosis by physician | ||
Negative | 1247 (90.96) | |
Positive | 124 (9.04) | |
Dietary condition | ||
Good | 1267 (92.41) | |
Poor | 104 (7.59) | |
Frequency of eating out | ||
2 times/d->5 times/wk | 731 (53.32) | |
1 time/wk-4 times/wk | 398 (29.03) | |
<3 times/mo | 242 (17.65) | |
Father’s educational level | ||
Elementary school graduate | 602 (43.91) | |
Middle school graduate | 204 (14.88) | |
High school graduate | 335 (24.43) | |
College graduate | 230 (16.78) | |
Mother’s educational level | ||
Elementary school graduate | 778 (56.75) | |
Middle school graduate | 197 (14.37) | |
High school graduate | 290 (21.15) | |
College graduate | 106 (7.73) | |
Health checkup status | ||
Yes | 877 (63.97) | |
No | 494 (36.03) | |
Cancer checkup status | ||
Yes | 633 (46.17) | |
No | 738 (53.83) | |
Oral examination | ||
Yes | 898 (65.5) | |
No | 473 (34.5) | |
Type of health insurance | ||
Regional health insurance | 491 (35.81) | |
Company health insurance | 757 (55.22) | |
Medical care | 123 (8.97) | |
Private health insurance | ||
Joined | 1066 (77.75) | |
Not joined | 305 (22.25) |
Analysis of the Factors Influencing Metabolic Syndrome Using Machine Learning Techniques
We identified 390 variables common to the 10 years of merged NHANES data (2009-2018). Among them, 154 were excluded because they were not relevant to the study, and the remaining 236 variables were analyzed to assess the factors affecting metabolic syndrome in single-person households.
Overall, 4 algorithms were applied in the analysis: LR, DT, RF, and XGBoost. The importance of the variables age, BMI, and subjective recognition of body type as extracted from LR was 212.56, 173.26, and 138.01, respectively. Furthermore, the importance of the variables age, BMI, and dietary condition as extracted from DT was 35.50, 7.07, and 5.53, respectively. The importance of the variables BMI, obesity, and age as extracted from RF was 7.07, 2.99, and 2.80, respectively. Finally, the importance of the variables status of drinking, weight control, and age as extracted from XGBoost was 6.34, 5.81, and 3.06, respectively. To summarize, we found age, BMI, obesity, and the subjective recognition of body type to be the most important common variables.
Algorithm and variable name | Feature importance | ||
Logistic regression | |||
Age | 212.56 | ||
BMI | 173.26 | ||
Subjective body shape recognition | 138.01 | ||
Amount of alcohol consumption at a time | 87.98 | ||
Type of longest job | 72.51 | ||
Subjective health status | 71.79 | ||
Diagnosis of osteoarthritis | 68.84 | ||
Diagnosis of arthritis | 62.79 | ||
Daily activities | 60.61 | ||
Status of other cancer treatments | 57.59 | ||
Decision tree | |||
Age | 30.50 | ||
BMI | 7.07 | ||
Dietary conditions | 5.53 | ||
Duration of walking | 2.00 | ||
Age at which alcohol consumption began | 1.99 | ||
Modified working hours | 1.37 | ||
Status of smoking | 1.31 | ||
Frequency of binge drinking | 1.15 | ||
Frequency of eating out | 0.89 | ||
Region | 0.74 | ||
Random forest | |||
BMI | 7.08 | ||
Obesity status | 2.99 | ||
Age | 2.80 | ||
Subjective body shape recognition | 2.67 | ||
Type of longest job | 1.14 | ||
Region | 1.22 | ||
Age at which alcohol consumption began | 1.11 | ||
Standard occupation classification | 0.81 | ||
Education level | 1.02 | ||
Walking days in a week | 0.78 | ||
Extreme gradient boost | |||
Status of drinking | 6.34 | ||
Weight control method | 5.81 | ||
Driving under the influence during 1 y | 3.06 | ||
Smoking cessation plan | 1.91 | ||
Duration of disease state | 1.64 | ||
Self-management | 1.60 | ||
Age | 1.60 | ||
Status of nutrition display impact | 1.56 | ||
Daily activities | 1.44 | ||
Status of pulmonary tuberculosis diagnosis | 1.18 |
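One simple way to see which variables recur across the 4 algorithms is to count how many of the top 10 lists in the table above contain each variable. The Python sketch below does this with exact name matching; the variable names are copied from the table, whereas the dictionary name `top_variables`, the threshold of 2 lists, and the counting approach itself are assumptions made for illustration and are not described as the authors' method. Exact matching also means that related items with different labels (for example, "Status of drinking" and "Frequency of binge drinking") are counted separately.

```python
from collections import Counter

# Top 10 variables per algorithm, copied from the feature importance table.
top_variables = {
    "Logistic regression": [
        "Age", "BMI", "Subjective body shape recognition",
        "Amount of alcohol consumption at a time", "Type of longest job",
        "Subjective health status", "Diagnosis of osteoarthritis",
        "Diagnosis of arthritis", "Daily activities",
        "Status of other cancer treatments",
    ],
    "Decision tree": [
        "Age", "BMI", "Dietary conditions", "Duration of walking",
        "Age at which alcohol consumption began", "Modified working hours",
        "Status of smoking", "Frequency of binge drinking",
        "Frequency of eating out", "Region",
    ],
    "Random forest": [
        "BMI", "Obesity status", "Age", "Subjective body shape recognition",
        "Type of longest job", "Region",
        "Age at which alcohol consumption began",
        "Standard occupation classification", "Education level",
        "Walking days in a week",
    ],
    "Extreme gradient boost": [
        "Status of drinking", "Weight control method",
        "Driving under the influence during 1 y", "Smoking cessation plan",
        "Duration of disease state", "Self-management", "Age",
        "Status of nutrition display impact", "Daily activities",
        "Status of pulmonary tuberculosis diagnosis",
    ],
}

# Count the number of algorithms whose top 10 list contains each variable.
counts = Counter(name for names in top_variables.values() for name in set(names))

# Keep variables that appear in at least 2 of the 4 lists (arbitrary threshold).
shared = {name: n for name, n in counts.items() if n >= 2}
for name, n in sorted(shared.items(), key=lambda item: (-item[1], item[0])):
    print(f"{name}: {n} of 4 algorithms")
```

Under this counting, age is the only variable that appears in all 4 lists, and BMI appears in 3.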
Determining the Number of Latent Class Layers
Four indices were used to assess the goodness of fit of the LCA models: Bayesian information criteria, sample size–adjusted Bayesian information criteria, Lo-Mendell-Rubin adjusted likelihood ratio test, and bootstrapped likelihood ratio test. We determined the number of latent classes by first checking each model’s goodness-of-fit indices. In particular, we increased the number of classes step by step, as illustrated in the following table, and used several influencing factors to reveal the presence of metabolic syndrome in single-person households, finally deciding on 4 latent classes.
Group number | Model fit indices | Classification of latent class, n (%) | |||||||
BICa | SSABICb | BLRTc | LMRd | 1 | 2 | 3 | 4 | 5 | |
1 | 20,874.18 | 20,782.18 | 0.000 | 0.000 | 508 (37.05) | 863 (62.95) | N/Ae | N/A | N/A |
2 | 2050.91 | 20,366.14 | 0.000 | 0.000 | 655 (47.78) | 298 (21.74) | 418 (30.49) | N/A | N/A |
3 | 20,276.99 | 20,089.57 | 0.000 | 0.000 | 354 (25.82) | 329 (24) | 320 (23.34) | 368 (26.8) | N/A |
4 | 20,266.98 | 20,087.23 | 0.519 | 0.506 | 310 (22.61) | 186 (13.57) | 319 (23.27) | 306 (22.32) | 250 (18.23) |
aBIC: Bayesian information criteria.
bSSABIC: sample size–adjusted Bayesian information criteria.
cBLRT: bootstrapped likelihood ratio test.
dLMR: Lo-Mendell-Rubin adjusted likelihood ratio test.
eN/A: not applicable.
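The decision on the number of latent classes can be illustrated with a common heuristic: keep adding classes while the likelihood ratio tests remain significant, and stop at the last model before they become nonsignificant. The Python sketch below applies this rule to the BLRT and LMR values in the table above. It is a sketch under stated assumptions, not the authors' documented procedure: the function name `select_n_classes` and the .05 threshold are choices made here, and the class counts of 2 to 5 are inferred from the number of populated class columns in each table row.

```python
def select_n_classes(fits, alpha=0.05):
    """Return the largest class count reached while both tests stay significant.

    `fits` holds (n_classes, blrt_p, lmr_p) tuples. A significant P value
    means the model fits better than the model with one class fewer.
    """
    chosen = None
    for n_classes, blrt_p, lmr_p in sorted(fits):
        if blrt_p < alpha and lmr_p < alpha:
            chosen = n_classes
        else:
            break  # adding another class no longer improves the fit
    return chosen


# (number of classes, BLRT P value, LMR P value) read from the table rows.
# Assumption: rows 1 to 4 correspond to models with 2 to 5 classes.
fits = [
    (2, 0.000, 0.000),
    (3, 0.000, 0.000),
    (4, 0.000, 0.000),
    (5, 0.519, 0.506),
]
print(select_n_classes(fits))
```

Applied to these values, the rule selects 4 classes, which agrees with the number of latent classes reported in the text.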
Names and Characteristics of the Latent Classes
Latent class classification variables should be selected through an in-depth consideration of prior research results to identify the factors affecting metabolic syndrome in single-person households.
Therefore, we selected sex, age, smoking, alcohol consumption, walking, obesity, hypertriglyceridemia, high blood pressure, high blood glucose, abdominal obesity, and hypo–high-density lipoproteinemia as classification variables. On the basis of the characteristics and response patterns of the subclass types classified through the LCA, we named the classes as follows: group 1: intense physical activity in early adulthood, group 2: hypertension among middle-aged female respondents, group 3: smoking and drinking among middle-aged male respondents, and group 4: obesity and abdominal obesity among middle-aged respondents.
The following 2 tables present the characteristics and names of each subclass type according to latent class.
Of the 1371 participants, groups 1, 2, 3, and 4 had 320 (23.34%), 368 (26.84%), 329 (24%), and 354 (25.82%) participants, respectively. In group 1, 300 (93.8%) of the 320 participants were in early adulthood, a much higher proportion than in the other 3 groups. Moreover, 289 (90.3%) respondents walked >3 times a week, which was substantially higher than in the other groups. All 5 diagnostic criteria for metabolic syndrome exhibited low rates, and no participant (0%) had metabolic syndrome. In group 2, 337 (91.6%) of the 368 respondents were female, and all participants were in middle adulthood. Apart from hypertension (n=238, 64.7%), the diagnostic criteria for metabolic syndrome exhibited relatively low rates, and 47 (12.8%) participants had metabolic syndrome. In group 3, 318 (96.7%) of the 329 respondents were male, a higher proportion than in the other groups, and 250 (76%) respondents were in middle adulthood. The rate of smoking was high (n=249, 75.7%), and 181 (55%) participants reported a high frequency of alcohol consumption (>2 times/wk). In addition, 72 (21.9%) respondents had metabolic syndrome. In group 4, 255 (72%) of the 354 participants were in middle adulthood. In terms of the diagnostic criteria for metabolic syndrome, 265 (74.9%) had hypertension, 306 (86.4%) were obese, 354 (100%) had abdominal obesity, and 232 (65.5%) had metabolic syndrome.
Variable | Group 1 (n=320), n (%) | Group 2 (n=368), n (%) | Group 3 (n=329), n (%) | Group 4 (n=354), n (%) | |
Sex | |||||
Male | 143 (44.7) | 31 (8.4) | 318 (96.7) | 189 (53.4) | |
Female | 177 (55.3) | 337 (91.6) | 11 (3.3) | 165 (46.6) | |
Age (years) | |||||
Early adulthood (19-39) | 300 (93.8) | 0 (0) | 79 (24) | 99 (28) | |
Middle adulthood (40-64) | 20 (6.2) | 368 (100) | 250 (76) | 255 (72) | |
Current smoking status | |||||
Smoking | 70 (21.9) | 12 (3.3) | 249 (75.7) | 128 (36.2) | |
Not smoking | 250 (78.1) | 356 (96.7) | 80 (24.3) | 226 (63.8) | |
Frequency of drinking | |||||
No drinking | 18 (5.6) | 132 (35.9) | 34 (10.3) | 76 (21.5) | |
<4 times/mo | 240 (75) | 212 (57.6) | 114 (34.7) | 183 (51.7) | |
>2 times/wk | 62 (19.4) | 24 (6.5) | 181 (55) | 95 (26.8) | |
Frequency of walking days | |||||
<3 times/wk | 31 (9.7) | 122 (33.2) | 107 (32.5) | 133 (37.6) | |
>3 times/wk | 289 (90.3) | 246 (66.8) | 222 (67.5) | 221 (62.4) | |
Obesity | |||||
Low weight | 32 (10) | 15 (4.1) | 11 (3.3) | 0 (0) | |
Normal | 197 (61.6) | 197 (53.5) | 124 (37.7) | 0 (0) | |
Overweight | 59 (18.4) | 112 (30.4) | 170 (51.7) | 48 (13.6) | |
Obese | 32 (10) | 44 (12) | 24 (7.3) | 306 (86.4) | |
Dyslipidemia | |||||
Yes | 22 (6.9) | 73 (19.8) | 171 (52) | 162 (45.8) | |
No | 298 (93.1) | 295 (80.2) | 158 (48) | 192 (54.2) | |
Hypertension | |||||
Yes | 17 (5.3) | 238 (64.7) | 164 (49.8) | 205 (57.9) | |
No | 303 (94.7) | 130 (35.3) | 165 (50.2) | 149 (42.1) | |
Hyperlipidemia | |||||
Yes | 8 (2.5) | 90 (24.5) | 166 (50.5) | 175 (49.4) | |
No | 312 (97.5) | 278 (75.5) | 163 (49.5) | 179 (50.6) | |
Abdominal obesity | |||||
Yes | 0 (0) | 9 (2.4) | 0 (0) | 354 (100) | |
No | 320 (100) | 359 (97.6) | 329 (100) | 0 (0) | |
Hypo–high-density lipoproteinemia | |||||
Yes | 53 (16.6) | 135 (36.7) | 73 (22.2) | 151 (42.7) | |
No | 267 (83.4) | 233 (63.3) | 256 (77.8) | 203 (57.3) |
Division | Group name | Participant (N=1371), n (%) | Diagnosis of metabolic syndrome, n (%) |
Group 1 | Intense physical activity in early adulthood | 320 (23.3) | 0 (0)a |
Group 2 | Hypertension among middle-aged female respondents | 368 (26.8) | 47 (12.8)b |
Group 3 | Smoking and drinking among middle-aged male respondents | 329 (24) | 72 (21.9)c |
Group 4 | Obesity and abdominal obesity among middle-aged respondents | 354 (25.8) | 232 (65.5)d |
an=320.
bn=368.
cn=329.
dn=354.
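The percentages in the table above can be reproduced from the counts with a few lines of Python. The counts are taken from the table; the variable names and the pooled figure across all 4 groups are additions made here for illustration, and the pooled figure is a derived value rather than one reported in the text.

```python
# (participants with metabolic syndrome, group size) from the table above.
groups = {
    "Group 1": (0, 320),
    "Group 2": (47, 368),
    "Group 3": (72, 329),
    "Group 4": (232, 354),
}

# Within-group prevalence of metabolic syndrome, in percent.
prevalence = {
    name: round(100 * cases / size, 1) for name, (cases, size) in groups.items()
}

# Pooled values across the 4 groups (derived here, not reported in the text).
total_cases = sum(cases for cases, _ in groups.values())
total_size = sum(size for _, size in groups.values())
overall = round(100 * total_cases / total_size, 1)

for name, value in prevalence.items():
    print(f"{name}: {value}%")
print(f"All groups: {total_cases}/{total_size} = {overall}%")
```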
Relationships Between Latent Class Groups and Metabolic Syndrome
We performed a binary LR to predict the occurrence of metabolic syndrome in the categorized latent class groups.
The regression analysis with the groups classified by the LCA as the independent variable and the occurrence of metabolic syndrome as the dependent variable was significant (χ²3=521.7, P<.001). Further, the Cox and Snell coefficient of determination (R²=0.57) indicated that the explanatory power of the model was 57%. The Hosmer-Lemeshow test results for the prediction model (χ²3=12.7, P=.49) indicated no significant difference between the observed and predicted values.
The group with intense physical activity in early adulthood (group 1) was set as the reference category. Compared with this group, groups 2, 3, and 4 were 5.09 times (95% CI 3.15-14.91; P<.001), 8.99 times (95% CI 5.74-21.72; P<.001), and 17.67 times (95% CI 14.45-25.33; P<.001) more likely to experience metabolic syndrome, respectively.
Group | B (SE) | Odds ratio (95% CI) | P value |
Intense physical activity in early adulthood | Referenceb | Reference | Reference |
Hypertension among middle-aged female respondents | 2.08 (0.22) | 5.09 (3.15-14.91) | <.001 |
Smoking and drinking among middle-aged male respondents | 2.94 (0.29) | 8.99 (5.74-21.72) | <.001 |
Obesity and abdominal obesity among middle-aged respondents | 3.75 (0.30) | 17.67 (14.45-25.33) | <.001 |
aR²=0.57, χ²3=12.7 in Hosmer-Lemeshow test; P=.49.
bSet as reference category in latent class analysis.
Discussion
Principal Findings
This study is the first to identify risk factors for metabolic syndrome in South Korean single-person households from multiple angles using LCA and machine learning techniques. The purpose of this study was to classify the risk factors for metabolic syndrome in single-person households using LCA and to identify the types and characteristics of the classified latent class. This paper describes metabolic health (BMI, body weight, body fat percentage, blood pressure, and blood sugar) among the physical and social characteristics of single-person households. There were more single-person households in middle adulthood (40-64 y) than in early adulthood (19-39 y). In this study, age, BMI, obesity, drinking, and body shape were found as potential risk factors for metabolic syndrome in single-person households. A cross-sectional study such as this is necessary because it can identify the factors that affect metabolic syndrome in single-person households in South Korea and determine which factors should be targeted through appropriate intervention [
].Existing studies on metabolic syndrome were conducted mainly among older and middle-aged adults [
- ]. Among recent studies, several studies have confirmed the presence metabolic syndrome in the younger generation, suggesting that the metabolic syndrome morbidity rate among generations with various characteristics has increased [ , ]. On the basis of this, it was found that the diversity of single-person households could not be overlooked. Importantly, it has been reported that health habits have substantial influence on metabolic syndrome [ ]. As health habits are already fixed in middle to late adulthood, it is difficult to expect changes in health behavior later; therefore, the prevention and management of metabolic syndrome in early adulthood should be considered [ - ]. Therefore, it is evident that modifying health habits is the most important step in treating or preventing metabolic syndrome.In this study, to categorize the risk factors for metabolic syndrome in adult single-person households, the LR, DT, RF, and XGBoost algorithms, which are machine learning techniques, were applied to identify factors that affect the occurrence of metabolic syndrome in adult single-person households. In this analysis, variables such as age, BMI, obesity, alcohol consumption, and subjective body shape recognition were commonly derived. This suggests that the factors identified in previous studies as affecting metabolic syndrome in adult single-person households and the factors identified by applying machine learning techniques in this study are consistent with each other [
]. It is important to actively encourage physical activity to prevent metabolic syndrome [ ]. In addition, it is necessary to develop a differentiated health management strategy using mobile health programs for single-person households in early adulthood with sustainable and compelling content relevant to their daily lives.Unlike group 1, group 2 comprised mostly female respondents, primarily in the center of middle adulthood or older. In addition, this group had low rates of smoking and obesity and a high rate of hypertension. These results were consistent with those of previous studies, which indicated that high blood pressure in middle adulthood causes metabolic syndrome [
]. In addition, the rates of normal weight and overweight were the highest and second highest, respectively, in this group, which is consistent with the study by Kang et al [ ], which reported that physical activity reduces hypertension and prevents metabolic syndrome among female individuals. This finding suggests that high blood pressure is an important risk factor for developing metabolic syndrome in single-person households [ ].Hypertension was an important risk factor, as seen in group 2. Thus, to prevent metabolic syndrome in group 2, it is important to develop and implement intervention programs for reducing blood pressure through diet and exercise therapy programs, encourage physical activity, and reduce obesity [
, ].
In group 3, the proportion of male respondents was significantly higher. In addition, the rates of smoking and obesity and the frequency of alcohol consumption were the highest among all the groups. Moreover, sex and age were important risk factors for metabolic syndrome, which is consistent with the large proportion of middle-aged respondents in group 3. This group also exhibited characteristics of typical middle-aged workers, indicating the need to observe and manage smoking and alcohol consumption, especially among office workers [
]. These findings are consistent with those of Oh [ ], who reported that smoking promotes metabolic syndrome, whereas smoking cessation helps prevent it among middle-aged male individuals. Thus, alcohol consumption and smoking were important risk factors for metabolic syndrome in group 3.
In this group, 21.9% (72/329) of the participants had metabolic syndrome, and this group was 8.99 times more likely to have metabolic syndrome than group 1. This corroborates the findings of Oh [
], as those in middle adulthood are more likely to be exposed to hypertension, hyperlipidemia, smoking, and alcohol consumption; hence, this group requires close monitoring and preventive nursing interventions. Moreover, although stress often leads to a desire to smoke and compels ex-smokers to begin smoking again, it is not fully clear why smoking cessation is so difficult [ , ]. Therefore, nursing interventions are needed to increase the motivation to quit smoking.
Further, another study found that the greater the stress, the higher the risk of health problems, such as smoking and depression [
, ]. Higher nicotine dependence suggests that smoking may become a maladaptive coping response when psychological problems such as stress and depression are not properly managed [ , , ]. In addition, as Korean populations are often exposed to smoking when dining together and drinking socially, it is necessary to establish a culture of smoking cessation and to change dining customs.
In group 4, the proportions of male respondents and female respondents were similar, with a high proportion of respondents in middle adulthood. Further, all respondents in the group exhibited obesity (based on the respondents’ BMI) or abdominal obesity (based on the respondents’ waist circumference). Obesity is also associated with the development of insulin resistance and beta-cell dysfunction, regardless of whether it is accompanied by abdominal obesity, which is consistent with prior literature [
, ]. Our results are also consistent with a report by Detournay et al [ ], which revealed that obesity and abdominal obesity during female menopause may cause metabolic syndrome.
In group 4, metabolic syndrome was present in 65.5% (232/354) of the respondents, and this group was 17.67 times more likely to have metabolic syndrome than group 1. Moreover, the rates of hypertension, hyperglycemia, abdominal obesity, and hypo–high-density lipoproteinemia were higher than those in the other groups. As meeting at least 3 of the 5 criteria is an important basis for diagnosing metabolic syndrome, the clustering of these components in group 4 is critical [
].
This study’s LCA demonstrated that heterogeneous subgroups exist depending on metabolic syndrome risk factors, in contrast to most previous studies, which focused on specific metabolic syndrome risk factors. Our findings indicate that certain risk factors may have more prominent effects and may affect certain age groups more strongly. Moreover, obesity and abdominal obesity were the most influential risk factors for metabolic syndrome in single-person households.
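As a rough check on the group-level figures reported above, the prevalence in each group and a crude odds ratio between two groups can be recomputed from the published counts. The sketch below is illustrative only. The function names are hypothetical, and it compares group 4 with group 3 because the counts for group 1 (the reference group for the reported values of 8.99 and 17.67) are not given in this section. The crude ratio it produces is therefore not one of the study's estimates and is not comparable to them.

```python
def prevalence(cases: int, total: int) -> float:
    """Proportion of respondents in a group who have metabolic syndrome."""
    if total <= 0 or not 0 <= cases <= total:
        raise ValueError("need 0 <= cases <= total and total > 0")
    return cases / total


def crude_odds_ratio(cases_a: int, total_a: int, cases_b: int, total_b: int) -> float:
    """Unadjusted odds ratio of group A versus group B from raw counts."""
    non_cases_a = total_a - cases_a
    non_cases_b = total_b - cases_b
    if min(cases_a, non_cases_a, cases_b, non_cases_b) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")
    return (cases_a / non_cases_a) / (cases_b / non_cases_b)


# Counts reported in the Discussion: group 3 = 72/329, group 4 = 232/354.
group3_prev = prevalence(72, 329)
group4_prev = prevalence(232, 354)
or_group4_vs_group3 = crude_odds_ratio(232, 354, 72, 329)

print(f"Group 3 prevalence: {group3_prev:.1%}")
print(f"Group 4 prevalence: {group4_prev:.1%}")
print(f"Crude odds ratio, group 4 vs group 3: {or_group4_vs_group3:.2f}")
```

With the reported counts, this reproduces the 21.9% and 65.5% prevalences and gives an unadjusted group 4 versus group 3 odds ratio of about 6.79.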
A national policy to promote physical activity is needed to prevent and manage metabolic syndrome in single-person households. In addition, intervention programs that use mobile health and wearable devices are needed to enable physical activity at any time and in any place; such programs would naturally integrate physical activity into daily life. Moreover, rather than treating individuals with metabolic syndrome risk factors as a homogeneous group and applying interventions collectively, groups that share the same characteristics should be efficiently classified, and customized, risk-based interventions should be developed and implemented according to the characteristics and needs of each subgroup. Such interventions can be made much more effective if they incorporate strategies targeting each of the various risk factors for metabolic syndrome.
Limitations
This study has several limitations. First, the range of variables that could be drawn from the NHANES questionnaire was limited: because the survey questions changed from year to year, only data that matched the questions across all 10 survey years were extracted. Second, because the survey participants differed every year, the longitudinal changes and progression of metabolic syndrome were difficult to track. Third, although various machine learning techniques were used in this study, artificial neural networks, which are widely used, were not applied. Future research should apply deep learning methods such as artificial neural networks.
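The harmonization step described in the first limitation, keeping only items that appear in every survey year, amounts to a set intersection over the yearly variable lists. The following is a minimal sketch, assuming each year's questionnaire is available as a list of variable names. The function names, years, and variable names are hypothetical placeholders, not the actual NHANES item codes or the variables used in this study.

```python
from __future__ import annotations


def common_variables(variables_by_year: dict[int, list[str]]) -> list[str]:
    """Return, in sorted order, the variables present in every survey year."""
    if not variables_by_year:
        return []
    yearly_sets = [set(names) for names in variables_by_year.values()]
    return sorted(set.intersection(*yearly_sets))


def dropped_variables(variables_by_year: dict[int, list[str]]) -> dict[str, list[int]]:
    """Map each variable that is missing in some year to the years lacking it."""
    all_names = set().union(*variables_by_year.values()) if variables_by_year else set()
    missing: dict[str, list[int]] = {}
    for name in sorted(all_names):
        years = sorted(y for y, names in variables_by_year.items() if name not in names)
        if years:
            missing[name] = years
    return missing


# Hypothetical questionnaires for three survey years (illustration only).
example = {
    2009: ["age", "sex", "bmi", "smoking_status", "sleep_hours"],
    2010: ["age", "sex", "bmi", "smoking_status"],
    2011: ["age", "sex", "bmi", "smoking_status", "stress_level"],
}

print(common_variables(example))
print(dropped_variables(example))
```

Reporting the dropped variables alongside the retained ones makes explicit which candidate risk factors were lost to questionnaire changes.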
Conclusions
This study is significant in that it is the first to use LCA and machine learning techniques to identify the types and characteristics of latent subgroups, classified according to metabolic syndrome risk factor indicators, in adult single-person households. We conducted a secondary analysis of data (2009-2018) from the NHANES conducted by the Korea Centers for Disease Control and Prevention, through which we classified and characterized risk factors for metabolic syndrome in adult single-person households.
In this study, machine learning techniques were applied to identify the factors affecting metabolic syndrome in adult single-person households, namely the variables that the algorithms ranked as highly important. In addition, the groups classified using LCA based on risk factors for metabolic syndrome in adult single-person households were intense physical activity in early adulthood, hypertension in middle-aged female respondents, smoking and drinking in middle-aged male respondents, and obesity and abdominal obesity in middle-aged respondents. When the differences among the latent classes were examined according to the factors influencing metabolic syndrome, the 4 latent classes showed substantial differences in general characteristics such as education level, income level, frequency of dining out, dietary habits, subjective health status, and subjective body shape recognition. Furthermore, when the occurrence of metabolic syndrome was predicted for each group, the group characterized by obesity and abdominal obesity in middle-aged respondents had the highest probability, indicating that it was the most susceptible, highest-risk group for metabolic syndrome.
This study is meaningful as a new attempt to identify the factors influencing metabolic syndrome in adult single-person households by applying machine learning techniques, categorize risk factors for metabolic syndrome using LCA, and identify the characteristics of each latent class. Therefore, this study provides new knowledge and contributes to the prevention of metabolic syndrome in adult single-person households by identifying 4 latent classes through LCA and thus facilitating the development of customized interventions.
Acknowledgments
This study was supported by a National Research Foundation of Korea grant funded by the Korean government (Ministry of Science and ICT; 2021R1A2C2095271).
Data Availability
The data sets generated and analyzed during this study are available from the corresponding author upon reasonable request.
Authors' Contributions
All authors had full access to all the data and take responsibility for the integrity of the data and accuracy of the data analysis. JSL and SKL contributed to the conceptualization and design of the study. JSL and SKL contributed to the acquisition and statistical analysis of the data. JSL and SKL contributed to the interpretation of the data and drafting of the manuscript. SKL critically revised the manuscript for important intellectual content, obtained funding, and supervised the study. SKL provided administrative, technical, and material support.
Conflicts of Interest
None declared.
References
- Zheng X, Yu H, Qiu X, Chair SY, Wong EM, Wang Q. The effects of a nurse-led lifestyle intervention program on cardiovascular risk, self-efficacy and health promoting behaviours among patients with metabolic syndrome: randomized controlled trial. Int J Nurs Stud. Sep 2020;109:103638.
- Fujiki H. The use of noncash payment methods for regular payments and the household demand for cash: evidence from Japan. Jpn Econ Rev. Jul 31, 2020;71(4):719-765.
- Lee HN. The social service needs of single-person households and their policy implications. Health Welfare Pol Forum. Oct 01, 2020;288:21-35.
- Kawano T, Moriki G, Bono S, Kaji N, Jung H. Effects of household composition on health-related quality of life among the Japanese middle-aged and elderly: analysis from a gender perspective. Jpn J Soc Welfare. 2020;60(5):1-12.
- Kang SH, Park JY. Factors affecting the life satisfaction of unmarried one-person households according to marital experience. J Fam Res Manag Policy Rev. 2020;24:21-39.
- Rantanen AT, Korkeila JJ, Kautiainen H, Korhonen PE. Non-melancholic depressive symptoms increase risk for incident cardiovascular disease: a prospective study in a primary care population at risk for cardiovascular disease and type 2 diabetes. J Psychosom Res. Feb 2020;129:109887.
- Ahn JH, Park YK. Frequency of eating alone and health related outcomes in Korean adults: based on the 2016 Korea National Health and Nutrition Examination Survey. J Korean Diet Assoc. 2020;26(2):85-100.
- Mi-ree B, Min B, Minjin P. An empirical analysis of the spatial distribution and flow patterns of Seoul’s single-person households. J Popul Stud. 2019;42:91-119.
- Lee YB. One-person households and their policy implications. Health Welf Policy Forum. 2017;252(1):64-77.
- Smiley A, King D, Bidulescu A. The association between sleep duration and metabolic syndrome: the NHANES 2013/2014. Nutrients. Oct 26, 2019;11(11):2582.
- Rogers JM. Smoking and pregnancy: epigenetics and developmental origins of the metabolic syndrome. Birth Defects Res. Oct 15, 2019;111(17):1259-1269.
- Gossett LK, Johnson HM, Piper ME, Fiore MC, Baker TB, Stein JH. Smoking intensity and lipoprotein abnormalities in active smokers. J Clin Lipidol. Dec 2009;3(6):372-378.
- Shin M. Comparative study on health behavior and mental health between one person and multi-person households: analysis of data from the national health and nutrition examination surveys (2013, 2015, 2017). J Korean Soc Wellness. Nov 30, 2019;14:11-23.
- Detournay B, Fagnani F, Phillippo M, Pribil C, Charles MA, Sermet C, et al. Obesity morbidity and health care costs in France: an analysis of the 1991-1992 Medical Care Household Survey. Int J Obes Relat Metab Disord. Feb 3, 2000;24(2):151-155.
- Lin CH, Chiang SL, Heitkemper MM, Hung YJ, Lee MS, Tzeng WC, et al. Effects of telephone-based motivational interviewing in lifestyle modification program on reducing metabolic risks in middle-aged and older women with metabolic syndrome: a randomized controlled trial. Int J Nurs Stud. Aug 2016;60:12-23.
- Kurl S, Laukkanen JA, Niskanen L, Laaksonen D, Sivenius J, Nyyssönen K, et al. Metabolic syndrome and the risk of stroke in middle-aged men. Stroke. Mar 2006;37(3):806-811.
- Kim MH, Lee SH, Shin KS, Son DY, Kim SH, Joe H, et al. The change of metabolic syndrome prevalence and its risk factors in Korean adults for decade: Korea national health and nutrition examination survey for 2008–2017. Korean J Fam Pract. Feb 20, 2020;10(1):44-52.
- Harrison S, Couture P, Lamarche B. Diet quality, saturated fat and metabolic syndrome. Nutrients. Oct 22, 2020;12(11):3232.
- Akhlaghi M. Dietary Approaches to Stop Hypertension (DASH): potential mechanisms of action against risk factors of the metabolic syndrome. Nutr Res Rev. Jul 30, 2019;33(1):1-18.
- Bansal R, Gubbi S, Muniyappa R. Metabolic syndrome and COVID 19: endocrine-immune-vascular interactions shapes clinical course. Endocrinology. Oct 01, 2020;161(10):bqaa112.
- Ezzati M, Hoorn SV, Rodgers A, Lopez AD, Mathers CD, Murray CJ, et al. Comparative Risk Assessment Collaborating Group. Estimates of global and regional potential health gains from reducing multiple major risk factors. Lancet. Jul 26, 2003;362(9380):271-280.
- Muthen B, Muthen LK. Integrating person-centered and variable-centered analyses: growth mixture modeling with latent trajectory classes. Alcohol Clin Exp Res. Jun 2000;24(6):882-891.
- Collins LM, Wugalter SE. Latent class models for stage-sequential dynamic latent variables. Multivar Behav Res. Jan 1992;27(1):131-157.
- Nylund K, Bellmore A, Nishina A, Graham S. Subtypes, severity, and structural stability of peer victimization: what does latent class analysis say? Child Dev. Nov 2007;78(6):1706-1722.
- Kaptein KI, de Jonge P, van den Brink RH, Korf J. Course of depressive symptoms after myocardial infarction and cardiac prognosis: a latent class analysis. Psychosom Med. 2006;68(5):662-668.
- Locklear LR, Taylor SG, Ambrose ML. How a gratitude intervention influences workplace mistreatment: a multiple mediation model. J Appl Psychol. Sep 17, 2020;106(9):1314-1331.
- Alyass A, Turcotte M, Meyre D. From big data analysis to personalized medicine for all: challenges and opportunities. BMC Med Genomics. Jun 27, 2015;8(1):33.
- Wang J, Chen H, Wang H, Liu W, Peng D, Zhao Q, et al. A risk prediction model for physical restraints among older Chinese adults in long-term care facilities: machine learning study. J Med Internet Res. Apr 06, 2023;25:e43815.
- Dong Y, Yeo MC, Tham XC, Danuaji R, Nguyen TH, Sharma AK, et al. Investigating psychological differences between nurses and other health care workers from the Asia-Pacific region during the early phase of COVID-19: machine learning approach. JMIR Nurs. Jun 01, 2022;5(1):e32647.
- Xie W, Ji M, Liu Y, Hao T, Chow CY. Predicting writing styles of web-based materials for children's health education using the selection of semantic features: machine learning approach. JMIR Med Inform. Jul 22, 2021;9(7):e30115.
- Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Executive summary of the third report of the National Cholesterol Education Program (NCEP) expert panel on detection, evaluation, and treatment of high blood cholesterol in adults (Adult Treatment Panel III). JAMA. May 16, 2001;285(19):2486-2497.
- Kim WJ, Lee SY. A latent class analysis and predictors of chronic diseases - based on 2014 Korea National Health and Nutrition Examination Survey. J Korea Acad Ind Coop Soc. Jun 30, 2018;19(6):324-333.
- West SL, Bates H, Watson J, Brenner IK. Discriminating metabolic health status in a cohort of nursing students: protocol for a cross-sectional study. JMIR Res Protoc. Aug 28, 2020;9(8):e21342.
- Tylutka A, Morawin B, Walas Ł, Michałek M, Gwara A, Zembron-Lacny A. Assessment of metabolic syndrome predictors in relation to inflammation and visceral fat tissue in older adults. Sci Rep. Jan 03, 2023;13(1):89.
- Andica C, Kamagata K, Takabayashi K, Kikuta J, Kaga H, Someya Y, et al. Neuroimaging findings related to glymphatic system alterations in older adults with metabolic syndrome. Neurobiol Dis. Feb 2023;177:105990.
- Ntougou Assoumou HG, Pichot V, Barthelemy JC, Celle S, Garcin A, Thomas T, et al. Obesity related to metabolic syndrome: comparison of obesity indicators in an older French population. Diabetol Metab Syndr. May 11, 2023;15(1):98.
- Lakka HM, Laaksonen DE, Lakka TA, Niskanen LK, Kumpusalo E, Tuomilehto J, et al. The metabolic syndrome and total and cardiovascular disease mortality in middle-aged men. JAMA. Dec 04, 2002;288(21):2709-2716.
- Amadou C, Heude B, de Lauzon-Guillain B, Lioret S, Descarpentrie A, Ribet C, et al. Early origins of metabolic and overall health in young adults: an outcome-wide analysis in a general cohort population. Diabetes Metab. Mar 2023;49(2):101414.
- Lee JS, Kang MA, Lee SK. Effects of the e-Motivate4Change program on metabolic syndrome in young adults using health apps and wearable devices: quasi-experimental study. J Med Internet Res. Jul 30, 2020;22(7):e17031.
- Lee SK, Moon M. Factors influencing metabolic syndrome perception and exercising behaviors in Korean adults: data mining approach. J Korea Acad Ind Coop Soc. Dec 31, 2017;18(12):581-588.
- Kang JS, Kang HS, Jeong Y. A web-based health promotion program for patients with metabolic syndrome. Asian Nurs Res (Korean Soc Nurs Sci). Mar 2014;8(1):82-89.
- Petrella RJ, Stuckey MI, Shapiro S, Gill DP. Mobile health, exercise and metabolic risk: a randomized controlled trial. BMC Public Health. Oct 18, 2014;14:1082.
- Chen SM, Creedy D, Lin HS, Wollin J. Effects of motivational interviewing intervention on self-management, psychological and glycemic outcomes in type 2 diabetes: a randomized controlled trial. Int J Nurs Stud. Jun 2012;49(6):637-644.
- Wang HJ, Shi LZ, Liu CF, Liu SM, Shi ST. Association between uric acid and metabolic syndrome in elderly women. Open Med (Wars). 2018;13:172-177.
- Jung CH, Park JS, Lee WY, Kim SW. Effects of smoking, alcohol, exercise, level of education, and family history on the metabolic syndrome in Korean adults. Korean J Med. Dec 2002;63(6):649-659.
- Oh JE. Association between smoking status and metabolic syndrome in men. Korean J Obes. 2014;23(2):99-105.
- Zhang B, Pan B, Zhao X, Fu Y, Li X, Yang A, et al. The interaction effects of smoking and polycyclic aromatic hydrocarbons exposure on the prevalence of metabolic syndrome in coke oven workers. Chemosphere. May 2020;247:125880.
Abbreviations
DT: decision tree
LCA: latent class analysis
LR: logistic regression
NHANES: National Health and Nutrition Examination Survey
RF: random forest
XGBoost: extreme gradient boost
Edited by A Mavragani; submitted 16.09.22; peer-reviewed by S Sharma, K Gupta, F Gomez; comments to author 05.04.23; revised version received 24.04.23; accepted 17.08.23; published 12.09.23.
Copyright © Ji-Soo Lee, Soo-Kyoung Lee. Originally published in JMIR Formative Research (https://formative.jmir.org), 12.09.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.