Published in Vol 7 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/42756.
Identification of Risk Groups for and Factors Affecting Metabolic Syndrome in South Korean Single-Person Households Using Latent Class Analysis and Machine Learning Techniques: Secondary Analysis Study


Authors of this article:

Ji-Soo Lee 1; Soo-Kyoung Lee 2

Original Paper

1Department of Nursing, Gimcheon University, Gimcheon-si, Republic of Korea

2Big Data Convergence and Open Sharing System, Seoul National University, Seoul, Republic of Korea

*all authors contributed equally

Corresponding Author:

Soo-Kyoung Lee, RN, PhD

Big Data Convergence and Open Sharing System

Seoul National University

1 Gwanak-ro

Gwanak-gu

Seoul, 08826

Republic of Korea

Phone: 82 2 889 5710

Fax: 82 2 889 5711

Email: soo1005s@gmail.com


Background: The rapid increase of single-person households in South Korea is leading to an increase in the incidence of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases, due to lifestyle changes. It is necessary to analyze the complex effects of metabolic syndrome risk factors in South Korean single-person households, which differ from one household to another, considering the diversity of single-person households.

Objective: This study aimed to identify the factors affecting metabolic syndrome in single-person households using machine learning techniques and categorically characterize the risk factors through latent class analysis (LCA).

Methods: This cross-sectional study included 10-year secondary data obtained from the National Health and Nutrition Examination Survey (2009-2018). We selected 1371 participants belonging to single-person households. Data were analyzed using SPSS (version 25.0; IBM Corp), Mplus (version 8.0; Muthen & Muthen), and Python (version 3.0; Plone & Python). We applied 4 machine learning algorithms (logistic regression, decision tree, random forest, and extreme gradient boost) to identify important factors and then applied LCA to categorize the risk groups of metabolic syndromes in single-person households.

Results: Through LCA, participants were classified into 4 groups (group 1: intense physical activity in early adulthood, group 2: hypertension among middle-aged female respondents, group 3: smoking and drinking among middle-aged male respondents, and group 4: obesity and abdominal obesity among middle-aged respondents). In addition, age, BMI, obesity, subjective body shape recognition, alcohol consumption, smoking, binge drinking frequency, and job type were investigated as common factors that affect metabolic syndrome in single-person households through machine learning techniques. Group 4 was the most susceptible and at-risk group for metabolic syndrome (odds ratio 17.67, 95% CI 14.5-25.3; P<.001), and obesity and abdominal obesity were the most influential risk factors for metabolic syndrome.

Conclusions: This study identified risk groups and factors affecting metabolic syndrome in single-person households through machine learning techniques and LCA. Through these findings, customized interventions for each generational risk factor for metabolic syndrome can be implemented, leading to the prevention of metabolic syndrome, which causes cardiovascular and cerebrovascular diseases. In conclusion, this study contributes to the prevention of metabolic syndrome in single-person households by providing new insights and priority groups for the development of customized interventions using classification.

JMIR Form Res 2023;7:e42756

doi:10.2196/42756



Background

Single-person households have rapidly increased from 9% in 1990 to 29.3% in 2018, accounting for nearly one-third of all South Korean households [1], and are estimated to reach approximately 36.3% by 2045 [2]. This increasing trend is also evident worldwide, including in the United States (26.7%), Australia (23.9%), and Japan (32.4%) [3,4].

The reasons for this rising trend include the large number of unmarried people and late marriages resulting from changes in marital values, as well as divorce, separation, high unemployment, and diverse and complex social factors in larger cities [5]. On the basis of individuals’ sociodemographic characteristics and lifestyle, single-person households are more susceptible than multiperson households to high-risk health behaviors, such as smoking and alcohol consumption, as well as to depression and stress [6-8].

Adult single-person households are known to differ distinctly from multiperson households in terms of demographic characteristics and living habits. For instance, it has been reported that single-person households are more susceptible to health problems than multiperson households [9-11]. In addition, compared with multiperson households, single-person households are more exposed to high-risk health behaviors, such as smoking and drinking, and experience more depression and stress [12,13].

These sociodemographic characteristics and lifestyles indicate that single-person households have a higher prevalence of metabolic syndrome and chronic diseases, such as hypertension, diabetes, dyslipidemia, arthritis, asthma, myocardial infarction, and stroke [14-16].

Metabolic syndrome, which raises the risk of cardiovascular disease and diabetes [6], is defined by the presence of at least 3 of the following clinical characteristics: hypertension, hyperglycemia, hypertriglyceridemia, low levels of high-density lipoprotein cholesterol, and abdominal obesity [6,15]. It also increases the occurrence of myocardial infarction, stroke, and dementia [1,6,11,17]; therefore, it is important to decrease the incidence of metabolic syndrome to prevent chronic cardiac and cerebrovascular diseases and reduce the mortality rate [18,19].

It is also necessary to assess the morbidity associated with the disease and develop customized medications and guidelines to manage its risk factors [20]. Previous studies have demonstrated that risk factors include age, sex, obesity, smoking, a lack of physical activity, and education [4-7]. Because single-person households have diverse characteristics, the influence of these factors on metabolic syndrome may differ from that in multiperson households and across age groups [1,3]. This necessitates a more holistic and systematic understanding of the metabolic syndrome risk factors in single-person households [1], as each risk factor may have a distinct or an interrelated effect on metabolic syndrome depending on individual characteristics [21].

Latent class analysis (LCA), a person-centered approach, examines the multidimensional characteristics of human behavior; it contrasts with a conventional variable-centered approach, which describes predictors’ relative influence on outcome variables [22-24]. In addition, identifying the patient type and characteristics is advantageous in predicting the disease, and a customized intervention program can be planned according to individual risk factor vulnerabilities and diagnosis [25-27]. Machine learning refers to methods that automatically extract general rules or new knowledge from given data by implementing in machines the ability to learn, a characteristic function of human intelligence [28,29]. In this study, the factors affecting metabolic syndrome in South Korean single-person households were analyzed using logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boost (XGBoost). LR, DT, and RF are the most commonly used machine learning techniques, and XGBoost is a machine learning technique that has recently emerged [27-29].

Few studies have applied machine learning and LCA to identify the factors affecting metabolic syndrome in single-person households [23,24,30]. This study therefore aimed to identify these factors using machine learning with large-scale health data from the National Health and Nutrition Examination Survey (NHANES) [30]. The contribution of this study lies not in providing a definitive answer but in finding new variables or overlooked aspects through basic or translational research for clinical application. The core value of translational research lies in its effort to apply basic research to clinical practice with a high success rate, at a low cost, and within a short period.

Hence, this study was designed to establish basic data to develop customized interventions by categorizing and characterizing metabolic syndrome risk factors in South Korean single-person households using machine learning techniques and LCA.

Purpose of This Study

This study used data from the NHANES spanning 10 years (2009-2018), applied machine learning techniques to identify the factors that affect the occurrence of metabolic syndrome, and applied LCA to classify single-person households. The purpose of this study was to categorize risk groups and identify risk factors for metabolic syndrome in South Korean single-person households.


Research Design

This study was a secondary data analysis that used machine learning techniques to identify the factors influencing the occurrence of metabolic syndrome in single-person households and LCA to categorize metabolic syndrome risk groups. The overall flowchart of the study is shown in Figure 1.

Figure 1. Overall flowchart of this study. DT: decision tree; LR: logistic regression; NHANES: National Health and Nutrition Examination Survey; RF: random forest; XGBoost: extreme gradient boost.

Participants of the Study

This study used raw data from the 10-year NHANES (2009-2018), conducted by the Ministry of Health and Welfare and the Korea Centers for Disease Control and Prevention, for a secondary data analysis. The South Korean NHANES generates data representative of the South Korean population using stratified cluster sampling. The total number of respondents was 83,294, of whom 3577 (4.29%) lived in single-person households and 79,717 (95.71%) lived in households with ≥2 persons. Of the 3577 respondents from single-person households, 1371 (38.33%) were finally selected as study participants after excluding older adults (aged ≥65 y) and those with missing data.

Data Set

We used the health questionnaire from the NHANES’s fourth (2009), fifth (2010-2012), sixth (2013-2015), and seventh terms (2016-2018).

General Characteristics

We selected the following general characteristics as variables: sex (male or female), age (early adulthood, ie, 19-39 y of age, and middle adulthood, ie, 40-64 y of age), educational level (lower than high school to higher than an undergraduate [4-year] college degree), marital status (married or unmarried), income level, and economic activity status (active or inactive).

Health Behavior

We selected smoking behavior (smoking or nonsmoking); alcohol consumption (abstaining, <4 times/mo, or >2 times/wk); exercise, such as walking (<3 times/wk or >3 times/wk); subjective recognition of body type (very thin, slightly thin, normal, slightly obese, or very obese); subjective health status (very good, good, normal, bad, or very bad); and obesity status indicated by BMI (<18.5=underweight, 18.5-22.9=normal, 23-24.9=at risk, or ≥25=obese).
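The BMI cutoffs listed above can be expressed as a small classification function. The following is a minimal sketch rather than the authors' code: the function name `bmi_category` is hypothetical, and values falling between 22.9 and 23 are assigned to the normal category, which is an assumption about how the published ranges were rounded.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify BMI (kg/m^2) with the cutoffs given in the text."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:   # 18.5-22.9 in the text (boundary handling assumed)
        return "normal"
    if bmi < 25.0:   # 23-24.9 in the text
        return "at risk"
    return "obese"   # >=25 in the text


if __name__ == "__main__":
    print(bmi_category(70, 1.75))
```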

Eating Habits

We administered a questionnaire to determine how frequently respondents dined out (5 times/wk, 1-4 times/wk, and <3 times/mo) and their dietary lifestyle (“good” or “bad”).

Mental Health

We assessed respondents’ awareness of stress (recognition or nonrecognition) and diagnoses of depression (diagnosed or undiagnosed).

Use of Medical Institutions and Community Services

We classified participants according to whether they had undergone health checkups, cancer checkups, and oral examinations using “yes” or “no” responses and included the type of health insurance (local, employment-related, or uninsured or self-paying medical care) and subscription to private medical insurance (registered or unregistered for private medical insurance).

Metabolic Syndrome

We determined the presence of metabolic syndrome based on the National Cholesterol Education Program–Adult Treatment Panel 3 diagnostic criteria [31] and whether respondents possessed ≥3 of the following 5 criteria: hypertension, hyperglycemia, hypertriglyceridemia, hypo–high-density lipoproteinemia, and abdominal obesity. Waist circumference, triglycerides, high-density lipoprotein cholesterol levels, final systolic and diastolic blood pressures (mean of the second and third measurements), and fasting blood glucose level were used to determine the existence of metabolic syndrome.
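The rule described above (at least 3 of 5 criteria, with blood pressure taken as the mean of the second and third measurements) can be sketched as follows. This is an illustration only: the class and field names are hypothetical, and the numeric cutoffs in `CUTOFFS` are placeholders based on commonly cited Adult Treatment Panel 3-style values. The paper does not list its cutoffs, so they should be replaced with the values from the cited criteria [31].

```python
from dataclasses import dataclass

# Placeholder cutoffs (ASSUMED, not taken from the paper).
CUTOFFS = {
    "waist_cm": {"male": 102.0, "female": 88.0},  # exceeded => abdominal obesity
    "tg": 150.0,                                  # mg/dL, at or above
    "hdl": {"male": 40.0, "female": 50.0},        # mg/dL, below
    "sbp": 130.0,                                 # mm Hg, at or above
    "dbp": 85.0,                                  # mm Hg, at or above
    "glucose": 110.0,                             # mg/dL, at or above
}


@dataclass
class Exam:
    sex: str               # "male" or "female"
    waist_cm: float
    triglycerides: float
    hdl: float
    sbp2: float            # second systolic reading
    sbp3: float            # third systolic reading
    dbp2: float            # second diastolic reading
    dbp3: float            # third diastolic reading
    fasting_glucose: float


def count_criteria(e: Exam, c: dict = CUTOFFS) -> int:
    """Count how many of the 5 diagnostic criteria are met."""
    sbp = (e.sbp2 + e.sbp3) / 2   # mean of 2nd and 3rd measurements
    dbp = (e.dbp2 + e.dbp3) / 2
    flags = [
        e.waist_cm > c["waist_cm"][e.sex],      # abdominal obesity
        e.triglycerides >= c["tg"],             # hypertriglyceridemia
        e.hdl < c["hdl"][e.sex],                # low HDL cholesterol
        sbp >= c["sbp"] or dbp >= c["dbp"],     # hypertension
        e.fasting_glucose >= c["glucose"],      # hyperglycemia
    ]
    return sum(flags)


def has_metabolic_syndrome(e: Exam) -> bool:
    """Metabolic syndrome is present when >=3 of the 5 criteria are met."""
    return count_criteria(e) >= 3
```

Keeping the cutoffs in one dictionary makes it straightforward to swap in population-specific values without touching the counting logic.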

Data Collection Method

We submitted our affiliation and purpose of using the data to the Korea Disease Control and Prevention Agency’s data portal and then used the data, which contained no personal information.

Data Preprocessing

After sampling and merging the 10-year data from the NHANES, we conducted a data-cleansing process, and the distribution of variables was confirmed using the missing values function of the SPSS software (version 25.0; IBM Corp) to identify both outliers and missing data [32].

In this study, data from a total of 83,294 individuals who participated in the 10-year (2009-2018) NHANES were extracted. After extracting the cases in which the number of household members (variable name: cfam) was “1,” we identified 3577 (4.29%) respondents from single-person households between 2009 and 2018. Of these 3577 individuals, 1371 (38.33%) were finally selected after excluding older adults (aged ≥65 y) and those with missing values.

During data cleansing of the merged 10-year NHANES files, a total of 1182 cases with defects were processed and deleted. For the analysis, age, a continuous variable, was converted into a categorical variable, and a new metabolic syndrome variable was created to indicate the presence of at least 3 of the following: hypertension, hyperglycemia, hypo–high-density lipoproteinemia, hypertriglyceridemia, and abdominal obesity. Metabolic syndrome was defined according to the National Cholesterol Education Program–Adult Treatment Panel 3 diagnostic criteria [31].
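The selection step described here can be sketched as a simple filter. The variable name `cfam` comes from the text; the `age` key, the use of `None` to mark missing values, and the function names are assumptions made for illustration. The 19-64 y age range follows the age categories defined under General Characteristics.

```python
def select_participants(records):
    """Keep single-person households aged 19-64 y with no missing values."""
    selected = []
    for r in records:
        if r.get("cfam") != 1:                      # not a single-person household
            continue
        age = r.get("age")
        if age is None or not 19 <= age <= 64:      # excludes those aged >=65 y
            continue
        if any(v is None for v in r.values()):      # any missing value
            continue
        selected.append(r)
    return selected


def pct(part, whole):
    """Percentage rounded to 2 decimals, as reported in the text."""
    return round(100 * part / whole, 2)


if __name__ == "__main__":
    print(pct(3577, 83294))  # share of single-person households
    print(pct(1371, 3577))   # share retained after exclusions
```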

To analyze the factors influencing metabolic syndrome, we applied the LR, DT, RF, and XGBoost machine learning algorithms to the 10-year NHANES data, which contained a total of 7450 variables. Of these, 390 variables were common to the survey years from 2009 to 2018. After 154 items were excluded, such as those related to the 5 diagnostic criteria for metabolic syndrome and those not applicable to the study participants, the remaining variables were analyzed as factors influencing the occurrence of metabolic syndrome in single-person households, with the variable MetS (metabolic syndrome or not) as the outcome. The data were split using the 10-fold cross-validation method.
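As an illustration of the 10-fold cross-validation mentioned above, the following sketch generates training and test index sets using only the standard library. It is not the authors' pipeline: the function name, the shuffling seed, and the way folds are formed are assumptions, and fitting LR, DT, RF, or XGBoost on each split is left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)           # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]      # k nearly equal folds
    for i, test in enumerate(folds):
        # Every fold except the i-th forms the training set.
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(1371, k=10):
        print(len(train), len(test))
```

Each observation appears in exactly one test fold, so every participant contributes once to the held-out evaluation.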

Statistical Analysis

Data were analyzed using SPSS, Mplus (version 8.0; Muthen & Muthen), and Python (version 3.0; Plone & Python).

Ethical Considerations

We performed data analysis after obtaining approval from Keimyung University’s ethics committee for an exemption from deliberation (institutional review board number 40525-202008-HR-043-01) because we used existing data or published documents instead of directly engaging with participants.


Respondents’ General Characteristics

Of the 1371 respondents, 681 (49.67%) were male, 893 (65.13%) were middle-aged adults, and 807 (58.86%) had a high school education or less. Further, 990 (72.21%) were economically active, and the largest income group was the lower intermediate level (n=384, 28.01%). In addition, 705 (51.42%) were married, 932 (67.98%) were nonsmokers, and 749 (54.63%) consumed alcohol <4 times a month. Moreover, 938 (68.42%) walked >3 times a week, 518 (37.78%) had a normal body weight according to BMI, 668 (48.72%) rated their subjective health as normal, and 541 (39.46%) perceived their body shape as normal.

In addition, of the 1371 respondents, 602 (43.91%) and 778 (56.75%) respondents indicated that their father and mother had an elementary school education, respectively. Regarding mental health, 930 (67.83%) of the 1371 respondents were not aware of stress, and 1247 (90.96%) of the 1371 respondents were not diagnosed with depression (Table 1).

Table 1. General characteristics of the study participants (N=1371).

| Variable | Category | Participants, n (%) |
| --- | --- | --- |
| Sex | Male | 681 (49.67) |
| | Female | 690 (50.33) |
| Age (years) | Early adulthood (19-39) | 478 (34.87) |
| | Middle adulthood (40-64) | 893 (65.13) |
| Educational level | ≤High school | 807 (58.86) |
| | ≥College | 564 (41.14) |
| Economic activity status | Active | 990 (72.21) |
| | Inactive | 381 (27.79) |
| Income level | Low | 340 (24.8) |
| | Lower intermediate | 384 (28.01) |
| | Upper intermediate | 326 (23.78) |
| | Advanced | 321 (23.41) |
| Marital status | Married | 705 (51.42) |
| | Single | 666 (48.58) |
| Smoking | Smoker | 439 (32.02) |
| | Nonsmoker | 932 (67.98) |
| Frequency of drinking | None | 260 (18.96) |
| | <4 times/mo | 749 (54.63) |
| | >2 times/wk | 362 (26.4) |
| Days of walking | <3 times/wk | 433 (31.58) |
| | >3 times/wk | 938 (68.42) |
| Obesity status | Underweight | 58 (4.23) |
| | Normal | 518 (37.78) |
| | Overweight | 339 (24.73) |
| | Obese | 456 (33.26) |
| Subjective health status | Very healthy | 59 (4.3) |
| | Healthy | 285 (20.79) |
| | Normal | 668 (48.72) |
| | Unhealthy | 286 (20.86) |
| | Very unhealthy | 73 (5.32) |
| Subjective body shape recognition | Very thin | 41 (2.99) |
| | Little thin | 162 (11.82) |
| | Normal | 541 (39.46) |
| | Little obese | 478 (34.87) |
| | Very obese | 149 (10.87) |
| Recognition status of stress | No recognition | 930 (67.83) |
| | Recognition | 441 (32.17) |
| Depression diagnosis by physician | Negative | 1247 (90.96) |
| | Positive | 124 (9.04) |
| Dietary condition | Good | 1267 (92.41) |
| | Poor | 104 (7.59) |
| Frequency of eating out | 2 times/d to >5 times/wk | 731 (53.32) |
| | 1 time/wk to 4 times/wk | 398 (29.03) |
| | <3 times/mo | 242 (17.65) |
| Father’s educational level | Elementary school graduate | 602 (43.91) |
| | Middle school graduate | 204 (14.88) |
| | High school graduate | 335 (24.43) |
| | College graduate | 230 (16.78) |
| Mother’s educational level | Elementary school graduate | 778 (56.75) |
| | Middle school graduate | 197 (14.37) |
| | High school graduate | 290 (21.15) |
| | College graduate | 106 (7.73) |
| Health checkup status | Yes | 877 (63.97) |
| | No | 494 (36.03) |
| Cancer checkup status | Yes | 633 (46.17) |
| | No | 738 (53.83) |
| Oral examination | Yes | 898 (65.5) |
| | No | 473 (34.5) |
| Type of health insurance | Regional health insurance | 491 (35.81) |
| | Company health insurance | 757 (55.22) |
| | Medical care | 123 (8.97) |
| Private health insurance | Joined | 1066 (77.75) |
| | Not joined | 305 (22.25) |

Analysis of the Factors Influencing Metabolic Syndrome Using Machine Learning Techniques

We identified 390 common variables in the 10 years of merged data (2009-2018) from the NHANES. Among them, 154 were excluded because they were not relevant to the study, and the remaining 236 variables were analyzed to assess the factors affecting metabolic syndrome in single-person households.

Overall, 4 algorithms were applied in the analysis: LR, DT, RF, and XGBoost. The importance of the variables age, BMI, and subjective recognition of body type as extracted from LR was 212.56, 173.26, and 138.01, respectively. Furthermore, the importance of the variables age, BMI, and dietary condition as extracted from DT was 35.50, 7.07, and 5.53, respectively. The importance of the variables BMI, obesity, and age as extracted from RF was 7.07, 2.99, and 2.80, respectively. Finally, the importance of the variables status of drinking, weight control method, and driving under the influence during the past year as extracted from XGBoost was 6.34, 5.81, and 3.06, respectively (Table 2). To summarize, we found age, BMI, obesity, and the subjective recognition of body type to be the most important common variables.

Table 2. Analysis of the factors influencing metabolic syndrome using machine learning techniques.

| Algorithm | Variable name | Feature importance |
| --- | --- | --- |
| Logistic regression | Age | 212.56 |
| | BMI | 173.26 |
| | Subjective body shape recognition | 138.01 |
| | Amount of alcohol consumption at a time | 87.98 |
| | Type of longest job | 72.51 |
| | Subjective health status | 71.79 |
| | Diagnosis of osteoarthritis | 68.84 |
| | Diagnosis of arthritis | 62.79 |
| | Daily activities | 60.61 |
| | Status of other cancer treatments | 57.59 |
| Decision tree | Age | 30.50 |
| | BMI | 7.07 |
| | Dietary conditions | 5.53 |
| | Duration of walking | 2.00 |
| | Age at which alcohol consumption began | 1.99 |
| | Modified working hours | 1.37 |
| | Status of smoking | 1.31 |
| | Frequency of binge drinking | 1.15 |
| | Frequency of eating out | 0.89 |
| | Region | 0.74 |
| Random forest | BMI | 7.08 |
| | Obesity status | 2.99 |
| | Age | 2.80 |
| | Subjective body shape recognition | 2.67 |
| | Type of longest job | 1.14 |
| | Region | 1.22 |
| | Age at which alcohol consumption began | 1.11 |
| | Standard occupation classification | 0.81 |
| | Education level | 1.02 |
| | Walking days in a week | 0.78 |
| Extreme gradient boost | Status of drinking | 6.34 |
| | Weight control method | 5.81 |
| | Driving under the influence during 1 y | 3.06 |
| | Smoking cessation plan | 1.91 |
| | Duration of disease state | 1.64 |
| | Self-management | 1.60 |
| | Age | 1.60 |
| | Status of nutrition display impact | 1.56 |
| | Daily activities | 1.44 |
| | Status of pulmonary tuberculosis diagnosis | 1.18 |
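One simple way to look for variables shared across the 4 algorithms is to count how many of the top 10 lists in Table 2 contain each variable name. The sketch below does this with exact string matching; the function name and the threshold of 2 models are assumptions, and the approach does not group related items (for example, BMI and obesity status, or the different drinking variables) the way a subject-matter reading of the table would.

```python
from collections import Counter

# Top 10 variables per algorithm, copied from Table 2.
TOP_VARIABLES = {
    "LR": [
        "Age", "BMI", "Subjective body shape recognition",
        "Amount of alcohol consumption at a time", "Type of longest job",
        "Subjective health status", "Diagnosis of osteoarthritis",
        "Diagnosis of arthritis", "Daily activities",
        "Status of other cancer treatments",
    ],
    "DT": [
        "Age", "BMI", "Dietary conditions", "Duration of walking",
        "Age at which alcohol consumption began", "Modified working hours",
        "Status of smoking", "Frequency of binge drinking",
        "Frequency of eating out", "Region",
    ],
    "RF": [
        "BMI", "Obesity status", "Age", "Subjective body shape recognition",
        "Type of longest job", "Region",
        "Age at which alcohol consumption began",
        "Standard occupation classification", "Education level",
        "Walking days in a week",
    ],
    "XGBoost": [
        "Status of drinking", "Weight control method",
        "Driving under the influence during 1 y", "Smoking cessation plan",
        "Duration of disease state", "Self-management", "Age",
        "Status of nutrition display impact", "Daily activities",
        "Status of pulmonary tuberculosis diagnosis",
    ],
}


def shared_variables(top, min_models=2):
    """Return {variable: number of algorithms listing it} for shared ones."""
    counts = Counter(name for names in top.values() for name in set(names))
    return {name: n for name, n in counts.items() if n >= min_models}


if __name__ == "__main__":
    shared = shared_variables(TOP_VARIABLES)
    for name in sorted(shared, key=shared.get, reverse=True):
        print(shared[name], name)
```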

Determining the Number of Latent Class Layers

Four indices were used to assess the goodness of fit of the LCA models: Bayesian information criteria, sample size–adjusted Bayesian information criteria, Lo-Mendell-Rubin adjusted likelihood ratio test, and bootstrapped likelihood ratio test. We determined the number of latent classes by first checking each fitted model’s goodness-of-fit indices. In particular, we increased the number of classes, as illustrated in Table 3, and used several influencing factors to reveal the presence of metabolic syndrome in single-person households, finally deciding on 4 latent classes.

Table 3. Model fit indices for the latent class analysis model (N=1371).

| Group number | BICᵃ | SSABICᵇ | BLRTᶜ | LMRᵈ | Class 1, n (%) | Class 2, n (%) | Class 3, n (%) | Class 4, n (%) | Class 5, n (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 20,874.18 | 20,782.18 | 0.000 | 0.000 | 508 (37.05) | 863 (62.95) | N/Aᵉ | N/A | N/A |
| 2 | 2050.91 | 20,366.14 | 0.000 | 0.000 | 655 (47.78) | 298 (21.74) | 418 (30.49) | N/A | N/A |
| 3 | 20,276.99 | 20,089.57 | 0.000 | 0.000 | 354 (25.82) | 329 (24) | 320 (23.34) | 368 (26.8) | N/A |
| 4 | 20,266.98 | 20,087.23 | 0.519 | 0.506 | 310 (22.61) | 186 (13.57) | 319 (23.27) | 306 (22.32) | 250 (18.23) |

ᵃBIC: Bayesian information criteria.

ᵇSSABIC: sample size–adjusted Bayesian information criteria.

ᶜBLRT: bootstrapped likelihood ratio test.

ᵈLMR: Lo-Mendell-Rubin adjusted likelihood ratio test.

ᵉN/A: not applicable.
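A common way to read fit indices such as those in Table 3 is stepwise: keep adding classes while the likelihood ratio tests remain significant, and stop at the first model whose test is not. The sketch below applies that rule to the P values and SSABIC values reported in the table. It is an illustration under stated assumptions, not the authors' procedure: the number of classes per row is inferred from how many class-size columns are filled in, the 0.05 threshold and the requirement that both tests be significant are assumed, and the function name is hypothetical.

```python
# Rows from Table 3; "classes" is inferred from the populated class columns.
MODELS = [
    {"classes": 2, "ssabic": 20782.18, "blrt_p": 0.000, "lmr_p": 0.000},
    {"classes": 3, "ssabic": 20366.14, "blrt_p": 0.000, "lmr_p": 0.000},
    {"classes": 4, "ssabic": 20089.57, "blrt_p": 0.000, "lmr_p": 0.000},
    {"classes": 5, "ssabic": 20087.23, "blrt_p": 0.519, "lmr_p": 0.506},
]


def choose_n_classes(models, alpha=0.05):
    """Return the largest class count whose LRTs are all significant.

    Each likelihood ratio test compares a k-class model with the
    (k-1)-class model; a nonsignificant result means the extra class
    does not improve fit, so the search stops there.
    """
    chosen = None
    for m in sorted(models, key=lambda m: m["classes"]):
        if m["blrt_p"] >= alpha or m["lmr_p"] >= alpha:
            break
        chosen = m["classes"]
    return chosen


if __name__ == "__main__":
    print(choose_n_classes(MODELS))
```

Under these assumptions the rule stops before the 5-class model, whose SSABIC improves only marginally over the 4-class model.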

Names and Characteristics of the Latent Classes

It is important to select latent class classification variables to identify the factors affecting metabolic syndrome in single-person households through an in-depth consideration of prior research results [22,23].

Therefore, as classification variables related to metabolic syndrome, we selected sex, age, smoking, alcohol consumption, walking, obesity, hypertriglyceridemia, high blood pressure, high blood glucose, abdominal obesity, and hypo–high-density lipoproteinemia. On the basis of the characteristics and response patterns of the classes identified through the LCA, we named the classes as follows: group 1: intense physical activity in early adulthood, group 2: hypertension among middle-aged female respondents, group 3: smoking and drinking among middle-aged male respondents, and group 4: obesity and abdominal obesity among middle-aged respondents. Tables 4 and 5 present the characteristics and names of each latent class.

Of the 1371 participants, groups 1, 2, 3, and 4 comprised 320 (23.34%), 368 (26.84%), 329 (24%), and 354 (25.82%) participants, respectively. In group 1, age was the most distinguishing characteristic compared with the other 3 groups: 300 (93.8%) of the 320 participants were in early adulthood. Moreover, 289 (90.3%) respondents walked >3 times a week, which was substantially higher than in the other groups. All 5 diagnostic criteria for metabolic syndrome exhibited low rates, and the prevalence of metabolic syndrome was 0%. In group 2, out of 368 respondents, 337 (91.6%) were female, and all participants in this group were in their middle adulthood. All the diagnostic criteria for metabolic syndrome exhibited low rates, whereas 47 (12.8%) participants had metabolic syndrome. In group 3, out of 329 respondents, 318 (96.7%) were male, which is more than the number of male respondents in other groups, and 250 (76%) respondents in this group were in their middle adulthood. The rate of smoking was high (n=249, 75.7%), and 181 (55%) participants reported a high frequency of alcohol consumption (>2 times/wk). In addition, 72 (21.9%) respondents had metabolic syndrome. In group 4, out of 354 participants, 255 (72%) participants were in middle adulthood. In terms of the diagnostic criteria for metabolic syndrome, 265 (74.9%) had hypertension, 306 (86.4%) were obese, 354 (100%) had abdominal obesity, and 232 (65.5%) had metabolic syndrome.

Table 4. Baseline characteristics according to latent class analysis–derived classes (N=1371).

| Variable | Category | Group 1 (n=320), n (%) | Group 2 (n=368), n (%) | Group 3 (n=329), n (%) | Group 4 (n=354), n (%) |
| --- | --- | --- | --- | --- | --- |
| Sex | Male | 143 (44.7) | 31 (8.4) | 318 (96.7) | 189 (53.4) |
| | Female | 177 (55.3) | 337 (91.6) | 11 (3.3) | 165 (46.6) |
| Age (years) | Early adulthood (19-39) | 300 (93.8) | 0 (0) | 79 (24) | 99 (28) |
| | Middle adulthood (40-64) | 20 (6.2) | 368 (100) | 250 (76) | 255 (72) |
| Current smoking status | Smoking | 70 (21.9) | 12 (3.3) | 249 (75.7) | 128 (36.2) |
| | Not smoking | 250 (78.1) | 356 (96.7) | 80 (24.3) | 226 (63.8) |
| Frequency of drinking | No drinking | 18 (5.6) | 132 (35.9) | 34 (10.3) | 76 (21.5) |
| | <4 times/mo | 240 (75) | 212 (57.6) | 114 (34.7) | 183 (51.7) |
| | >2 times/wk | 62 (19.4) | 24 (6.5) | 181 (55) | 95 (26.8) |
| Frequency of walking days | <3 times/wk | 31 (9.7) | 122 (33.2) | 107 (32.5) | 133 (37.6) |
| | >3 times/wk | 289 (90.3) | 246 (66.8) | 222 (67.5) | 221 (62.4) |
| Obesity | Low weight | 32 (10) | 15 (4.1) | 11 (3.3) | 0 (0) |
| | Normal | 197 (61.6) | 197 (53.5) | 124 (37.7) | 0 (0) |
| | Overweight | 59 (18.4) | 112 (30.4) | 170 (51.7) | 48 (13.6) |
| | Obese | 32 (10) | 44 (12) | 24 (7.3) | 306 (86.4) |
| Dyslipidemia | Yes | 22 (6.9) | 73 (19.8) | 171 (52) | 162 (45.8) |
| | No | 298 (93.1) | 295 (80.2) | 158 (48) | 192 (54.2) |
| Hypertension | Yes | 17 (5.3) | 238 (64.7) | 164 (49.8) | 205 (57.9) |
| | No | 303 (94.7) | 130 (35.3) | 165 (50.2) | 149 (42.1) |
| Hyperlipidemia | Yes | 8 (2.5) | 90 (24.5) | 166 (50.5) | 175 (49.4) |
| | No | 312 (97.5) | 278 (75.5) | 163 (49.5) | 179 (50.6) |
| Abdominal obesity | Yes | 0 (0) | 9 (2.4) | 0 (0) | 354 (100) |
| | No | 320 (100) | 359 (97.6) | 329 (100) | 0 (0) |
| Hypo–high-density lipoproteinemia | Yes | 53 (16.6) | 135 (36.7) | 73 (22.2) | 151 (42.7) |
| | No | 267 (83.4) | 233 (63.3) | 256 (77.8) | 203 (57.3) |

Table 5. Latent classes of metabolic syndrome in South Korean single-person households.

| Division | Group name | Participants (N=1371), n (%) | Diagnosis of metabolic syndrome, n (%) |
| --- | --- | --- | --- |
| Group 1 | Intense physical activity in early adulthood | 320 (23.3) | 0 (0)ᵃ |
| Group 2 | Hypertension among middle-aged female respondents | 368 (26.8) | 47 (12.8)ᵇ |
| Group 3 | Smoking and drinking among middle-aged male respondents | 329 (24) | 72 (21.9)ᶜ |
| Group 4 | Obesity and abdominal obesity among middle-aged respondents | 354 (25.8) | 232 (65.5)ᵈ |

ᵃn=320.

ᵇn=368.

ᶜn=329.

ᵈn=354.
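The within-group percentages in Table 5 use each group's own size as the denominator, as the footnotes indicate. The short sketch below reproduces that arithmetic from the counts in the table; the dictionary and function names are hypothetical.

```python
# (group size, participants with metabolic syndrome), copied from Table 5.
CLASSES = {
    "Group 1": (320, 0),
    "Group 2": (368, 47),
    "Group 3": (329, 72),
    "Group 4": (354, 232),
}


def prevalence_pct(n, cases):
    """Within-group prevalence as a percentage, rounded to 1 decimal."""
    return round(100 * cases / n, 1)


if __name__ == "__main__":
    for name, (n, cases) in CLASSES.items():
        print(name, prevalence_pct(n, cases))
    # The group sizes should add up to the full sample.
    print(sum(n for n, _ in CLASSES.values()))
```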

Relationships Between Latent Class Groups and Metabolic Syndrome

We performed a binary LR to predict the occurrence of metabolic syndrome in the categorized latent class groups (Table 6).

The regression analysis, with the groups classified by the LCA as the independent variable and the occurrence of metabolic syndrome as the dependent variable, was significant (χ²3=521.7, P<.001). Further, the Cox and Snell coefficient of determination (R²=0.57) indicated that the model’s explanatory power was 57%. The Hosmer-Lemeshow test results for the prediction model (χ²3=12.7, P=.49) demonstrated that no significant differences existed between the observed and predicted values.

The group with intense physical activity in early adulthood (group 1) was established as the reference category. In comparison, groups 2, 3, and 4 were 5.09 times (95% CI 3.15-14.91; P<.001), 8.99 times (95% CI 5.74-21.72; P<.001), and 17.67 times (95% CI 14.45-25.33; P<.001) more likely to experience metabolic syndrome, respectively.

Table 6. Relationship between latent groups and metabolic syndrome (N=1371)ᵃ.

| Group | B (SE) | Odds ratio (95% CI) | P value |
| --- | --- | --- | --- |
| Intense physical activity in early adulthood | Referenceᵇ | Reference | Reference |
| Hypertension among middle-aged female respondents | 2.08 (0.22) | 5.09 (3.15-14.91) | .001 |
| Smoking and drinking among middle-aged male respondents | 2.94 (0.29) | 8.99 (5.74-21.72) | .001 |
| Obesity and abdominal obesity among middle-aged respondents | 3.75 (0.30) | 17.67 (14.45-25.33) | .001 |

ᵃR²=0.57, χ²3=12.7 in Hosmer-Lemeshow test; P=.49.

ᵇSet as reference category in latent class analysis.


Principal Findings

This study is the first to identify risk factors for metabolic syndrome in South Korean single-person households from multiple angles using LCA and machine learning techniques. The purpose of this study was to classify the risk factors for metabolic syndrome in single-person households using LCA and to identify the types and characteristics of the resulting latent classes. This paper describes metabolic health (BMI, body weight, body fat percentage, blood pressure, and blood sugar) among the physical and social characteristics of single-person households. There were more single-person households in middle adulthood (40-64 y) than in early adulthood (19-39 y). In this study, age, BMI, obesity, drinking, and body shape were identified as potential risk factors for metabolic syndrome in single-person households. A cross-sectional study such as this is necessary because it can identify the factors that affect metabolic syndrome in single-person households in South Korea and determine which factors should be targeted through appropriate intervention [33].

Existing studies on metabolic syndrome were conducted mainly among older and middle-aged adults [34-37]. Several recent studies have confirmed the presence of metabolic syndrome in the younger generation, suggesting that the metabolic syndrome morbidity rate among generations with various characteristics has increased [37,38]. This indicates that the diversity of single-person households cannot be overlooked. Importantly, it has been reported that health habits have a substantial influence on metabolic syndrome [39]. As health habits are already fixed in middle to late adulthood, it is difficult to expect changes in health behavior later; therefore, the prevention and management of metabolic syndrome in early adulthood should be considered [38-40]. Thus, modifying health habits is an important step in treating or preventing metabolic syndrome.

In this study, the LR, DT, RF, and XGBoost machine learning algorithms were applied to identify factors that affect the occurrence of metabolic syndrome in adult single-person households. In this analysis, variables such as age, BMI, obesity, alcohol consumption, and subjective body shape recognition were commonly derived. This suggests that the factors identified in previous studies as affecting metabolic syndrome in adult single-person households and the factors identified by applying machine learning techniques in this study are consistent with each other [30]. It is important to actively encourage physical activity to prevent metabolic syndrome [39]. In addition, it is necessary to develop a differentiated health management strategy using mobile health programs for single-person households in early adulthood with sustainable and compelling content relevant to their daily lives.

Unlike group 1, group 2 comprised mostly female respondents, primarily at the midpoint of middle adulthood or older. In addition, this group had low rates of smoking and obesity and a high rate of hypertension. These results were consistent with those of previous studies, which indicated that high blood pressure in middle adulthood causes metabolic syndrome [40]. The rates of normal weight and overweight were the highest and second highest, respectively, in this group, which is consistent with the study by Kang et al [41], who reported that physical activity reduces hypertension and prevents metabolic syndrome among female individuals. This finding suggests that high blood pressure is an important risk factor for developing metabolic syndrome in single-person households [42].

Hypertension was an important risk factor, as seen in group 2. Thus, to prevent metabolic syndrome in group 2, it is important to develop and implement intervention programs for reducing blood pressure through diet and exercise therapy programs, encourage physical activity, and reduce obesity [43,44].

In group 3, the proportion of male respondents was significantly higher. In addition, the rate of smoking, frequency of alcohol consumption, and the rate of obesity were the highest in this group compared with the other groups. Moreover, sex and age were important risk factors for metabolic syndrome, which is consistent with the large proportion of middle-aged respondents in group 3. This group also exhibited characteristics of typical middle-aged workers, indicating the need to observe and manage smoking and alcohol consumption, especially among office workers [45]. These findings coincide with the finding of the study by Oh [46] that smoking facilitates metabolic syndrome, whereas its cessation prevents it among middle-aged male individuals. Thus, alcohol consumption and smoking were important risk factors for metabolic syndrome in group 3.

In this group, 21.9% (72/329) of the participants developed metabolic syndrome, and this group was 8.99 times more likely to develop metabolic syndrome than group 1. This corroborates the findings of Oh [46], as those in middle adulthood are more likely to be exposed to hypertension, hyperlipidemia, smoking, and alcohol consumption; hence, this group requires close monitoring and preventive nursing interventions. Moreover, although stress often leads to a desire to smoke and compels ex-smokers to begin smoking again, it is not fully clear as to why it is difficult to cease smoking [45,46]. Therefore, nursing interventions are needed to increase the motivation to quit smoking.
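The group 3 figures above can be checked with simple arithmetic. The following is a minimal Python sketch, not the authors' analysis code: it recomputes the crude prevalence and crude odds from the reported counts (72 of 329). The function names are illustrative assumptions. The 8.99-fold figure is a model-based comparison with group 1, whose counts are not restated in this section, so it is not reproduced here.

```python
def prevalence(cases: int, total: int) -> float:
    """Proportion of respondents with the outcome."""
    if total <= 0 or not 0 <= cases <= total:
        raise ValueError("need total > 0 and 0 <= cases <= total")
    return cases / total


def odds(cases: int, total: int) -> float:
    """Crude odds: cases divided by non-cases."""
    non_cases = total - cases
    if cases < 0 or non_cases <= 0:
        raise ValueError("need at least one non-case and cases >= 0")
    return cases / non_cases


def crude_odds_ratio(cases_a: int, total_a: int,
                     cases_b: int, total_b: int) -> float:
    """Unadjusted odds ratio of group A relative to group B.

    This is a crude ratio only; a regression-based estimate such as
    the one reported in the text can differ from it.
    """
    return odds(cases_a, total_a) / odds(cases_b, total_b)


# Counts for group 3 as reported in the text.
group3_cases, group3_total = 72, 329
print(f"prevalence: {prevalence(group3_cases, group3_total):.1%}")  # 21.9%
print(f"crude odds: {odds(group3_cases, group3_total):.2f}")        # 0.28
```

Given the counts for a reference group, `crude_odds_ratio` would return an unadjusted ratio; it should not be expected to match an estimate from a fitted model.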

Further, another study found that the greater the stress, the higher the risk of health problems, such as smoking and depression [6,25]. Higher nicotine dependence suggests that smoking may serve as a maladaptive response when psychological problems such as stress and depression are not properly managed [12,44,45]. In addition, as Korean populations are often exposed to smoking when dining together and drinking socially, it is necessary to establish a culture of smoking cessation and to change dining customs.

In group 4, the proportions of male respondents and female respondents were similar, with a high proportion of respondents in middle adulthood. Further, all respondents in the group exhibited obesity (based on the respondents’ BMI) or abdominal obesity (based on the respondents’ waist circumference). Obesity is also associated with the development of insulin resistance and beta-cell dysfunction, regardless of whether it is accompanied by abdominal obesity, which is consistent with prior literature [37,42]. Our results are also consistent with a report by Detournay et al [14], which revealed that obesity and abdominal obesity during female menopause may cause metabolic syndrome.

In group 4, metabolic syndrome was prevalent among 65.5% (232/354) of the respondents, and this group was 17.67 times more likely to develop metabolic syndrome than group 1. Moreover, the rates of hypertension, hyperglycemia, abdominal obesity, and hypo–high-density lipoproteinemia were higher than those in the other groups. As having at least 3 of the 5 criteria is an important basis for diagnosing metabolic syndrome, this is a critical factor [47].
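The "at least 3 of the 5 criteria" rule can be written as a short check. This is a minimal sketch under stated assumptions: each flag is assumed to have already been derived from the study's own cutoffs, which are not restated in this section, so no numeric thresholds appear here. The fifth component, elevated triglycerides, is not named in the paragraph above but is part of the standard 5-component definition. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Components:
    """One respondent's metabolic syndrome components (already thresholded)."""
    abdominal_obesity: bool
    high_blood_pressure: bool
    high_fasting_glucose: bool
    high_triglycerides: bool
    low_hdl_cholesterol: bool


def count_components(c: Components) -> int:
    """Number of components that are present."""
    return sum((
        c.abdominal_obesity,
        c.high_blood_pressure,
        c.high_fasting_glucose,
        c.high_triglycerides,
        c.low_hdl_cholesterol,
    ))


def has_metabolic_syndrome(c: Components, min_components: int = 3) -> bool:
    """Apply the 'at least 3 of 5' rule."""
    return count_components(c) >= min_components


# Hypothetical respondent with hypertension, hyperglycemia, and abdominal obesity.
respondent = Components(
    abdominal_obesity=True,
    high_blood_pressure=True,
    high_fasting_glucose=True,
    high_triglycerides=False,
    low_hdl_cholesterol=False,
)
print(count_components(respondent), has_metabolic_syndrome(respondent))  # 3 True
```

Keeping the thresholding separate from the counting rule means the same check applies whichever waist circumference or glucose cutoffs a given survey adopts.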

This study’s LCA demonstrated that heterogeneous subgroups exist depending on metabolic syndrome risk factors, which is different from the results of most previous studies that focused on specific metabolic syndrome risk factors. Our results suggest that certain risk factors may have more prominent effects and affect certain age groups more strongly. Moreover, obesity and abdominal obesity were the most influential risk factors for metabolic syndrome in single-person households.

A national policy to promote physical activity is needed to prevent and manage metabolic syndrome in single-person households. In addition, strategies are needed to develop intervention programs that enable physical activity at any time and anywhere through mobile health and wearable devices; such programs would naturally integrate physical activity into daily life. It would also be more effective to develop and implement different risk-based intervention strategies for different individuals. Rather than treating individuals with metabolic syndrome risk factors as a homogeneous group and applying interventions collectively, customized interventions should be developed around the characteristics and needs of each subgroup, and groups that share the same characteristics should be efficiently classified. Such interventions can be made more effective if they incorporate strategies targeting each of the various risk factors for metabolic syndrome.

Limitations

This study has several limitations. First, the NHANES questionnaire we used could not incorporate a wide range of variables: because the survey questions changed from year to year, only data for questions that matched across all 10 survey years were extracted. Second, as the survey respondents differed every year, tracking longitudinal changes and the progression of metabolic syndrome was a challenge. Third, although various machine learning techniques were used in this study, the widely used artificial neural network technique was not. Future research should apply deep learning methods such as artificial neural networks.

Conclusions

This study is significant in that it is the first to use LCA and machine learning techniques to identify the types and characteristics of latent subgroups classified based on metabolic syndrome risk factor indicators in adult single-person households. This study conducted a secondary analysis of data (2009-2018) from the NHANES hosted by the Korea Centers for Disease Control and Prevention, through which it classified and characterized risk factors for metabolic syndrome in adult single-person households.

In this study, machine learning techniques were applied to identify the factors affecting metabolic syndrome in adult single-person households, namely the variables that the models ranked as highly important. The groups classified using LCA based on risk factors for metabolic syndrome in adult single-person households were intense physical activity in early adulthood, hypertension in middle-aged female respondents, smoking and drinking in middle-aged male respondents, and obesity and abdominal obesity in middle-aged male respondents. When the latent classes were compared according to the factors influencing metabolic syndrome, the 4 latent classes showed substantial differences in general characteristics such as education level, income level, frequency of dining out, dietary life, subjective health status, and subjective body shape recognition. Finally, when the occurrence of metabolic syndrome was predicted for each group, the obesity and abdominal obesity in middle-aged male respondents group had the highest probability, indicating that it was the most susceptible high-risk group in terms of the occurrence of metabolic syndrome.

This study is meaningful as a new attempt to identify the factors influencing metabolic syndrome in adult single-person households by applying machine learning techniques, categorize risk factors for metabolic syndrome using LCA, and identify the characteristics of each latent class. Therefore, this study provides new knowledge and contributes to the prevention of metabolic syndrome in adult single-person households by identifying 4 latent classes through LCA and thus facilitating the development of customized interventions.

Acknowledgments

This study was supported by a National Research Foundation of Korea grant funded by the Korean government (Ministry of Science and ICT; 2021R1A2C2095271).

Data Availability

The data sets generated and analyzed during this study are available from the corresponding author upon reasonable request.

Authors' Contributions

All authors had full access to all the data and take responsibility for the integrity of the data and accuracy of the data analysis. JSL and SKL contributed to the conceptualization and design of the study. JSL and SKL contributed to the acquisition and statistical analysis of the data. JSL and SKL contributed to the interpretation of the data and drafting of the manuscript. SKL critically revised the manuscript for important intellectual content, obtained funding, and supervised the study. SKL provided administrative, technical, and material support.

Conflicts of Interest

None declared.

References

  1. Zheng X, Yu H, Qiu X, Chair SY, Wong EM, Wang Q. The effects of a nurse-led lifestyle intervention program on cardiovascular risk, self-efficacy and health promoting behaviours among patients with metabolic syndrome: randomized controlled trial. Int J Nurs Stud. Sep 2020;109:103638.
  2. Fujiki H. The use of noncash payment methods for regular payments and the household demand for cash: evidence from Japan. Jpn Econ Rev. Jul 31, 2020;71(4):719-765.
  3. Lee HN. The social service needs of single-person households and their policy implications. Health Welfare Pol Forum. Oct 01, 2020;288:21-35.
  4. Kawano T, Moriki G, Bono S, Kaji N, Jung H. Effects of household composition on health-related quality of life among the japanese middle-aged and elderly: analysis from a gender perspective. Jpn J Soc Welfare. 2020;60(5):1-12.
  5. Kang SH, Park JY. Factors affecting the life satisfaction of unmarried one-person households according to marital experience. J Fam Res Manag Policy Rev. 2020;24:21-39.
  6. Rantanen AT, Korkeila JJ, Kautiainen H, Korhonen PE. Non-melancholic depressive symptoms increase risk for incident cardiovascular disease: a prospective study in a primary care population at risk for cardiovascular disease and type 2 diabetes. J Psychosom Res. Feb 2020;129:109887.
  7. Ahn JH, Park YK. Frequency of eating alone and health related outcomes in Korean adults: based on the 2016 Korea National Health and Nutrition Examination Survey. J Korean Diet Assoc. 2020;26(2):85-100.
  8. Mi-ree B, Min B, Minjin P. An empirical analysis of the spatial distribution and flow patterns of Seoul’s single-person households. J Popul Stud. 2019;42:91-119.
  9. Lee YB. One-person households and their policy implications. Health Welf Policy Forum. 2017;252(1):64-77.
  10. Smiley A, King D, Bidulescu A. The association between sleep duration and metabolic syndrome: the NHANES 2013/2014. Nutrients. Oct 26, 2019;11(11):2582.
  11. Rogers JM. Smoking and pregnancy: epigenetics and developmental origins of the metabolic syndrome. Birth Defects Res. Oct 15, 2019;111(17):1259-1269.
  12. Gossett LK, Johnson HM, Piper ME, Fiore MC, Baker TB, Stein JH. Smoking intensity and lipoprotein abnormalities in active smokers. J Clin Lipidol. Dec 2009;3(6):372-378.
  13. Shin M. Comparative study on health behavior and mental health between one person and multi-person households: analysis of data from the national health and nutrition examination surveys (2013, 2015, 2017). J Korean Soc Wellness. Nov 30, 2019;14:11-23.
  14. Detournay B, Fagnani F, Phillippo M, Pribil C, Charles MA, Sermet C, et al. Obesity morbidity and health care costs in France: an analysis of the 1991-1992 Medical Care Household Survey. Int J Obes Relat Metab Disord. Feb 3, 2000;24(2):151-155.
  15. Lin CH, Chiang SL, Heitkemper MM, Hung YJ, Lee MS, Tzeng WC, et al. Effects of telephone-based motivational interviewing in lifestyle modification program on reducing metabolic risks in middle-aged and older women with metabolic syndrome: a randomized controlled trial. Int J Nurs Stud. Aug 2016;60:12-23.
  16. Kurl S, Laukkanen JA, Niskanen L, Laaksonen D, Sivenius J, Nyyssönen K, et al. Metabolic syndrome and the risk of stroke in middle-aged men. Stroke. Mar 2006;37(3):806-811.
  17. Kim MH, Lee SH, Shin KS, Son DY, Kim SH, Joe H, et al. The change of metabolic syndrome prevalence and its risk factors in Korean adults for decade: Korea national health and nutrition examination survey for 2008–2017. Korean J Fam Pract. Feb 20, 2020;10(1):44-52.
  18. Harrison S, Couture P, Lamarche B. Diet quality, saturated fat and metabolic syndrome. Nutrients. Oct 22, 2020;12(11):3232.
  19. Akhlaghi M. Dietary Approaches to Stop Hypertension (DASH): potential mechanisms of action against risk factors of the metabolic syndrome. Nutr Res Rev. Jul 30, 2019;33(1):1-18.
  20. Bansal R, Gubbi S, Muniyappa R. Metabolic syndrome and COVID 19: endocrine-immune-vascular interactions shapes clinical course. Endocrinology. Oct 01, 2020;161(10):bqaa112.
  21. Ezzati M, Hoorn SV, Rodgers A, Lopez AD, Mathers CD, Murray CJ, et al. Comparative Risk Assessment Collaborating Group. Estimates of global and regional potential health gains from reducing multiple major risk factors. Lancet. Jul 26, 2003;362(9380):271-280.
  22. Muthen B, Muthen LK. Integrating person-centered and variable-centered analyses: growth mixture modeling with latent trajectory classes. Alcohol Clin Exp Res. Jun 2000;24(6):882-891.
  23. Collins LM, Wugalter SE. Latent class models for stage-sequential dynamic latent variables. Multivar Behav Res. Jan 1992;27(1):131-157.
  24. Nylund K, Bellmore A, Nishina A, Graham S. Subtypes, severity, and structural stability of peer victimization: what does latent class analysis say? Child Dev. Nov 2007;78(6):1706-1722.
  25. Kaptein KI, de Jonge P, van den Brink RH, Korf J. Course of depressive symptoms after myocardial infarction and cardiac prognosis: a latent class analysis. Psychosom Med. 2006;68(5):662-668.
  26. Locklear LR, Taylor SG, Ambrose ML. How a gratitude intervention influences workplace mistreatment: a multiple mediation model. J Appl Psychol. Sep 17, 2020;106(9):1314-1331.
  27. Alyass A, Turcotte M, Meyre D. From big data analysis to personalized medicine for all: challenges and opportunities. BMC Med Genomics. Jun 27, 2015;8(1):33.
  28. Wang J, Chen H, Wang H, Liu W, Peng D, Zhao Q, et al. A risk prediction model for physical restraints among older Chinese adults in long-term care facilities: machine learning study. J Med Internet Res. Apr 06, 2023;25:e43815.
  29. Dong Y, Yeo MC, Tham XC, Danuaji R, Nguyen TH, Sharma AK, et al. Investigating psychological differences between nurses and other health care workers from the Asia-Pacific region during the early phase of COVID-19: machine learning approach. JMIR Nurs. Jun 01, 2022;5(1):e32647.
  30. Xie W, Ji M, Liu Y, Hao T, Chow CY. Predicting writing styles of web-based materials for children's health education using the selection of semantic features: machine learning approach. JMIR Med Inform. Jul 22, 2021;9(7):e30115.
  31. Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Executive summary of the third report of the National Cholesterol Education Program (NCEP) expert panel on detection, evaluation, and treatment of high blood cholesterol in adults (Adult Treatment Panel III). JAMA. May 16, 2001;285(19):2486-2497.
  32. Kim WJ, Lee SY. A latent class analysis and predictors of chronic diseases -based on 2014 Korea National Health and Nutrition Examination Survey. J Korea Acad Ind Coop Soc. Jun 30, 2018;19(6):324-333.
  33. West SL, Bates H, Watson J, Brenner IK. Discriminating metabolic health status in a cohort of nursing students: protocol for a cross-sectional study. JMIR Res Protoc. Aug 28, 2020;9(8):e21342.
  34. Tylutka A, Morawin B, Walas Ł, Michałek M, Gwara A, Zembron-Lacny A. Assessment of metabolic syndrome predictors in relation to inflammation and visceral fat tissue in older adults. Sci Rep. Jan 03, 2023;13(1):89.
  35. Andica C, Kamagata K, Takabayashi K, Kikuta J, Kaga H, Someya Y, et al. Neuroimaging findings related to glymphatic system alterations in older adults with metabolic syndrome. Neurobiol Dis. Feb 2023;177:105990.
  36. Ntougou Assoumou HG, Pichot V, Barthelemy JC, Celle S, Garcin A, Thomas T, et al. Obesity related to metabolic syndrome: comparison of obesity indicators in an older french population. Diabetol Metab Syndr. May 11, 2023;15(1):98.
  37. Lakka HM, Laaksonen DE, Lakka TA, Niskanen LK, Kumpusalo E, Tuomilehto J, et al. The metabolic syndrome and total and cardiovascular disease mortality in middle-aged men. JAMA. Dec 04, 2002;288(21):2709-2716.
  38. Amadou C, Heude B, de Lauzon-Guillain B, Lioret S, Descarpentrie A, Ribet C, et al. Early origins of metabolic and overall health in young adults: an outcome-wide analysis in a general cohort population. Diabetes Metab. Mar 2023;49(2):101414.
  39. Lee JS, Kang MA, Lee SK. Effects of the e-Motivate4Change program on metabolic syndrome in young adults using health apps and wearable devices: quasi-experimental study. J Med Internet Res. Jul 30, 2020;22(7):e17031.
  40. Lee SK, Moon M. Factors influencing metabolic syndrome perception and exercising behaviors in Korean adults: data mining approach. J Korea Acad Ind Coop Soc. Dec 31, 2017;18(12):581-588.
  41. Kang JS, Kang HS, Jeong Y. A web-based health promotion program for patients with metabolic syndrome. Asian Nurs Res (Korean Soc Nurs Sci). Mar 2014;8(1):82-89.
  42. Petrella RJ, Stuckey MI, Shapiro S, Gill DP. Mobile health, exercise and metabolic risk: a randomized controlled trial. BMC Public Health. Oct 18, 2014;14:1082.
  43. Chen SM, Creedy D, Lin HS, Wollin J. Effects of motivational interviewing intervention on self-management, psychological and glycemic outcomes in type 2 diabetes: a randomized controlled trial. Int J Nurs Stud. Jun 2012;49(6):637-644.
  44. Wang HJ, Shi LZ, Liu CF, Liu SM, Shi ST. Association between uric acid and metabolic syndrome in elderly women. Open Med (Wars). 2018;13:172-177.
  45. Jung CH, Park JS, Lee WY, Kim SW. Effects of smoking, alcohol, exercise, level of education, and family history on the metabolic syndrome in Korean adults. Korean J Med. Dec 2002;63(6):649-659.
  46. Oh JE. Association between smoking status and metabolic syndrome in men. Korean J Obes. 2014;23(2):99-105.
  47. Zhang B, Pan B, Zhao X, Fu Y, Li X, Yang A, et al. The interaction effects of smoking and polycyclic aromatic hydrocarbons exposure on the prevalence of metabolic syndrome in coke oven workers. Chemosphere. May 2020;247:125880.


Abbreviations

DT: decision tree
LCA: latent class analysis
LR: logistic regression
NHANES: National Health and Nutrition Examination Survey
RF: random forest
XGBoost: extreme gradient boosting


Edited by A Mavragani; submitted 16.09.22; peer-reviewed by S Sharma, K Gupta, F Gomez; comments to author 05.04.23; revised version received 24.04.23; accepted 17.08.23; published 12.09.23.

Copyright

©Ji-Soo Lee, Soo-Kyoung Lee. Originally published in JMIR Formative Research (https://formative.jmir.org), 12.09.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.