The Relationship Between a History of High-risk and Destructive Behaviors and COVID-19 Infection: Preliminary Study

Background: The COVID-19 pandemic has heightened mental health concerns, but the temporal relationship between mental health conditions and SARS-CoV-2 infection has not yet been investigated. Specifically, psychological issues, violent behaviors, and substance use were reported more during the COVID-19 pandemic than before the pandemic. However, it is unknown whether a prepandemic history of these conditions increases an individual’s susceptibility to SARS-CoV-2.
Objective: This study aimed to better understand the psychological risks underlying COVID-19, as it is important to investigate how destructive and risky behaviors may increase a person’s susceptibility to COVID-19.
Methods: In this study, we analyzed data from a survey of 366 adults across the United States (aged 18 to 70 years); this survey was administered between February and March of 2021. The participants were asked to complete the Global Appraisal of Individual Needs–Short Screener (GAIN-SS) questionnaire, which indicates an individual’s history of high-risk and destructive behaviors and likelihood of meeting diagnostic criteria. The GAIN-SS includes 7 questions related to externalizing behaviors, 8 related to substance use, and 5 related to crime and violence; responses were given on a temporal scale. The participants were also asked whether they ever tested positive for COVID-19 and whether they ever received a clinical diagnosis of COVID-19. GAIN-SS responses were compared between those who reported and those who did not report COVID-19 to determine if those who reported COVID-19 also reported GAIN-SS behaviors (Wilcoxon rank sum test, α=.05). In total, 3 hypotheses surrounding the temporal relationships between the recency of GAIN-SS behaviors and COVID-19 infection were tested using proportion tests (α=.05). GAIN-SS behaviors that significantly differed (proportion tests, α=.05) between COVID-19 responses were included as independent variables in multivariable logistic regression models with iterative downsampling. This was performed to assess how well a history of GAIN-SS behaviors statistically discriminated between those who reported and those who did not report COVID-19.
Results: Those who reported COVID-19 more frequently indicated past GAIN-SS behaviors (Q<0.05). Furthermore, the proportion of those who reported COVID-19 was higher (Q<0.05) among those who reported a history of GAIN-SS behaviors; specifically, gambling and selling drugs were common across the 3 proportion tests. Multivariable logistic regression revealed that GAIN-SS behaviors, particularly gambling, selling drugs, and attention problems, accurately modeled self-reported COVID-19, with model accuracies ranging from 77.42% to 99.55%. That is, those who exhibited destructive and high-risk behaviors before and during the pandemic could be discriminated from those who did not exhibit these behaviors when modeling self-reported COVID-19.
Conclusions: This preliminary study provides insights into how a history of destructive and risky behaviors influences infection susceptibility, offering possible explanations for why some persons may be more susceptible to COVID-19, potentially in relation to reduced adherence to prevention guidelines or not seeking vaccination.


Introduction

Background
The SARS-CoV-2 pandemic has led to a concern about behavioral alterations in both those with COVID-19 and those dealing with pandemic-related stresses [1]. SARS-CoV-2 infection has been shown to cause COVID-19 morbidity and mortality with symptoms ranging from severe respiratory distress to prolonged cognitive dysfunction (eg, brain fog) and mental health (MH) problems [2][3][4]. In the United States, rates of MH conditions, drug overdoses, and violence-related emergency department visits were higher during the pandemic than during the previous year (2019) [5]. Reports also suggest an increase in high-risk behaviors, such as problematic web-based gaming [6], crime and violence [7], and worsening of externalizing MH symptoms, such as reduced concentration [8]. Despite pandemic-related increases in MH disorders and risk-taking behaviors, the temporal relationship between them and SARS-CoV-2 infection remains unclear. That is, did these behaviors influence a person's infection susceptibility or were these behaviors more common in those infected?
A study by Wang et al [9] found that those with recent substance use disorder (SUD) diagnoses were at a higher risk for COVID-19, especially those abusing opioids. However, the relationships between COVID-19 infection and previous SUD-related behavioral problems, among other destructive behaviors, remain largely unknown. Previous research has linked deviant [10] and antisocial behaviors [11][12][13], aggression [10], isolation [14], and alcohol and substance use [12,13,15,16] to greater infection susceptibility across an array of infectious diseases, including swine flu, HIV, and other sexually transmitted diseases. Antisocial behaviors have also been linked to less social distancing and more social outings during the COVID-19 pandemic [17]. Because these destructive-type behaviors have been previously linked to greater infection susceptibility, we sought to study whether they would also be linked to COVID-19 infection.
The Global Appraisal of Individual Needs-Short Screener (GAIN-SS) questionnaire (Chestnut Health Systems, Bloomington, IL) has been validated to pinpoint diagnostic criteria for (1) externalizing and internalizing MH disorders, (2) substance abuse (including alcohol abuse), and (3) crime and violence in both adolescents and adults [18][19][20][21][22][23]. Questions ask for the recency of specific behaviors such as lying and gambling (externalizing behaviors), using alcohol or drugs (substance use), and selling drugs or destroying property (crime and violence). Clinically, the self-administrable GAIN-SS survey is used to screen for behavioral health disorders that would warrant more in-depth assessment or intervention. The efficacy of the GAIN-SS survey in identifying populations at risk for SUDs [24] and co-occurring substance use and MH disorders [25][26][27] has been validated.
In the context of COVID-19, the GAIN-SS survey has been used to investigate behavioral differences between students and nonstudents [28]. Findings demonstrated that MH issues were stable but substance use declined in youths during the pandemic. Another study found that youths in clinical settings met higher diagnostic criteria for externalizing and internalizing disorders during the pandemic than youths in the community [29]. However, research on COVID-19-related GAIN-SS behaviors across the adult population is lacking. Furthermore, the temporal relationship between the recency of GAIN-SS behaviors and subsequent SARS-CoV-2 infection remains unknown.

Goal of This Study
In this preliminary cross-sectional study, we investigated how past GAIN-SS behaviors were related to SARS-CoV-2 infection. We tested the central hypothesis that self-reported histories of high-risk behaviors, MH disorders, and substance use issues would relate to, and predict, SARS-CoV-2 infection. This central hypothesis was structured into 3 subhypotheses, and proportion tests were used to investigate the temporality between these destructive behaviors and SARS-CoV-2 infection: (1) those with any history of destructive behaviors (from 1 month to >1 year ago) would have a higher proportion of positive COVID-19 tests or diagnoses when compared with those with no history of destructive behaviors (ie, "never"), (2) those reporting destructive behaviors before the pandemic (ie, >1 year ago) would have a higher proportion of positive COVID-19 tests or diagnoses when compared with those with no history of destructive behaviors (ie, "never"), and (3) those reporting destructive behaviors before the pandemic would have a higher proportion of positive COVID-19 tests or diagnoses when compared with all other response types (ie, between 1 month and 1 year ago and "never"; see the Statistical Analysis section under Methods for details). To further test these subhypotheses, we applied multivariable logistic regression (MVLR) with iterative downsampling to investigate the efficacy of GAIN-SS behaviors in discriminating participants with and without self-reported COVID-19. Together, the presented results implicate high-risk and destructive behaviors in SARS-CoV-2 infection and suggest that increased public messaging (eg, enforcing mask wearing) at entertainment venues, clinics, and rehabilitation centers, in addition to clinical and rehabilitation-related behavioral interventions, may be important when managing similar pandemics.

Participant Recruitment
The participant recruitment procedure was first detailed in Bari et al [30]. Questionnaire responses were collected between the end of February 2021 and the beginning of March 2021, approximately 1 year following the official COVID-19 pandemic declaration in the United States (March 11, 2020) [31]. Participants between the ages of 18 and 70 years were recruited by Gold Research Inc (San Antonio, Texas) using multiple methods such as (1) by invitation only using customer databases from large companies that participate in revenue-sharing agreements, (2) via social media, or (3) through direct mail. All participants were reimbursed US $10 for their participation. Recruited respondents followed a double opt-in consent procedure to participate in the study (refer to Ethics Approval); during this process, they also provided information about demographic attributes, including age, race, and sex. This information was used to ensure that the recruited participants represented the US census at the time of the survey (February-March 2021). During the study, the respondents were also prompted with repeated test questions to screen out those providing random and illogical responses and those showing flatline or speeder behavior. Data from those flagged as nonadherers were removed. To ensure adequate samples of participants with MH conditions, Gold Research oversampled 15% (7500/50,000) of the sample for MH conditions. Gold Research reported that >50,000 respondents were contacted to complete the questionnaire. Gold Research estimated that of the 50,000 participants, >37,500 (>75%) either did not respond or declined participation. Of the remaining 25% (12,500/50,000) who clicked on the survey link, >50% (>6250/12,500) did not fully complete the questionnaire. Of the >48% (≥6000/12,500) of participants who completed the survey, those who did not clear data integrity assessments were omitted. 
The participants meeting quality assurance procedures (including survey completion) were selected, with a limit of 500 to 520 total participants. Eligible participants were required to be between the ages of 18 and 70 years at the time of the survey, to be able to comprehend the English language, and to have access to an electronic device (eg, laptop or cell phone). The participants provided informed consent as described in Ethics Approval.

Ethics Approval
All the participants provided informed consent, including for their primary participation in the study and the secondary use of their anonymized, deidentified data (ie, all identifying information was removed by Gold Research Inc before retrieval by the research group) in secondary analyses (refer to the consent prompts given in the next paragraph). The study was approved by Northwestern University's institutional review board and was in accordance with the Declaration of Helsinki (approval number: STU00213665).
During initial recruitment, the participants were presented with the following: Gold Research Inc., a national market research firm and its client, Northwestern University, request your participation in this study of emotional health. We will be evaluating how different emotions and experiences are connected and may relate to our emotional health. The information you provide will be kept confidential, coded to be anonymous so it cannot be connected back to you and will be used only for research purposes. Researchers will not be able to contact you or restudy you after this survey. We will not share your information with any other third party. We will also not use your information to identify you individually or use your responses to market or sell other services or products to you. As part of this effort, you will not be asked to provide any personal identifiers such as your name, email, phone number, address, or social media handles. A unique identifier will be generated for you and each survey participant to enhance privacy. As part of the survey process, we will be able to tell if you completed the survey, but we will not be able to tell which answers were yours. For this study, we are going to ask you some questions about yourself and how much you like or dislike a set of pictures. You may discontinue this study at any time. We appreciate your help with this study, given the serious challenges facing many people regarding emotional health at this time. We thank you in advance.

Decline
If the participants responded with "Accept," they were sent a second communication: Thank you for participating in our survey. All responses during this survey are anonymous and confidential. We will be able to tell if you completed the survey, but we will not be able to tell which answers were yours. In this study, we aim to understand how different emotions and experiences relate to visual processing.
We are going to:
* Ask you some questions about yourself
* Have you rate how much you like or dislike a set of pictures
For this study, your identity is protected and your answers are anonymous and confidential. Press "Next" to proceed.
The survey then commenced if the participants pressed "Next."

Data Quality Assurance
Data from 506 participants (age: median 44, IQR 30-59 years) passed Gold Research's integrity assessment (refer to Participant Recruitment) and were then anonymized and sent to the research team. The data were further checked for quality and assessed against three exclusion criteria: (1) participants showed minimal variance in a picture rating task (ie, all pictures were rated the same, or ratings varied only by 1 point; resulted in the removal of 16/506, 3.2% participants [data not described here]); (2) participants indicated they had ≥10 clinician-diagnosed conditions (resulted in the removal of an additional 118/506, 23.3% participants; conditions described in Figure S1 in Multimedia Appendix 1); and (3) participants' reported education level and years of education did not match and they completed the questionnaire in <500 seconds (resulted in the removal of an additional 6/506, 1.2% participants). After these procedures, 72.3% (366/506) of participants were cleared for statistical analysis (the unscored, uncoded data set can be found in Multimedia Appendix 2).

Sample Size Calculation
At the time of the survey, 10% of the participants were expected to report having had a positive COVID-19 test (referred to as test+) or a positive COVID-19 diagnosis (referred to as diagnosis). Formal power analysis for a 2-sample proportion test revealed an estimated power of 0.986 when comparing the group that responded "yes" with the group that responded "no" to test+ (test+ sample size=36, no test+ sample size=330; α=.05, hypothetical proportion of test+ sample=0.8, hypothetical proportion of no test+ sample=0.5) and a power of 0.982 when comparing the group that responded "yes" with the group that responded "no" to diagnosis (diagnosis sample size=34, no diagnosis sample size=332; α=.05, hypothetical proportion of diagnosis sample=0.8, hypothetical proportion of no diagnosis sample=0.5).
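As a rough cross-check of the power calculation described above, the following minimal sketch approximates the power of a 2-sample proportion test with unequal group sizes using Cohen's arcsine effect size and a normal approximation. The function name `power_two_proportions` is hypothetical, and the source does not state which software or formula was used, so the values from this sketch are not expected to match the reported 0.986 and 0.982 exactly.

```python
from math import asin, sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05, two_sided=True):
    """Approximate power of a 2-sample proportion test (normal approximation).

    Assumption: Cohen's arcsine effect size h is used; the original analysis
    may have used a different formula or software.
    """
    h = 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))   # arcsine-transformed difference
    n_eff = n1 * n2 / (n1 + n2)                   # accounts for unequal group sizes
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    # Probability of exceeding the critical value (the opposite tail is ignored).
    return z.cdf(abs(h) * sqrt(n_eff) - z_crit)


# Group sizes and hypothetical proportions as stated in the text.
power_test = power_two_proportions(0.8, 0.5, n1=36, n2=330)   # test+
power_dx = power_two_proportions(0.8, 0.5, n1=34, n2=332)     # diagnosis
print(f"test+: {power_test:.3f}, diagnosis: {power_dx:.3f}")
```

Under this approximation, the smaller minority group (diagnosis) yields slightly lower power than the test+ group, which agrees in direction with the values reported above.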

The Questionnaire
The participants were asked to report their age, gender, ethnicity, handedness, annual household income, employment status, level of education, and years of schooling (Table S1 in Multimedia Appendix 1). They were asked to report whether they ever tested positive for COVID-19 ("yes" or "no"; test+) and whether they were ever diagnosed with COVID-19 by a medical professional ("yes" or "no"; diagnosis). The participants were 57.9% (212/366) female, 66.9% (245/366) White, 81.9% (300/366) right-handed, and 42.6% (156/366) employed full-time, and 28.7% (105/366) reported some level of college education (mean years of school 13; Table S1 in Multimedia Appendix 1), approximating national averages for these measures at the time of the survey. Of the 366 participants, 36 (9.8%) reported "yes" to test+ and 34 (9.3%) reported "yes" to diagnosis, resembling national averages reported by the Centers for Disease Control and Prevention at the time of the survey. A total of 7.1% (26/366) of participants reported "yes" to both test+ and diagnosis.
The participants also completed the GAIN-SS questionnaire (described in the GAIN-SS Questionnaire and Scoring section) [18].

GAIN-SS Questionnaire and Scoring
The GAIN-SS questionnaire takes 3 to 5 minutes to complete and is designed to flag MH problems qualifying as (1) externalizing (eg, bullying and gambling) and internalizing (eg, fear and depression) [32], (2) substance abuse, and (3) crime and violence. We limited the GAIN-SS questionnaire to include 3 of the 4 categories to shorten the length of the overall survey: externalizing MH disorders (7 questions), substance abuse disorders (8 questions), and crime and violence problems (5 questions). All 20 questions, their abbreviated forms used hereafter, and their respective categories are outlined in Table 1.
The GAIN-SS question responses follow 2 formats. One format (ie, for externalizing behaviors) assesses whether the individual never experienced the behavior ("0") or experienced ≥2 events over 1 of 4 time blocks: "1"=experienced the behavior >1 year ago, "2"=experienced the behavior 4 to 12 months ago, "3"=experienced the behavior 2 to 3 months ago, and "4"=experienced the behavior in the past month. The other format (ie, for substance abuse and crime and violence) asks when the participants last experienced a behavior using the same time blocks (refer to "0-4" in the previous sentence). Scores were obtained for each of the 3 questionnaire categories by counting the number of times the participants responded with a "2," "3," or "4" for all questions in a category; responses of "0" and "1" were not included in the count. For example, a participant with four "0" responses, one "1" response, and two "3" responses would have a final score of 2 (ie, only the two "3" responses were counted). A final externalizing score of 0 indicates that the participant is unlikely to have a diagnosis, a score of 1 to 2 indicates a moderate likelihood of diagnosis, and a score of ≥3 indicates a high likelihood of diagnosis.

Table 1 (excerpt). GAIN-SS questions. Each question follows the stem "When was the last time that you did the following things two or more times?"
Lied or conned to get things you wanted or to avoid having to do something
Had a hard time paying attention at school, work, or home (category: Externalizing; abbreviation: Attention; Q2)
Had a hard time listening to instructions at school, work, or home (category: Externalizing; abbreviation: Listening; Q3)
Had a hard time waiting for your turn
Were a bully or threatened other people
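The category scoring rule described in this section (count responses of "2," "3," or "4"; ignore "0" and "1") can be sketched as follows. The function names `gain_ss_score` and `externalizing_likelihood` are hypothetical labels chosen for illustration; the thresholds are those stated in the text for the externalizing score.

```python
def gain_ss_score(responses):
    """Count responses of 2, 3, or 4 (behavior within the past 12 months).

    Responses of 0 ("never") and 1 (">1 year ago") do not add to the score.
    """
    for r in responses:
        if r not in (0, 1, 2, 3, 4):
            raise ValueError(f"invalid GAIN-SS response: {r!r}")
    return sum(1 for r in responses if r >= 2)


def externalizing_likelihood(score):
    """Map an externalizing score to the likelihood bands given in the text."""
    if score == 0:
        return "unlikely"
    if score <= 2:
        return "moderate"
    return "high"


# Worked example from the text: four "0", one "1", and two "3" responses.
example = [0, 0, 0, 0, 1, 3, 3]
score = gain_ss_score(example)
print(score, externalizing_likelihood(score))  # prints: 2 moderate
```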

Analysis of Demographics and GAIN-SS by Self-reported COVID-19
Demographic variables (Table S1 in Multimedia Appendix 1), GAIN-SS scores, and individual GAIN-SS question responses were assessed for differences between those who responded "yes" and those who responded "no" to test+, diagnosis, or both using the Wilcoxon rank sum test [33]. Significant categorical demographic variables were further assessed for distribution equality using the Kolmogorov-Smirnov test (α=.05) [34]. Results with significant P values (α=.05) were corrected for multiple comparisons using the Benjamini-Hochberg procedure (reported as Q values) [35]. Box plots were generated for significant results (Q value<0.05).
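For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch of how Q values can be derived from a set of P values is given below. The function name `benjamini_hochberg` and the example P values are illustrative assumptions only; they are not values from this study, and the authors' software is not specified in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (Q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone Q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Illustrative P values (not study data).
p_example = [0.01, 0.04, 0.03, 0.20]
q_example = benjamini_hochberg(p_example)
significant = [q < 0.05 for q in q_example]
print(q_example, significant)
```

In this illustration, only the smallest P value keeps a Q value below 0.05 after correction, which shows why some nominally significant P values are not reported as significant Q values.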

GAIN-SS Proportion Tests by Self-reported COVID-19
The first subhypothesis included all GAIN-SS responses ("0-4"). It tested whether the participants who exhibited GAIN-SS behaviors at any prior time (past month, 2 to 3 months ago, 4 to 12 months ago, and >1 year ago) were more likely to respond "yes" to having had a positive COVID-19 test (test+) or diagnosis than the participants who never exhibited these behaviors. The response "never" was coded as 0, and all other responses (past month, 2 to 3 months, 4 to 12 months, and >1 year) were coded as 1. This subhypothesis is referred to as never vs. anytime henceforth.
The second subhypothesis excluded the participants who exhibited GAIN-SS behaviors more recently (ie, past month to 12 months ago). To investigate whether the participants who exhibited GAIN-SS-related behaviors >1 year ago (ie, before the pandemic) were more likely to report "yes" for COVID-19, "never" was coded as 0, and "1+ years ago" was coded as 1. This subhypothesis is referred to as never vs. >1 year henceforth.
The third subhypothesis included the participants who exhibited GAIN-SS behaviors more recently (ie, past month to 12 months ago). It tested whether the participants who exhibited GAIN-SS-related behaviors >1 year ago were more likely to report "yes" for COVID-19. For this case, "never" and more recent responses (ie, past month, 2 to 3 months, and 4 to 12 months) were coded as 0, and "1+ years ago" was coded as 1. This subhypothesis is referred to as anytime and never vs. >1 year henceforth.
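The three coding schemes described above can be summarized in a short sketch. The scheme labels and the function name `code_response` are hypothetical; the mapping itself follows the text, with `None` marking responses that are excluded under never vs. >1 year.

```python
NEVER = 0          # GAIN-SS response "never"
OVER_1_YEAR = 1    # GAIN-SS response ">1 year ago"; 2-4 = within the past 12 months


def code_response(response, scheme):
    """Return the 0/1 coding for a GAIN-SS response, or None if excluded."""
    if response not in (0, 1, 2, 3, 4):
        raise ValueError(f"invalid GAIN-SS response: {response!r}")
    if scheme == "never_vs_anytime":
        return 0 if response == NEVER else 1
    if scheme == "never_vs_gt1year":
        if response == NEVER:
            return 0
        if response == OVER_1_YEAR:
            return 1
        return None  # more recent responses are dropped for this subhypothesis
    if scheme == "anytime_and_never_vs_gt1year":
        return 1 if response == OVER_1_YEAR else 0
    raise ValueError(f"unknown scheme: {scheme!r}")


for scheme in ("never_vs_anytime", "never_vs_gt1year", "anytime_and_never_vs_gt1year"):
    print(scheme, [code_response(r, scheme) for r in range(5)])
```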
For each subhypothesis, proportion tests were performed to obtain both nondirectional and directional P values (α=.05). For the case where the participants who exhibited GAIN-SS behaviors had a higher proportion of "yes" responses to test+ or diagnosis (P [YES>NO]), a Benjamini-Hochberg correction was performed to obtain Q values (Q [YES>NO]). GAIN-SS questions implicated with higher proportions of "yes" responses to test+ and diagnosis (Q [YES>NO] <0.05) were further analyzed using MVLR. Refer to Multimedia Appendix 1 for a more complete description of all 3 subhypotheses.
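A minimal sketch of a 2-sample proportion test that returns both a nondirectional and a directional P value is shown below. It uses a pooled z test, which is an assumption: the text does not specify the exact test variant or software. The function name `two_proportion_z_test` and the counts in the example are invented for illustration and are not study results.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Pooled 2-sample z test for proportions.

    Group a: participants coded 1 (exhibited the behavior).
    Group b: participants coded 0.
    Returns (z, two-sided P, one-sided P for proportion a > proportion b).
    """
    p_a, p_b = yes_a / n_a, yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p_a - p_b) / se
    norm = NormalDist()
    p_two_sided = 2 * (1 - norm.cdf(abs(z)))
    p_greater = 1 - norm.cdf(z)  # directional: P [YES>NO] in the text's notation
    return z, p_two_sided, p_greater


# Invented counts: 15 of 50 coded-1 participants vs 20 of 200 coded-0 participants
# reporting "yes" to COVID-19.
z, p_two_sided, p_greater = two_proportion_z_test(15, 50, 20, 200)
print(f"z={z:.2f}, two-sided P={p_two_sided:.4f}, P[YES>NO]={p_greater:.4f}")
```

The directional P values from all questions would then be passed to a Benjamini-Hochberg correction to obtain Q [YES>NO].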

MVLR and Iterative Downsampling: Using GAIN-SS to Predict COVID-19
MVLR was performed for each of the 3 subhypotheses using demographic variables that significantly differed between those who responded "yes" and those who responded "no" to test+ and diagnosis, as well as significant GAIN-SS behaviors as determined from proportion tests (refer to the previous section). Self-reported COVID-19 (where "yes"=1 and "no"=0) was a binary dependent variable, demographics were covariates (age, income, and education level), and GAIN-SS responses (ie, responses that significantly differed in each corresponding proportion test) were the independent variables. Because more recent time point responses were dropped when testing never vs. >1 year, resulting in different response distributions to "1+ years ago" for each question, GAIN-SS behaviors were analyzed independently for this subhypothesis.
Because the percentage of participants who responded "yes" to test+, diagnosis, or both (33-37/366, 9%-10%) was much smaller than the percentage of those who responded "no" (329-333/366, 89.9%-91%), the following procedures were performed to avoid overfitting the MVLR models to the majority class (ie, participants who responded "no" to test+ and diagnosis). Data from the majority class were randomly downsampled to match the sample size of those who self-reported COVID-19 ("yes": 36/366, 9.8% for test+ model and 34/366, 9.3% for diagnosis model). Downsampling was iterated 1000 times to obtain better estimates of the measures reported. MVLR was run at each iteration for the downsampled data to obtain model accuracy, root mean square error, mean absolute error, area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Average measures across all iterations were reported. Crude and adjusted odds ratios with respective logit estimates, z scores, P values, and SEs from 1 iteration were also reported for representative models. Please note that, given the sample size of the minority class, MVLR was run without separate training and test sets. The accuracy was computed by dividing the number of times the model correctly determined the binary outcome by the size of the downsampled data.
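The iterative downsampling procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the logistic regression is fit with plain gradient descent, the data are synthetic, only 50 iterations are run for speed (the study used 1000), and all function names are hypothetical. As in the text, each model is evaluated on the same downsampled data it was fit to.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; w[0] is the intercept."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def predict(w, xi):
    """Classify as 1 when the fitted probability is at least 0.5."""
    return 1 if w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) >= 0 else 0


def confusion_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    ratio = lambda a, b: a / b if b else 0.0
    return {
        "accuracy": ratio(tp + tn, len(pairs)),  # correct / downsampled size
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def downsampled_metrics(X, y, n_iter=1000, seed=0):
    """Average in-sample metrics over repeated majority-class downsampling."""
    rng = random.Random(seed)
    minority = [i for i, yi in enumerate(y) if yi == 1]
    majority = [i for i, yi in enumerate(y) if yi == 0]
    totals = {}
    for _ in range(n_iter):
        idx = minority + rng.sample(majority, len(minority))
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        w = fit_logistic(Xs, ys)
        metrics = confusion_metrics(ys, [predict(w, xi) for xi in Xs])
        for key, value in metrics.items():
            totals[key] = totals.get(key, 0.0) + value
    return {key: value / n_iter for key, value in totals.items()}


# Synthetic data (not study data): 36 "yes" and 330 "no" participants,
# one informative binary predictor and one uninformative binary predictor.
pos = [[1 if i < 25 else 0, i % 2] for i in range(36)]
neg = [[1 if i % 10 == 0 else 0, i % 2] for i in range(330)]
X, y = pos + neg, [1] * 36 + [0] * 330
avg = downsampled_metrics(X, y, n_iter=50)
print({key: round(value, 3) for key, value in avg.items()})
```

Because no held-out test set is used, the averaged accuracy from a procedure like this describes how well the model discriminates within the sample rather than how well it would predict new cases.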

Demographic Variables Varied by Self-reported COVID-19
Age and income significantly varied by test+, whereas age, income, and education level significantly varied by diagnosis (Q=0.024 for test+ and Q=0.020 for diagnosis; Figure 1, Table 3; all results reported in Table S2 in Multimedia Appendix 1). The participants who responded "yes" to test+ and diagnosis were, on average, younger than those who responded "no" (Figure 1A). Specifically, middle-aged adults more frequently reported "yes" to test+ (median 37, IQR 25-47 years) and diagnosis (median 37.5, IQR 25-45 years), as compared with those who responded "no" (median 45, IQR 31-59 years). The participants who responded "no" to test+ or diagnosis fell on a left-skewed distribution, implying a higher percentage of low self-reported annual household income as compared with the participants who responded "yes" (Wilcoxon rank sum test Q=0.02 for test+ and Q<0.001 for diagnosis; Figure 1B). Those who responded "yes" to test+ or diagnosis exhibited a bimodal distribution of education level, whereas the distribution for "no" responses was skewed left (Wilcoxon rank sum test P=.02; Q=0.08 for test+ and Q=0.04 for diagnosis; Figure 1C). The Kolmogorov-Smirnov test confirmed the differences between the "yes" and "no" distributions (Figure 1D). Income distributions were different between those who responded "yes" and those who responded "no" to test+ (P=.004) and diagnosis (P=.001); specifically, more persons responded "yes" to COVID-19 in the higher income categories. Education level distributions were different between those who responded "yes" and those who responded "no" to test+ (P=.04) and diagnosis (P=.01); specifically, more persons with higher reported education responded "yes" to COVID-19.
Table 3 and Figure S2 in Multimedia Appendix 1). Responses to a larger set of questions (18 out of 20 total questions) varied by diagnosis (refer to Table 3 for all Q<0.05; Figure S2 in Multimedia Appendix 1). The complete set of results is reported in Table S3 in Multimedia Appendix 1.

Participants Who Exhibited GAIN-SS Behaviors Reported Higher Proportions of COVID-19
Testing the 3 subhypotheses produced multiple outcomes. The coding procedure for each respective subhypothesis can be found in Table 4.
For never vs. anytime, there were 23 significant proportion test results (refer to Table 5 for all Q<0.05), where participants who exhibited GAIN-SS behaviors at any prior time (ie, past month, 2 to 3 months ago, 4 to 12 months ago, and >1 year ago) had a higher proportion of "yes" responses than "no" responses to test+ and diagnosis compared with those responding "no" (refer to Table 5 for all Q [YES>NO]). Most results (17/23, 74%) involved diagnosis.
For never vs. >1 year, there were 8 significant results (refer to Table 6 for all Q<0.05), where participants who exhibited GAIN-SS behaviors >1 year ago had a higher proportion of "yes" responses than "no" responses to test+ and diagnosis compared with those responding "no."
For anytime and never vs. >1 year, the significant results (refer to Table 7 for all Q<0.05) were those where the participants who exhibited GAIN-SS behaviors >1 year ago had a higher proportion of "yes" responses than "no" responses to test+ and diagnosis when compared with those responding "no" (refer to Table 7 for all Q [YES>NO]).
Table 4. The coding criteria for each proportion test hypothesis: (1) never versus anytime, (2) never versus >1 year ago, and (3) anytime and never versus >1 year ago. "Yes" responses to test+ and diagnosis were coded as 0, and "no" responses to test+ and diagnosis were coded as 1.

A Subset of GAIN-SS Behaviors Predicted Self-reported COVID-19
MVLR tested the efficacy of using GAIN-SS behaviors to predict "yes/no" responses to test+ and diagnosis. MVLR models were run using significant GAIN-SS behaviors from each of the 3 proportion tests (Tables 5-7). The covariates age, income, and education level were also included in each model based on Wilcoxon rank sum test results (Table 3).
Model accuracies ranged between 77.42% and 99.55%, where the model accuracy for predicting diagnosis was consistently higher. The inclusion of covariates in the model was important; however, they were not independently responsible for the high accuracies observed when GAIN-SS behaviors were also included in the model (Table S4 in Multimedia Appendix 1).
Odds ratios and related metrics can be found in Figure S3 in Multimedia Appendix 1.

Principal Findings
This study produced 3 main findings using a population sample of 366 participants (varying samples of 70 participants were included in MVLR analyses after downsampling to the minority class). First, self-reported COVID-19 was more common in younger persons with diverse income and education levels. Second, those who self-reported COVID-19 were more likely to report prior destructive and high-risk behaviors. Third, prior history of destructive and high-risk behaviors accurately modeled self-reported COVID-19. The participants who reported a history of risk-taking and destructive behaviors (in particular, gambling and drug selling) were more likely to contract SARS-CoV-2, and thus a history of risk-taking, in the absence of current risk-taking, accurately discriminated between participants with and participants without self-reported COVID-19. These findings support the hypothesis that prior risk-taking behaviors can predict later SARS-CoV-2 infection.

SARS-CoV-2 Infection Is More Common in Younger Persons With Diverse Incomes and Education Levels
In our sample, age, income, and education level significantly varied by self-reported COVID-19 (test+/diagnosis="yes/no"). Middle-aged adults (approximately between the ages of 25 and 45 years) more frequently reported "yes" to test+ and diagnosis compared with older adults (aged >45 years). These results contrast with some reports of SARS-CoV-2 incidence [36] but support other studies where younger to middle-aged persons were more likely to contract SARS-CoV-2 [37]. This observation could be the result of many factors, including vaccine availability, which was initially prioritized to older adults and susceptible populations up until it was more widely available in the late winter or early spring of 2021 (around the time this survey was administered) [38]. Older adults may also be more likely to follow prevention protocols (eg, mask wearing) than younger adults with more social contacts and fewer feelings of vulnerability [39].
Annual household income distributions varied between those who self-reported COVID-19 and those who did not. Those who responded "yes" to test+ and diagnosis displayed more Gaussian-like distributions than those who responded "no," among whom the distributions were skewed left toward lower income levels. In our sample, SARS-CoV-2 infection occurred in persons with a wide range of income levels and did not preferentially affect those with lower or higher incomes. These findings are consistent with the divergence of findings regarding income and COVID-19 incidence in the United States. Some studies reported that individuals living in higher income households (>US $75,000 annually) [37] or counties [40] had a greater probability of contracting SARS-CoV-2 in the United States, whereas other studies reported higher SARS-CoV-2 incidence and severity in lower income households [41][42][43][44][45][46].
Education level followed a bimodal-like distribution in those who responded "yes" for COVID-19, whereas the distribution skewed left for those who responded "no." These results suggest that education level did not predispose persons to SARS-CoV-2 infection, although the percentage of those with higher levels of education was greater among those who reported SARS-CoV-2 infection than among those who did not. A report by Rattay et al [47] demonstrated that low education was associated with higher perceived COVID-19 severity and lower perceived probability of infection, although the differences were small and the authors reiterated the importance of risk messaging to all persons, regardless of their education level. A UK study reported a higher risk for COVID-19 in those with the lowest education level [48], but another study in China reported a higher percentage of COVID-19 infection in those with a college education or higher level of education [49]. In general, the relationships between education level and COVID-19 susceptibility remain unclear, and many other confounding factors (eg, age and income) may drive these observations.

Destructive and Risk-Taking Behaviors Were More Common in Persons Who Reported COVID-19
Both overall scores (externalizing, substance abuse, and crime and violence) and individual GAIN-SS responses differed between those who responded "yes" and those who responded "no" for COVID-19. Scores were higher and individual behaviors were more frequently reported for all time blocks (as compared with never) in those who responded "yes" for COVID-19. These results suggest that those exhibiting destructive behaviors more frequently, or more recently, may be more likely to contract SARS-CoV-2.
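A comparison of subscale scores between the 2 COVID-19 response groups can be illustrated with a rank-based test. The abstract names the Wilcoxon rank sum test; the sketch below is one possible implementation, assuming a two-sided test with a normal approximation, a tie correction, and no continuity correction. The function name and the toy scores are hypothetical and are not study data.

```python
import math

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test.

    Uses the normal approximation with a tie correction and no continuity
    correction (assumptions of this sketch). Returns (U for group_a, z, p).
    """
    combined = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                             # size of this tie block
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return u, z, p_value

# Hypothetical subscale scores (counts of endorsed items), not study data.
scores_yes = [3, 4, 5, 6, 7]   # responded "yes" for COVID-19
scores_no = [0, 0, 1, 1, 2]    # responded "no" for COVID-19
u, z, p_value = rank_sum_test(scores_yes, scores_no)
print(u, round(z, 2), round(p_value, 4))
```

With samples as small as the toy data, an exact test would normally be preferred; the normal approximation is used here only to keep the sketch short.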
Among destructive and risk-taking behaviors, those who reported COVID-19 had higher proportions of gambling and drug selling behaviors across all 3 subhypotheses (Tables 5-7). These observations support multiple studies that highlight increased gaming, web-based shopping, and web-based gambling behaviors during the SARS-CoV-2 pandemic [6,50,51]. However, other reports found that gambling behaviors decreased on average, although those with prior gambling problems reported an increase in their gambling [52,53]. In general, increases in web-based gambling were associated with COVID-19-related anxiety [54], feelings of isolation, and attempts to counter negative emotions (eg, being upset or restless) [55]. Many studies also reported increased COVID-19 risk in those with underlying SUDs [9,56,57], those with increased risky drug-seeking behaviors [58], and those with increased drug and alcohol use [59]. Our results suggest that those who gambled or sold drugs before or during the pandemic were at an increased risk for COVID-19, supporting the prior literature.
For each subhypothesis, higher proportions of COVID-19 were observed with various other destructive behaviors. The subhypothesis never vs. anytime tested whether any history of destructive behaviors (between 1 month and >1 year ago) resulted in a higher proportion of positive COVID-19 tests or diagnoses at the time of the survey. The proportion of test+ was higher with reports of 6 destructive behaviors: attention problems, bullying, gambling, violence, selling drugs, and destruction of property. The proportion of diagnosis was higher with reported histories of 17 of the 20 destructive behaviors. That is, those who reported any temporal history of these behaviors (between 1 month and >1 year ago) had higher proportions of positive COVID-19 tests or diagnoses. However, it is difficult to discern whether the behaviors themselves influenced SARS-CoV-2 infection or whether SARS-CoV-2 infection increased the propensity for these behaviors. Future work should include the dates of SARS-CoV-2 infection to aid interpretability, although the data would remain retrospective.
Subhypothesis never vs. >1 year tested whether those with a history of destructive behaviors before the pandemic (>1 year ago) had a higher proportion of positive COVID-19 tests or diagnoses than those who never experienced destructive behaviors. Those who reported prepandemic gambling and drug selling reported a higher proportion of positive COVID-19 tests (test+; Table 6), and those who reported prepandemic attention problems, bullying, gambling, isolation related to substance use, and destruction of property reported more COVID-19 diagnoses (diagnosis; Table 6). These findings suggest that the participants who exhibited these behaviors before the pandemic had higher reports of SARS-CoV-2 infection. Research on other infectious diseases (eg, sexually transmitted diseases) has implicated similar risky or destructive behaviors in infection risk, including antisocial behaviors [11,12], rebelliousness [10], deviant behavior, aggression, and alcohol and drug use [13,15].
Subhypothesis anytime and never vs. >1 year tested whether a history of destructive behaviors before the pandemic (>1 year ago) resulted in a higher proportion of positive COVID-19 tests or diagnoses than those who experienced behaviors during the pandemic or those who never experienced destructive behaviors. The results closely mirrored those from the subhypothesis never vs. >1 year, although fewer behaviors were observed overall (Table 7). These results emphasize the pervasiveness of gambling and drug selling behaviors across all 3 subhypotheses and suggest that major problems relating to gambling and illegal drug distribution may greatly impact a person's risk of SARS-CoV-2 infection.
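The grouping behind the 3 subhypotheses can be sketched as follows. This is a minimal illustration, not the study's analysis code: the numeric response codes, the function names, the choice of a one-sided pooled two-proportion z test, and the toy counts are all assumptions made for the example.

```python
import math

# Hypothetical codes for the temporal response scale (an assumption of this sketch).
NEVER, OVER_1_YEAR, MONTHS_2_TO_12, PAST_MONTH = 0, 1, 2, 3

def split_groups(records, subhypothesis):
    """Split (response, covid) pairs into (exposed, reference) COVID-19 indicators."""
    exposed, reference = [], []
    for response, covid in records:
        if subhypothesis == "never_vs_anytime":
            (reference if response == NEVER else exposed).append(covid)
        elif subhypothesis == "never_vs_over_1_year":
            if response == NEVER:
                reference.append(covid)
            elif response == OVER_1_YEAR:
                exposed.append(covid)
            # responses within the past year are excluded from this comparison
        elif subhypothesis == "anytime_and_never_vs_over_1_year":
            (exposed if response == OVER_1_YEAR else reference).append(covid)
        else:
            raise ValueError(f"unknown subhypothesis: {subhypothesis}")
    return exposed, reference

def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided pooled z test of H1: p1 > p2. Returns (z, p_value)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0  # no variation in the outcome; nothing to test
    z = (x1 / n1 - x2 / n2) / se
    return z, 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability

# Toy data only (not study data): (response code, 1 if COVID-19 reported else 0).
toy_records = ([(NEVER, 0)] * 90 + [(NEVER, 1)] * 5
               + [(OVER_1_YEAR, 0)] * 12 + [(OVER_1_YEAR, 1)] * 6
               + [(PAST_MONTH, 0)] * 8 + [(PAST_MONTH, 1)] * 4)

exposed, reference = split_groups(toy_records, "never_vs_over_1_year")
z, p_value = two_proportion_z_test(sum(exposed), len(exposed),
                                   sum(reference), len(reference))
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

In an analysis covering all 20 GAIN-SS items, the resulting P values would also need a multiple-comparison correction; that step is omitted here.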
Results from these proportion tests were subsequently used to build MVLR models and test how well GAIN-SS behaviors and covariates modeled, or predicted, the incidence of COVID-19.

Destructive Behaviors Predict COVID-19 Infection
MVLR with iterative downsampling was used to test the predictive accuracy of destructive behaviors in modeling SARS-CoV-2 infection (Table 8). Ample research suggests that MVLR can either outperform or produce similar results to those produced by other machine learning (ML) approaches [60,61]. Proportion test results were used to select the predictors (independent variables) for each model. For all models, accuracies, sensitivities, specificities, PPVs, NPVs, and AUROCs were higher when modeling diagnosis than when modeling test+, and the behaviors included in test+ models were always a subset of those included in diagnosis models. The outperformance of diagnosis models could reflect (1) the limited availability of COVID-19 tests at the start of the pandemic and (2) the frequency of false positive COVID-19 tests. The participants who falsely tested positive for SARS-CoV-2 but did not exhibit GAIN-SS behaviors may have been included in the analyses, thereby skewing the data. Other scenarios are also possible, in which participants, whether or not they reported GAIN-SS behaviors, may have (1) been infected but not tested for SARS-CoV-2 or (2) been tested but received a false negative result. Given these considerations, a clinical diagnosis by a physician may have more accurately represented the true incidence of SARS-CoV-2 infection in this cohort.
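The general idea of logistic regression with iterative downsampling can be sketched as below. This is an illustration under stated assumptions rather than the study's implementation: the model is fit by plain gradient descent, metrics are computed in-sample on each balanced subsample (no separate test set), only 50 resamples are drawn to keep the example fast, and all function names and the synthetic data are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict(w, xi):
    """Classify as 1 when the fitted probability is at least 0.5."""
    return 1 if w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) >= 0 else 0

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {"accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

def downsampled_logistic(X, y, n_iter=1000, seed=0):
    """Average in-sample metrics over repeated balanced downsamples of the majority class."""
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    totals = {"accuracy": 0.0, "sensitivity": 0.0, "specificity": 0.0}
    for _ in range(n_iter):
        # Keep every minority case; draw an equally sized random majority subset.
        idx = minority + rng.sample(majority, len(minority))
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        w = fit_logistic(Xs, ys)
        m = classification_metrics(ys, [predict(w, xi) for xi in Xs])
        for key in totals:
            totals[key] += m[key]
    return {key: value / n_iter for key, value in totals.items()}

# Synthetic data only: 35 "yes" and 331 "no" out of 366, with 2 binary predictors
# [behavior reported, younger age group]. The counts are illustrative, not study data.
X = ([[1, i % 2] for i in range(28)] + [[0, i % 2] for i in range(7)]
     + [[1, i % 2] for i in range(33)] + [[0, i % 2] for i in range(298)])
y = [1] * 35 + [0] * 331
mean_metrics = downsampled_logistic(X, y, n_iter=50)  # 50 resamples keep the demo fast
print(mean_metrics)
```

Each balanced subsample here contains 70 records (35 per class), matching the order of magnitude described for the MVLR analyses. Because the metrics are computed on the same records used for fitting, they describe how well the predictors discriminate within the sample and should not be read as out-of-sample predictive performance.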
Overall, the highest accuracy for modeling diagnosis was 99.55% and resulted when "attention problems" was included as a predictor in the never vs. >1 year model (Table 8). Previous research has implicated attention-deficit/hyperactivity disorder in COVID-19 risk, particularly in women [9]. This has been linked to the fact that those struggling with attention-deficit/hyperactivity disorder may have poorer access to health care, be living in population-dense environments, or have comorbid conditions [62]. Our results support these findings and demonstrate that a history of attention problems, as identified with the GAIN-SS, is an important predictor of COVID-19 diagnosis. This result was not confounded by covariates, given that the highest accuracy was 77.9% when only age, income, and education level were included in the model (Table S4 in Multimedia Appendix 1).
Gambling (accuracy=83.1%) and bullying (accuracy=83.7%) were also important predictors of diagnosis in the never vs. >1 year model. Problematic gambling and its relation to COVID-19 were discussed in the previous section. Although there is ample evidence of increased bullying during the COVID-19 pandemic [63,64], there is a lack of research identifying bullying behaviors as a potential COVID-19 risk factor. Given its importance in predicting COVID-19 diagnosis, we posit that persons who demonstrate this destructive behavior are more willing to risk the repercussions of their actions, which could translate to participating in activities that increase their risk for infection.
Accuracy was also high (95.1%) when 17 of the 20 destructive behaviors were included in the never vs. anytime model, suggesting that a wide variety of destructive behaviors, exhibited both before and during the pandemic, may be important COVID-19 risk factors. However, this data set lacked calendar dates for the reported positive COVID-19 tests (test+) and diagnoses (diagnosis), making it difficult to ascertain whether these behaviors were risk factors for, or consequences of, SARS-CoV-2 infection.

Limitations
This sample consisted of 366 participants, which is small for a population sample, and the proportion of those who responded "yes" for COVID-19 was approximately 10% (34-36/366), consistent with population estimates of COVID-19 in the United States at the time of data collection. Future work needs to assess larger population samples. Our sample was predominantly White (245/366, 66.9%), which is close to the current population estimates. Future work with a larger sample could ensure adequate sampling of population diversity. Downsampling may also be regarded as a limitation, as it reduces the number of training samples in the majority class to balance the counts of the target categories. By removing some of the collected data, valuable information may be lost. However, resampling the downsampled majority class 1000 times facilitates sampling across the entire distribution, which may prevent some loss of information. The average across these resampled majority classes can thus represent the larger distribution. Finally, MVLR results do not represent true prediction, which would require adequately sized training and test sets; however, ample research demonstrates cases where MVLR either outperforms or mirrors results from widely used ML approaches [60,61]. Future work should incorporate larger sample sizes and should implement alternative ML approaches. These caveats aside, it must be noted that this study used iterative resampling to mitigate a major problem that is common in current ML papers, namely overfitting [65,66].

Conclusions
Results from this preliminary study implicate destructive and risk-taking behaviors in contracting SARS-CoV-2, specifically when the participants exhibited such behaviors before the pandemic. In general, gambling and selling drugs were most consistently observed in these relationships, but when modeling COVID-19 diagnosis, attention problems were also observed. The relevance of destructive and risk-taking behaviors in infection prediction suggests the importance of mitigation-related public messaging (eg, announcements and posters) at drug treatment centers and organizations involved in gambling (eg, casinos and bars). Making behavioral interventions more broadly available for those with destructive-type behavioral issues might also be important in this regard. Future work is needed to assess how these risk-taking behaviors might relate to risk-taking in the context of SARS-CoV-2 infection (eg, attending large gatherings or not wearing a mask).