Published in Vol 8 (2024)

Using #ActuallyAutistic on Twitter for Precision Diagnosis of Autism Spectrum Disorder: Machine Learning Study


Authors of this article:

Aditi Jaiswal; Peter Washington

Original Paper

Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States

Corresponding Author:

Aditi Jaiswal, MS

Department of Information and Computer Sciences

University of Hawaii at Manoa

Room 312C, Pacific Ocean Science and Technology

1680 East-West Road

Honolulu, HI, 96822

United States

Phone: 1 8088296359


This is a corrected version.

Background: The increasing use of social media platforms has given rise to an unprecedented surge in user-generated content, with millions of individuals publicly sharing their thoughts, experiences, and health-related information. Social media can serve as a useful means to study and understand public health. Twitter (subsequently rebranded as “X”) is one such social media platform that has proven to be a valuable source of rich information for both the general public and health officials. We conducted the first study applying Twitter data mining to autism screening.

Objective: We aimed to study the feasibility of autism screening from Twitter data and discuss the ethical implications of such models.

Methods: We developed a machine learning model to attempt to distinguish individuals with autism from their neurotypical peers based on the textual patterns from their public communications on Twitter. We collected 6,515,470 tweets from users who self-identified with autism using “#ActuallyAutistic” and from a separate control group. To construct the data set, we targeted English-language tweets using the search query “#ActuallyAutistic” posted from January 1, 2014, to December 31, 2022. We encrypted all user IDs and stripped the tweets of identifiable information such as the associated email address prior to analysis. From these tweets, we identified unique users who used keywords such as “autism” OR “autistic” OR “neurodiverse” in their profile description and collected all the tweets from their timelines. To build the control group data set, we formulated a search query excluding the hashtag “#ActuallyAutistic” and collected 1000 tweets per day during the same time period. We used word2vec representations and an attention-based, bidirectional long short-term memory model to build and evaluate per-tweet and per-profile classification models, respectively. We deleted the data set and the models after our analysis.

Results: Our tweet classifier reached 73% accuracy, an area under the receiver operating characteristic curve of 0.728, and an F1-score of 0.71 using word2vec representations fed into a logistic regression model, while the user profile classifier achieved an area under the receiver operating characteristic curve of 0.78 and an F1-score of 0.805 using an attention-based, bidirectional long short-term memory model.

Conclusions: We have shown that it is feasible to train machine learning models using social media data to predict use of the #ActuallyAutistic hashtag, an imperfect proxy for self-reported autism. While analyzing textual differences in naturalistic text has the potential to help clinicians screen for autism, there remain ethical questions that must be addressed for such research to move forward and to translate into the real world. While machine learning has the potential to improve behavioral research, there are still a plethora of ethical issues in digital phenotyping studies using social media with respect to user consent of marginalized populations. Achieving this requires a more inclusive approach during the model development process that involves the autistic community directly in the ideation and consent processes.

JMIR Form Res 2024;8:e52660



Millions of individuals are autistic. A core complexity of autism lies in its dynamic profile that changes with age, often leading to the misattribution of behavioral characteristics to other conditions such as anxiety and obsessive-compulsive disorder [1,2]. Unfortunately, the limited availability of standard tests [3] leads to misdiagnoses or delayed services [4], which in turn often lead to negative outcomes later in life [5]. Social media has been proposed as a means for real-time public health monitoring, offering insights into individuals’ thoughts, emotions, behaviors, and daily struggles. Such nonclinical data can potentially enable clinicians and researchers to develop early screening tools in a less invasive manner. This digital footprint can be analyzed to study the linguistic characteristics of autism and other developmental delays [6]. However, this potential for social good may be outweighed by the salient possibility of harm.

In recent years, social media has emerged as a promising tool for mining behavioral and observational data. The collection of digital data from social media, wearable devices, and smartphones holds potential for improving health care. Research in mental health, such as identifying depression and mood changes [7-13], as well as real-time mapping of natural disasters [14,15] and of infectious disease spread and its effect on emotional health [16-23], has greatly benefited from such “digital phenotyping” studies. Among social media platforms, Twitter (subsequently rebranded as “X”), known for its concise microblogging nature with tweets limited to 280 characters, has emerged as a valuable source of personalized data, with a monthly active user base of around 450 million individuals [24].

Autism has been the subject of multiple clinical trials, reviews, and epidemiological studies conducted using behavioral features such as eye gaze [25], prosody [26], asynchronous body movement [27], facial expressions [28,29], mobile phone data [30-33], or even electroencephalograms [34]. However, only a handful of studies have used social analytical tools [35-38], especially Twitter [39-41], for investigating autism. In addition, other social networking sites such as Reddit [42-45], Facebook [46], Instagram [47,48], Flickr [49], and Sina Weibo [50] have also provided a valuable source of data for detecting and studying mental health conditions, substance abuse, and risky behaviors. Using these previous works as inspiration, we curated a novel, extensive Twitter data set to study various aspects of social communication that differentiate people with autism from their neurotypical peers on a larger scale than previous work.

Our goal was to build a classifier to aid in affordable autism screening using Twitter data, enabling support for communities with limited access to diagnostic resources. While we were able to build such a model with reasonable predictive power for a first pass at this task, we note that we did not obtain explicit consent from the study population. We therefore deleted all the data and models that we developed after the completion of our analysis. Given the potential of such research to harm user privacy and the lack of consent, we discuss the ethical implications of this research. We note that the availability of the resulting models has the potential to promote unethical practices that can occur for more malicious purposes, such as profiling of individuals by medical insurance companies, use by colleges to assess applicants, and surveillance of citizens by governments. We therefore caution researchers and practitioners against building such models without obtaining explicit consent and practicing participatory community-centered research prior to model development.


Here, we describe the data set curation process (Figure 1), preprocessing steps, and a series of analyses on the curated data. We started by analyzing the sentiments and topics within the data set to discover some qualitative insights. We then performed per-tweet and per-user classifications of autism to understand the linguistic differences between the users in the autism and control groups.

Figure 1. Pipeline for the creation of the novel Twitter (subsequently rebranded as “X”) autism data set. API: application programming interface; VADER: Valence Aware Dictionary for Sentiment Reasoning.

Data Collection

In recent years, hashtags such as #MeToo, #BlackLivesMatter, and #StopAsianHate have played significant roles in promoting social movements and campaigns, including those aimed at raising awareness about specific societal issues. Within the autism community, popular hashtags such as #AutismMom and #AutismParent have represented the perspectives of neurotypical parents, significantly influencing research and policies in this domain. However, these advocacy groups often overshadow adults with autism, creating a gap in their representation within decision-making processes. To address this issue, a paradigm shift occurred in the autism rights movement through the hashtag “#ActuallyAutistic” [51,52]. This movement has emphasized understanding the experiences, challenges, and perspectives of individuals on the autism spectrum, redirecting attention toward them rather than solely focusing on caregivers.

Using the hashtag as the criterion for our corpus selection, we extracted Twitter conversations of users self-identifying with autism to study the differences in their linguistic patterns. Our data extraction involved using snscrape (JustAnotherArchivist) [53], a Python-based library allowing social media scraping without requiring personal Twitter application programming interface keys and providing powerful search functionality to help filter tweets based on various conditions, such as date-time, language, or location. We targeted English-language tweets using “#ActuallyAutistic” posted from January 1, 2014, to December 31, 2022. To identify users self-identifying with autism, we searched for keywords such as “autism,” “autistic,” or “neurodiverse” within their profile descriptions (bios). Additionally, we considered usernames and tweet contents for users who used these keywords solely in their usernames. Finally, we extracted all the tweets from the timelines of these users to construct the autism data set, which consists of 3,137,952 tweets from 17,323 individuals. Associated metadata such as username, account created, friend count, and date of tweets posted were also extracted and could be used for statistical or network analysis.

To build a tweet classifier for individuals with autism and their neurotypical peers, we collected a sample of random tweets as part of the control group. To achieve this, we formulated a search query excluding the hashtag, “-#ActuallyAutistic,” using the advanced query searching operators and methods provided by Dr Igor Brigadir [54]. However, this approach carries the risk of data leakage, whereby users who have not posted any autism-related content may possess autism-related keywords in their profile description or username. To avoid this, we screened users who had any such keywords in their profile description or usernames, or who were also present in the autism data set, and subsequently removed them from the sample. We collected 1000 control tweets per day during the same time period to obtain a total of 3,377,518 tweets across 171,273 individual users.
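The control-group screening step described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the record keys (`user_id`, `username`, `bio`), the function names, and the use of a case-insensitive substring match are all assumptions made for the example.

```python
import re

# Keywords the paper names for spotting self-identification in bios and usernames.
AUTISM_KEYWORDS = ("autism", "autistic", "neurodiverse")
_KEYWORD_RE = re.compile("|".join(AUTISM_KEYWORDS), re.IGNORECASE)


def mentions_autism(text):
    """Return True if any keyword occurs anywhere in the text (substring match)."""
    return bool(_KEYWORD_RE.search(text or ""))


def screen_control_users(control_users, autism_user_ids):
    """Drop control users who overlap with the autism set or match a keyword.

    control_users: iterable of dicts with hypothetical keys
                   'user_id', 'username', and 'bio'.
    autism_user_ids: set of user IDs already present in the autism data set.
    """
    kept = []
    for user in control_users:
        if user["user_id"] in autism_user_ids:
            continue  # same account appears in the autism data set
        if mentions_autism(user.get("username")) or mentions_autism(user.get("bio")):
            continue  # keyword in username or profile description
        kept.append(user)
    return kept


# Invented sample records, for illustration only.
sample = [
    {"user_id": "u1", "username": "catlover", "bio": "Coffee and books"},
    {"user_id": "u2", "username": "AutisticAndProud", "bio": ""},
    {"user_id": "u3", "username": "sam", "bio": "Parent. Autism advocate."},
    {"user_id": "u4", "username": "lee", "bio": None},
]
kept_ids = [u["user_id"] for u in screen_control_users(sample, autism_user_ids={"u4"})]
print(kept_ids)
```

A substring match is deliberately broad: it errs toward removing users from the control sample, which matches the stated goal of limiting leakage.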

Data Labeling

To train a supervised machine learning model effectively, labeled data that associate each data point with a respective class are crucial. We automatically labeled the tweets from the autism data set as belonging to the class “autism,” assigned label 1. All other tweets from the control group data set were labeled as belonging to the class “control group,” assigned label 0. These tweet labels were used temporarily for classification purposes and were not permanently stored in the data set. Obtaining ground-truth labels can be a costly and time-consuming process, and the performance of machine learning models often decreases as the size of the labeled data set decreases. Weak supervision approaches, which leverage partially accurate or noisy sources for annotations, can be more efficient than manual labeling.

Data Preprocessing

Working with raw, unstructured Twitter data is challenging because the conversational text contains many noisy elements, such as punctuation, abbreviations, emojis, and other stray characters. Thus, before using such data for model training, it is necessary to clean and preprocess the data, which is an essential step for any natural language processing task. We started by masking any profane language in the tweets, such as cursing or swear words, using a Python library called better-profanity [55], which is designed to flag inappropriate words using string comparison and mask them using special characters (the default setting uses “*”). While profane language can sometimes be highly emotive and help in understanding the sentiments of a text, we chose to censor any such words while classifying the tweets, as such words can be used by any individual and might not help in classification tasks. However, we considered the contribution of profane language through sentiment analysis and observed that the polarity of the sentiments was broadly similar for the clean and uncensored tweets.

We then tokenized the text into words; removed any nonalphanumeric characters, hyperlinks, user mentions, and HTML tags; and converted the word tokens to lower case to avoid data redundancy. We removed stop words, which add noise and complexity to the features without contributing meaningful information. To further simplify the input space and normalize the vocabulary, we applied stemming and lemmatization. We also removed hashtags and keywords related to autism, such as “actuallyautistic,” “autism,” “autistic,” “autismacceptance,” “autismawareness,” “askingautistics,” “askingautistic,” “neurodiversity,” “neurodivergent,” “allautistics,” “adhd,” “mentalhealth,” “diagnosis,” “autistics,” “autismpride,” and “autismspeaks,” which could introduce bias and lead to model overfitting.
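The cleaning steps above can be sketched with the Python standard library alone. This is an illustrative approximation, not the study's pipeline: the stop-word list is a tiny invented sample, only a subset of the removed keywords is shown, the order of operations is an assumption, and stemming and lemmatization are omitted because they require an external library.

```python
import html
import re

# Tiny illustrative stop-word list; the paper does not say which list was used.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "my", "i"}
# Subset of the autism-related terms the paper removes to limit label leakage.
LEAKY_TERMS = {"actuallyautistic", "autism", "autistic", "neurodivergent", "adhd"}

TAG_RE = re.compile(r"<[^>]+>")               # HTML tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # hyperlinks
MENTION_RE = re.compile(r"@\w+")               # user mentions
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")      # anything not a-z, 0-9, or whitespace


def preprocess(tweet):
    """Return a list of cleaned, lower-cased tokens for one tweet."""
    text = TAG_RE.sub(" ", tweet)
    text = html.unescape(text)          # "&amp;" -> "&"
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)  # also strips "#" from hashtags
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS and t not in LEAKY_TERMS]


example = "<b>Loving</b> my new routine! @friend https://t.co/abc #ActuallyAutistic &amp; proud"
tokens = preprocess(example)
print(tokens)
```

Removing the leaky terms after stripping the "#" symbol means that both the hashtag form and the plain-word form of each keyword are dropped.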

Sentiment Labeling

We compared the sentiments of tweets posted by individuals with autism against those from the control group in order to understand the subjective characteristics and emotional polarity around the topic. Initially, we conducted sentiment analysis on the original data set, which contained profanity. Additionally, we wanted to explore how profane language can affect the sentiments of the tweets, and thus we also conducted sentiment analysis on a pseudoclean data set after removing any profane words. Sentiment analysis commonly involves 2 approaches: machine learning and lexical. We used the Valence Aware Dictionary for Sentiment Reasoning (VADER) [56], a lexical approach specifically attuned to sentiments expressed in social media or microblog-like contexts, to analyze the sentiments of the curated data set. VADER’s lexicon and rules were developed and validated on text such as social media posts and New York Times editorials, so it requires no additional training data. VADER applies a set of rules and heuristics to the sentiment scores of the individual words to determine the overall sentiment of the sentence and returns a dictionary of negative, neutral, positive, and overall (normalized) sentiment scores for the sentence.
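To make the labeling step concrete, the sketch below maps VADER-style score dictionaries to positive, negative, and neutral labels. It does not call VADER itself. The paper does not state the cutoff it used; the ±0.05 threshold on the normalized (compound) score is the convention commonly recommended for VADER and is an assumption here, as are the function names and the invented example scores.

```python
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a normalized score in [-1, 1] to a sentiment label.

    The 0.05 cutoff is an assumed, conventional choice, not a value
    reported in the paper.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_distribution(score_dicts):
    """Return the percentage of each label over a list of score dictionaries."""
    labels = [label_from_compound(s["compound"]) for s in score_dicts]
    total = len(labels)
    counts = Counter(labels)
    if total == 0:
        return {"positive": 0.0, "negative": 0.0, "neutral": 0.0}
    return {
        label: round(100 * counts[label] / total, 1)
        for label in ("positive", "negative", "neutral")
    }


# Invented dictionaries shaped like a negative/neutral/positive/compound result.
scores = [
    {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.64},
    {"neg": 0.5, "neu": 0.5, "pos": 0.0, "compound": -0.48},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
    {"neg": 0.1, "neu": 0.6, "pos": 0.3, "compound": 0.32},
]
distribution = sentiment_distribution(scores)
print(distribution)
```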

Topic Modeling

The objective of our topic modeling analysis was to investigate whether there exist specific themes and semantic patterns that are frequently discussed in relation to autism and can offer insights beneficial for clinicians and policymakers. Topic modeling is an unsupervised learning technique used to uncover concealed topics and coherent themes within textual data. We used the Top2Vec [57] algorithm, which offers a dynamic approach to discovering topics within a corpus of text data by making use of the spatial proximity of the words.

Tweet-Level Classification

Our initial focus involved training a model specifically designed to predict autism based on the content within individual tweets. To build this tweet classifier, we identified unique users from both the autism and control data sets, allocating an 85:15 split for training and testing purposes. Data splitting by user rather than by tweet avoids data leakage, where a user’s tweets might scatter across both training and testing sets, potentially leading to overfitting by the model due to learning user-specific patterns. The tweets, with no profanity, were preprocessed as defined in the previous section and formed the training and test sets. The categorical labels, representing whether a tweet belonged to a user in the autism or control group, were used as the basis for model training and evaluation. Additionally, the training data set underwent an 85:15 split, separating it into training and validation subsets, which were used to fine-tune the model and adjust hyperparameters.
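A user-level split of this kind can be sketched as follows. The function name, the fixed random seed, and the toy data are assumptions for illustration; the point is only that users, not tweets, are shuffled and partitioned, so that no user contributes tweets to both sets.

```python
import random


def split_users(user_ids, test_fraction=0.15, seed=0):
    """Partition unique users into disjoint train and test sets."""
    users = sorted(set(user_ids))      # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(users)
    n_test = round(len(users) * test_fraction)
    return set(users[n_test:]), set(users[:n_test])


# Toy corpus: 20 hypothetical users with 5 tweets each.
tweets = [(f"user{i % 20}", f"tweet {i}") for i in range(100)]
train_users, test_users = split_users(user for user, _ in tweets)

# Assign every tweet to the side its author landed on.
train = [t for t in tweets if t[0] in train_users]
test = [t for t in tweets if t[0] in test_users]
print(len(train_users), len(test_users), len(train), len(test))
```

The same function can be applied again to the training users to carve out the validation subset.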

For text-to-numeric vectorization, we used 2 approaches: a bag-of-words term frequency–inverse document frequency (TF-IDF) method and word2vec embeddings. We started by training various classical machine learning algorithms on the TF-IDF feature representation: support vector machines, naive Bayes, logistic regression, and XGBoost (extreme gradient boosting). We used 5-fold cross-validation with accuracy as the primary evaluation metric to identify the best classification method. We then trained the best-performing algorithm on word2vec embeddings for better feature representation. This approach captures both semantic and syntactic similarities among words, and we assessed its efficacy using a more comprehensive array of evaluation metrics.
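For readers unfamiliar with TF-IDF, the following sketch computes one common textbook variant by hand: term frequency normalized by document length, multiplied by the natural logarithm of the inverse document frequency. Library implementations typically use smoothed or otherwise adjusted formulas, and the paper does not specify which variant was used, so the exact weighting here is an assumption; the toy documents are invented.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = ln(number of documents / number of documents containing t)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    ["sensory", "overload", "today"],
    ["great", "game", "today"],
    ["sensory", "friendly", "cinema"],
]
vectors = tfidf(docs)
# "overload" occurs in one document, "today" in two, so "overload" weighs more.
print(vectors[0])
```

Terms that appear in every document receive a weight of zero under this variant, which is why such weighting down-ranks ubiquitous words.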

User Profile Classification

Our subsequent task involved training a model to predict autism by considering all tweets from an individual user’s timeline. To ensure a more representative data set and prevent potential model overfitting, we isolated unique users who had shared a minimum of 5 tweets and split them in an 80:20 ratio for training and testing. The preprocessed tweets from each user were then grouped together to form an individual document. For model training, we used an attention-based, bidirectional long short-term memory (Bi-LSTM) model vectorized with a randomly initialized, self-trained embedding layer. As the tweets vary in their lengths and raw text cannot be directly represented as dense vectors in the way that images can, we used padding and an extra “unknown” token during tokenization to achieve the fixed length input and represent any unseen tokens.
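The padding and “unknown” token handling can be illustrated with a small, framework-free sketch. The token names `<pad>` and `<unk>`, the choice to pad and truncate at the end of the sequence, and the toy documents are assumptions; deep learning libraries offer equivalent utilities with their own defaults.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical reserved tokens


def build_vocab(train_docs, min_count=1):
    """Map each training token to an integer ID, reserving 0 and 1."""
    counts = Counter(tok for doc in train_docs for tok in doc)
    vocab = {PAD: 0, UNK: 1}
    for tok, count in sorted(counts.items()):  # sorted for deterministic IDs
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(doc, vocab, max_len):
    """Convert tokens to IDs, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in doc[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


train_docs = [["routine", "helps"], ["routine", "changes", "stress"]]
vocab = build_vocab(train_docs)

short = encode(["routine", "meltdown"], vocab, max_len=4)   # unseen word, padded
long = encode(["stress", "routine", "helps", "changes", "routine"], vocab, max_len=4)
print(short, long)
```

Building the vocabulary from training documents only means that any word first seen at test time maps to the unknown ID rather than leaking test information into the model.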

Ethical Considerations

While social media data can help with public health analysis by offering a less intrusive and real-time monitoring approach for disease symptomatology and public sentiments, it also poses ethical challenges by exposing the users to harm or the potential leaking of personally identifiable information. First, this study was approved by all ethics-related regulatory bodies at the University of Hawaii. The study has been approved by the University of Hawaii Institutional Review Board (2023-00248) under an expedited review procedure, and the user information was deidentified. We also ran the request through University of Hawaii institutional data governance to approve this study, where it was determined that the study is exempt from further data governance review due to the inherently public nature of the study data. We also took additional measures not required by the Institutional Review Board. Specifically, we encrypted user IDs, reducing the chances of user reidentification. We also anonymized any user mentions or personal information, such as email addresses, contained within the tweets. These steps were aligned with the ethical considerations outlined in various research studies on social media analysis [58-60].
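The paper does not specify how user IDs were encrypted or how text was anonymized. As one possible illustration, the sketch below substitutes a named, different technique: keyed hashing (HMAC-SHA256) for pseudonymizing IDs, plus regular expressions for masking email addresses and user mentions. The secret key, placeholder strings, and function names are hypothetical, and keyed hashing is one-way pseudonymization rather than reversible encryption.

```python
import hashlib
import hmac
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MENTION_RE = re.compile(r"@\w+")


def pseudonymize_user_id(user_id, secret_key):
    """Return a stable, non-reversible pseudonym for a user ID."""
    message = str(user_id).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def scrub_text(text):
    """Mask email addresses first, then any remaining @mentions."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return MENTION_RE.sub("<USER>", text)


key = b"hypothetical-secret-kept-outside-the-data-set"
alias = pseudonymize_user_id(123456789, key)
scrubbed = scrub_text("mail me at jo@example.com or ping @sam")
print(alias[:12], scrubbed)
```

Masking emails before mentions matters: otherwise the mention pattern would consume the domain part of an address and leave the local part behind.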

The public nature of social media data can often overshadow participants’ consent, leaving them unaware or unsure of the inclusion of their data in the research. Williams et al [61] observed that 84% of respondents were not at all or only slightly concerned about the use of their Twitter posts for university research. However, this leaves a considerable portion of the population who remain concerned. The conditions and privacy policies for data use are often long, with complex legal terms that the users may fail to understand or authorize, leaving them unaware of the consequences. While it can be impractical to obtain explicit consent on a per-study basis for large-scale social media analytics research, we recommend that the research community find ways to support large-scale consent procedures. This study highlights the need for a regulatory framework for social media data mining.

There remain concerns surrounding the ethics of social media analytics research on individuals with autism. While this study and previous studies typically safeguard user data by deidentifying and anonymizing metadata, there remains a risk of identifying users based on their posted content. This underscores the immediate need for the creation of ethical tools and methodologies that facilitate scientific research based on social media data while adhering to ethical principles. Due to these inherent risks, the data set and the model that we built using those data have been deleted. When the data set did exist, it was never shared outside of the original authors.

Data Records

The autism subset, collected from 17,323 self-reported individuals with autism, contains 3,137,952 tweets. The control subset, collected from 171,273 users, consists of 3,377,518 tweets. The combined data set contains the following columns: user ID (a unique value assigned to each Twitter account), profile description (a short summary of the account posted by the user), account created (date-time when the account was created), friends count (number of accounts the user follows), followers count (number of accounts the user is being followed by), tweet date (date-time when the tweet was posted), tweet ID (a unique ID assigned to each tweet), tweet text (original tweet), a list of hashtags present in each tweet, number of replies (number of times the tweet has been replied to), number of retweets (number of times the tweet was retweeted), number of likes the tweet got, and source from where the tweet was posted (web, mobile device, or app). While we focused on using tweet text as the primary source of data, other supporting metadata could be used in the future for network analysis or statistical studies.

Exploratory Data Analysis

Our analysis showed that the data set’s columns for hashtags and locations contained the highest number of missing values. Not all tweets are accompanied by hashtags or location details, and users are free to enter any location they wish on their profiles. A large portion of users either did not provide their actual location or had inconsistencies in their location entries. Among the top 20 location values identified, most were variations of “United Kingdom,” such as “UK,” “London, England,” “England, United Kingdom,” and “South East, England.” However, other entries were less informative and included phrases such as “Picnic party” and “My parent’s basement.” Because most location values were missing and to safeguard users’ personal information, we excluded the location column from the data set before using it to train any machine learning algorithms.

Further analysis of the yearly distribution of tweets revealed a rising trend in discussions related to autism across the years. This trend suggests that individuals on the autism spectrum are increasingly embracing social media platforms, potentially opening up numerous employment prospects and serving as an effective channel to educate the public about developmental delays. Additionally, sharing behavioral symptoms through social engagement could be beneficial to others to build better community support. This increased social involvement may hold significance not only in social science [62] but also in human-computer interaction research [63], offering insights to design more inclusive and efficient digital environments.

Sentiment Analysis

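As shown in Table 1, positive was the most frequent VADER sentiment among the autism group tweets (48.7% of the original and 49.8% of the clean tweets), whereas the control group tweets were more evenly divided between positive and neutral, with neutral the most frequent sentiment in the clean tweets (39.5%).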

This was supported by another interesting observation: tweets from individuals with autism comprised a higher character count compared to those from the control group (Figure 2). The histograms depicting the word counts in tweets from both groups follow similar distributions but with a substantial difference in their means. This disparity strongly suggests varying linguistic patterns between these 2 groups.

Table 1. Distribution of sentiments in the autism spectrum disorder and control group data sets.

Data set and VADER^a sentiment                    Frequency, n (%)
In original autism tweets (n=3,137,952)
    Positive                                      1,528,183 (48.7)
    Negative                                      812,730 (25.9)
    Neutral                                       797,039 (25.4)
In clean autism tweets (n=3,137,952)
    Positive                                      1,562,700 (49.8)
    Negative                                      756,247 (24.1)
    Neutral                                       819,005 (26.1)
In original control group tweets (n=3,377,518)
    Positive                                      1,280,080 (37.9)
    Negative                                      938,950 (27.8)
    Neutral                                       1,158,488 (34.3)
In clean control group tweets (n=3,377,518)
    Positive                                      1,323,987 (39.2)
    Negative                                      719,411 (21.3)
    Neutral                                       1,334,120 (39.5)

^a VADER: Valence Aware Dictionary for Sentiment Reasoning.

Figure 2. Histograms of number of characters in the tweets for the 2 groups.

Topic Modeling

Using just the autism data set, multiple topics were discovered, and the word clouds of a few topics are shown in Multimedia Appendix 1.

As can be seen, the majority of topics were related to behavioral and emotional symptoms such as “hyperactivity,” “fidgeting,” “depressed,” “anxiety,” “trembling,” and “overwhelmed.” Interestingly, a considerable number of documents also focused on terms such as “vaccine,” “therapy,” “misdiagnosis,” and “cats.” These findings may be attributed to the frequent misdiagnosis or delayed diagnosis of autism, prompting individuals to seek therapy, support, and guidance. The presence of vaccine-related discussions likely stems from misinformation and its negative impact on individuals affected by autism. However, given the time frame in which the data set was collected, it is also possible that these tweets are related to COVID-19 vaccines. Last, multiple studies [64,65] have found that children with autism are more at ease with cats due to their nonintrusive nature, lack of prolonged eye contact, and their ability to alleviate stress and interpret emotional cues.

Deriving specific topics from the control group’s Twitter conversations was challenging given their scattered and diverse nature. Most of these discussions centered around internet personalities, random conversations, specific days of the week, or special occasions such as birthdays and anniversaries. Interestingly, some broader topics related to animals surfaced in these conversations, but they were less narrowly focused than in the autistic users’ conversations, which specifically mentioned cats. Some of these posts also displayed the use of emotional words, suggesting that pets or animals may provide therapeutic benefits.

Technical Validation

The performance metrics for tweet classification are shown in Tables 2 and 3. Table 2 displays the results from TF-IDF feature representations across several classical machine learning models, while Table 3 displays the results using word2vec feature vectors trained with logistic regression. While the TF-IDF vectorization yielded similar accuracy using different machine learning algorithms for tweet classification, logistic regression was chosen as the best predictor due to its superior performance and shorter training time. The results of the word2vec model were found to be consistent with the semantic similarities of the words. For instance, “autism” exhibited higher cosine similarity to terms such as “Aspergers,” “neuroatypical,” and “autism spectrum condition,” indicating the model’s proficiency in capturing semantic relationships between words.
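Cosine similarity, the measure used above to compare word embeddings, is the dot product of two vectors divided by the product of their lengths. The sketch below computes it directly; the 3-dimensional vectors are invented for illustration and are not taken from the trained word2vec model, whose embeddings are much higher dimensional.

```python
import math


def cosine_similarity(u, v):
    """Return cos(angle) between two equal-length vectors, or 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy embeddings (invented values).
toy = {
    "autism": [0.9, 0.1, 0.2],
    "aspergers": [0.8, 0.2, 0.1],
    "football": [0.1, 0.9, 0.3],
}
related = cosine_similarity(toy["autism"], toy["aspergers"])
unrelated = cosine_similarity(toy["autism"], toy["football"])
print(round(related, 3), round(unrelated, 3))
```

Because the measure depends only on direction, two words can be judged similar even when their vectors differ in magnitude.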

Table 4 displays the results for user classification. Although there is a class imbalance in the number of users with autism versus controls, the attention-based LSTM model still seems to yield better measures, with F1-scores of 0.7 and 0.9 on the “autism” and “control group” classes, respectively, and an AUC score of 0.78.
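For reference, the per-class F1-score and the AUC reported above can be computed as in the following sketch. The labels and scores are invented to mimic an imbalanced test set and do not reproduce the study's results; the AUC is computed with its rank interpretation, that is, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 3 "autism" users (1) and 5 "control group" users (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

f1_autism = f1_score(y_true, y_pred, positive=1)
f1_control = f1_score(y_true, y_pred, positive=0)
auc = roc_auc(y_true, scores)
print(round(f1_autism, 3), round(f1_control, 3), round(auc, 3))
```

With imbalanced classes, reporting F1 separately for each class, as the paper does, shows whether the minority class is being classified well rather than hidden behind the majority.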

Table 2. Summary of results obtained for tweet classification from term frequency–inverse document frequency vectorization to identify the best algorithm based on accuracy.

Model                     Validation set accuracy
Support vector machine    0.615
Naive Bayes               0.598
Logistic regression       0.63

^a XGBoost: extreme gradient boosting.

Table 3. Summary of results obtained for tweet classification from the word2vec model using the highest performing model, logistic regression.

Metric performance on test set    Value
AUC^a score                       0.728

^a AUC: area under the receiver operating characteristic curve.

Table 4. Summary of results obtained for user classification from Keras embedding using the attention+Bi-LSTM^a model.

Metric performance on test set    Value
AUC^b score                       0.78

^a Bi-LSTM: bidirectional long short-term memory.

^b AUC: area under the receiver operating characteristic curve.


The shift in society’s reliance on social media for information, in contrast to traditional news sources, along with the immense volume of generated data, has resulted in an increased focus on the use of natural language processing for social media analytics. While research tools using facial expressions [6,66-75] and eye gazing for phenotyping autism [76,77] are promising, there exists a current deficiency in standardizing precise methods for assessing deviations from typical social interactions. The F1-scores of 0.71 in tweet classification and 0.80 in user classification signify substantial semantic distinctions in messages posted by individuals who did and did not post using the #ActuallyAutistic hashtag. Tweets by individuals using the hashtag demonstrated a higher frequency of emotional language, corroborated by the word2vec model’s stronger semantic associations among such words, reinforcing the model’s predictive capability. This finding, coupled with previous studies using computer vision models [76,78], suggests that digital phenotyping using social media could be used to support effective autism screening strategies and to facilitate early detection. Organizations such as the National Institutes of Health are actively funding research [79,80] using data from social media coupled with novel AI–based tools to improve public health surveillance and precision diagnostics, and these organizations are emphasizing the importance of maintaining ethical practices during the process.

We would also like to highlight that any social media analytics research should always be supported by ethical practices and an adherence to user privacy. As user data on social media platforms are often openly available, it is critical to obtain user consent when building AI models for marginalized communities. Without consent, use of the data may put some marginalized communities at risk of data leakage or potential misuse of their data. In the past, there have also been numerous cases of public data being used for training large language models (LLMs) without informed consent. While LLMs have revolutionized the field of AI, such practices highlight the need for regulations in the consent process and updating users about the use of their data in a simple and transparent manner. This study has helped us learn more about the use of social media for different AI-based research and the urgent need to integrate the community in the research process. Such integration will not only lead to an effective early screening tool but will also enable the maintenance of an ethical, privacy-protected system.


Limitations

There are certain limitations to consider in this study. While we focused on individuals who self-identified as autistic, their diagnoses were not clinically validated. Annotations from clinical experts or crowdsourcing could help validate these labels. Furthermore, there is a possibility of label contamination, where the identified users are not autistic but are instead family members, parents, caregivers, or advocacy organizations that belong to a different study population and still use the hashtag. However, we expect this type of contamination to be rare because of the negative social connotations of using #ActuallyAutistic without a diagnosis. Contamination may also occur in the other direction, with an individual with autism who chose not to self-identify falling into the control group. However, because autism has a prevalence rate of <3%, we expect fewer than 3% of control users to be affected, so the model performance should not degrade by more than approximately 3%. The predictive power of social media should be applied not at the individual level but at a broader cross-sectional level, possibly combined with self-reported questionnaires for enhanced accuracy in neurological studies.
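The prevalence argument above can be made concrete with a small worked example. This is a sketch of one way to read the bound, covering only accuracy measured on the control cohort; it does not model any effect of mislabeled users on training. The cohort size, the number of mislabeled users, and all variable names are hypothetical.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the given labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


# Hypothetical control cohort: 100 users recorded as control (0),
# of whom 3 are in fact autistic (1) but did not self-identify.
n_users, n_contaminated = 100, 3
recorded_labels = [0] * n_users
true_labels = [1] * n_contaminated + [0] * (n_users - n_contaminated)

# Worst case for the measured score: the classifier is right about every
# user, so each contaminated label is counted as an error.
predictions = list(true_labels)

measured = accuracy(recorded_labels, predictions)  # scored against recorded labels
actual = accuracy(true_labels, predictions)        # scored against true labels
gap = abs(actual - measured)
contamination_rate = n_contaminated / n_users

print(f"measured={measured:.2f} actual={actual:.2f} gap={gap:.2f}")
```

Each mislabeled user can change the correctness of at most one prediction, so the gap between measured and actual accuracy on this cohort cannot exceed the contamination rate, which is 3 percentage points in this hypothetical setup.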

In addition, the sentiment polarity obtained through VADER may be less accurate than human-labeled sentiment, although human sentiment is itself influenced by factors such as surroundings and politics, which makes reliable labeling challenging. Moreover, this study only considered the English language, potentially missing information from other countries or languages that could help the model make better predictions. This also raises concerns about the lack of diversity in the data [81], because English-speaking users from higher socioeconomic groups and younger adults comprise a larger portion of Twitter users and are therefore overrepresented in the data set.
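To illustrate how a polarity label is typically derived from VADER output, the sketch below maps a compound score in the range -1 to 1 to a label using the cutoffs of ±0.05 suggested in the VADER documentation. Whether this study used these cutoffs is an assumption; the function name and example scores are hypothetical.

```python
def polarity_label(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER-style compound score in [-1, 1] to a polarity label.

    The default threshold follows the cutoffs suggested in the VADER
    documentation; its use here is an assumption, not the study's setting.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores. With the vaderSentiment package, a score
# like these would come from
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
for score in (0.62, -0.31, 0.0):
    print(score, polarity_label(score))
```

A rule of this kind is fast and transparent, but it cannot capture context such as sarcasm, which is one reason lexicon-based polarity can diverge from human judgment.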

Nevertheless, the primary limitation of this study is that we were unable to obtain explicit consent from our study population. Because of this limitation, we have deleted the models and the data set, and we highlight the potential misuses of such a model by malicious actors. For example, the model could be used for admissions decisions to universities, hiring decisions, government surveillance programs, or even more nefarious purposes. We keep this paper as a case study of what is currently possible with publicly available social media data, and we encourage the research community and other AI innovators to think thoroughly about protections against harm to marginalized communities.

Future Work

We recommend that the research community pause before conducting further research on social media–based predictive models for autism. Interesting avenues for future work include (1) developing strategies for obtaining explicit consent on a large scale on social media and (2) conducting surveys of the autistic community to understand whether and how social media analytics may be useful.

Acknowledgments


The technical support and advanced computing resources from University of Hawaii Information Technology Services—Cyberinfrastructure, funded in part by the National Science Foundation Campus Cyberinfrastructure awards #2201428 and #2232862, are gratefully acknowledged. We used the generative artificial intelligence tool ChatGPT by OpenAI [82] only to edit the grammar of the manuscript.

Authors' Contributions

AJ was responsible for data collection, data analysis, and writing the original draft of the manuscript. PW was responsible for conceptualization, supervision, and manuscript review and editing.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Topics observed in autism spectrum disorder data set using the Top2Vec algorithm.

DOCX File, 221 KB

  1. Cath DC, Ran N, Smit JH, van Balkom AJLM, Comijs HC. Symptom overlap between autism spectrum disorder, generalized social anxiety disorder and obsessive-compulsive disorder in adults: a preliminary case-controlled study. Psychopathology. 2008;41(2):101-110.
  2. Zandt F, Prior M, Kyrios M. Repetitive behaviour in children with high functioning autism and obsessive compulsive disorder. J Autism Dev Disord. 2007;37(2):251-259.
  3. Ning M, Daniels J, Schwartz J, Dunlap K, Washington P, Kalantarian H, et al. Identification and quantification of gaps in access to autism resources in the United States: an infodemiological study. J Med Internet Res. 2019;21(7):e13094.
  4. Lord C, Risi S, DiLavore PS, Shulman C, Thurm A, Pickles A. Autism from 2 to 9 years of age. Arch Gen Psychiatry. 2006;63(6):694-701.
  5. Weinstock CP. The deep emotional ties between depression and autism. Spectrum. 2019. URL: https:/​/www.​​features/​deep-dive/​the-deep-emotional-ties-between-depression-and-autism/​ [accessed 2024-01-12]
  6. Washington P, Wall DP. A Review of and roadmap for data science and machine learning for the neuropsychiatric phenotype of autism. Annu Rev Biomed Data Sci. 2023;6:211-228.
  7. Coppersmith G, Dredze M, Harman C, Hollingshead K. From ADHD to SAD: analyzing the language of mental health on Twitter through self-reported diagnoses. 2015. Presented at: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality; June 5, 2015:1-10; Denver, CO.
  8. Tausczik YR, Pennebaker JW. The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol. 2009;29(1):24-54.
  9. Hswen Y, Naslund JA, Brownstein JS, Hawkins JB. Online communication about depression and anxiety among Twitter users with schizophrenia: preliminary findings to inform a digital phenotype using social media. Psychiatr Q. 2018;89(3):569-580.
  10. Mowery D, Bryan C, Conway M. Feature studies to inform the classification of depressive symptoms from Twitter data for population health. arXiv.
  11. De Choudhury M, Counts S, Horvitz E. Predicting postpartum changes in emotion and behavior via social media. 2013. Presented at: CHI '13: CHI Conference on Human Factors in Computing Systems; April 27-May 2, 2013:3267-3276; Paris, France.
  12. De Choudhury M, Gamon M, Counts S, Horvitz E. Predicting depression via social media. 2021. Presented at: The Seventh International AAAI Conference on Weblogs and Social Media (ICWSM-13); July 8-11, 2013:128-137; Cambridge, MA.
  13. Nadeem M. Identifying depression on Twitter. arXiv.
  14. Robinson B, Power R, Cameron M. An evidence based earthquake detector using Twitter. 2013. Presented at: Proceedings of the Workshop on Language Processing and Crisis Information 2013; October 14, 2013:1-9; Nagoya, Japan.
  15. Sakaki T, Okazaki M. Earthquake shakes Twitter users: real-time event detection by social sensors. 2010. Presented at: WWW '10: The 19th International World Wide Web Conference; April 26-30, 2010:851-860; Raleigh, NC.
  16. Chew C, Eysenbach G. Pandemics in the age of Twitter: content analysis of tweets during the 2009 H1N1 outbreak. PLoS One. 2010;5(11):e14118.
  17. Prieto VM, Matos S, Álvarez M, Cacheda F, Oliveira JL. Twitter: a good place to detect health conditions. PLoS One. 2014;9(1):e86191.
  18. Signorini A, Segre AM, Polgreen PM. The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza a H1N1 pandemic. PLoS One. 2011;6(5):e19467.
  19. Kim EHJ, Jeong YK, Kim Y, Kang KY, Song M. Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news. J Inf Sci. 2016;42(6):763-781.
  20. Culotta A. Towards detecting influenza epidemics by analyzing Twitter messages. 2010. Presented at: KDD '10: The 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; July 25-28, 2010; Washington, DC.
  21. Aramaki E, Maskawa S, Morita M. Twitter catches the Flu: detecting influenza epidemics using Twitter. 2011. Presented at: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing; July 27–31, 2011; Edinburgh, Scotland.
  22. Ye J. Pediatric mental and behavioral health in the period of quarantine and social distancing with COVID-19. JMIR Pediatr Parent. 2020;3(2):e19867.
  23. Gupta V, Jain N, Katariya P, Kumar A, Mohan S, Ahmadian A, et al. An emotion care model using multimodal textual analysis on COVID-19. Chaos Solitons Fractals. 2021;144:110708.
  24. Campbell S. How many people use Twitter in 2023? (Twitter statistics). The Small Business Blog. 2023. URL: [accessed 2024-01-12]
  25. Washington P, Park N, Srivastava P, Voss C, Kline A, Varma M, et al. Data-driven diagnostics and the potential of mobile artificial intelligence for digital therapeutic phenotyping in computational psychiatry. Biol Psychiatry Cogn Neurosci Neuroimaging. 2020;5(8):759-769.
  26. Chi NA, Washington P, Kline A, Husic A, Hou C, He C, et al. Classifying autism from crowdsourced semistructured speech recordings: machine learning model comparison study. JMIR Pediatr Parent. 2022;5(2):e35406.
  27. Lakkapragada A, Kline A, Mutlu OC, Paskov K, Chrisman B, Stockham N, et al. The classification of abnormal hand movement to aid in autism detection: machine learning study. JMIR Biomed Eng. 2022;7(1):e33771.
  28. Wu C, Liaqat S, Helvaci H, Cheung SCS, Chuah CN, Ozonoff S, et al. Machine learning based autism spectrum disorder detection from videos. In: Healthcom. 2021. Presented at: IEEE International Conference on E-health Networking, Application & Services (HEALTHCOM); March 01-02, 2021; Shenzhen, China.
  29. Washington P, Kalantarian H, Kent J, Husic A, Kline A, Leblanc E, et al. Improved digital therapy for developmental pediatrics using domain-specific artificial intelligence: machine learning study. JMIR Pediatr Parent. 2022;5(2):e26760.
  30. Tariq Q, Daniels J, Schwartz JN, Washington P, Kalantarian H, Wall DP. Mobile detection of autism through machine learning on home video: a development and prospective validation study. PLoS Med. 2018;15(11):e1002705.
  31. Varma M, Washington P, Chrisman B, Kline A, Leblanc E, Paskov K, et al. Identification of social engagement indicators associated with autism spectrum disorder using a game-based mobile app: comparative study of gaze fixation and visual scanning methods. J Med Internet Res. 2022;24(2):e31830.
  32. Banerjee A, Mutlu OC, Kline A, Surabhi S, Washington P, Wall DP. Training and profiling a pediatric facial expression classifier for children on mobile devices: machine learning study. JMIR Form Res. 2023;7:e39917.
  33. Anzulewicz A, Sobota K, Delafield-Butt JT. Toward the autism motor signature: gesture patterns during smart tablet gameplay identify children with autism. Sci Rep. 2016;6:31107.
  34. Alotaibi N, Maharatna K. Classification of autism spectrum disorder from eeg-based functional brain connectivity analysis. Neural Comput. 2021;33(7):1914-1941.
  35. Newton AT, Kramer ADI, McIntosh DN. Autism online: a comparison of word usage in bloggers with and without autism spectrum disorders. 2009. Presented at: CHI '09: CHI Conference on Human Factors in Computing Systems; April 4-9, 2009; Boston, MA.
  36. Nguyen T, Duong T, Phung D, Venkatesh S. Affective, linguistic and topic patterns in online autism communities. In: Web Information Systems Engineering – WISE 2014. 2014. Presented at: 22nd International Conference on Web Information Systems Engineering, WISE 2021; October 26–29, 2021:474-488; Melbourne, Australia.
  37. van Schalkwyk GI, Marin CE, Ortiz M, Rolison M, Qayyum Z, McPartland JC, et al. Social media use, friendship quality, and the moderating role of anxiety in adolescents with autism spectrum disorder. J Autism Dev Disord. 2017;47(9):2805-2813.
  38. Bakombo S, Ewalefo P, Konkle ATM. The influence of social media on the perception of autism spectrum disorders: content analysis of public discourse on YouTube videos. Int J Environ Res Public Health. 2023;20(4):3246.
  39. Hswen Y, Gopaluni A, Brownstein JS, Hawkins JB. Using Twitter to detect psychological characteristics of self-identified persons with autism spectrum disorder: a feasibility study. JMIR Mhealth Uhealth. 2019;7(2):e12264.
  40. Corti L, Zanetti M, Tricella G, Bonati M. Social media analysis of Twitter tweets related to ASD in 2019-2020, with particular attention to COVID-19: topic modelling and sentiment analysis. J Big Data. 2022;9(1):113.
  41. Beykikhoshk A, Arandjelović O, Phung D, Venkatesh S, Caelli T. Using Twitter to learn about the autism community. Soc Netw Anal Min. 2015;5(1):1-17.
  42. Shing HC, Nair S, Zirikly A, Friedenberg M, Daumé H, Resnik P. Expert, crowdsourced, and machine assessment of suicide risk via online postings. New Orleans, LA. Association for Computational Linguistics; 2018. Presented at: Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic; June 5, 2018:25-36; New Orleans, LA.
  43. Garg S, Taylor J, El Sherief M, Kasson E, Aledavood T, Riordan R, et al. Detecting risk level in individuals misusing fentanyl utilizing posts from an online community on Reddit. Internet Interv. 2021;26:100467.
  44. Tadesse MM, Lin H, Xu B, Yang L. Detection of depression-related posts in reddit social media forum. IEEE Access. 2019;7:44883-44893.
  45. Bellon-Harn ML, Boyd RL, Manchaiah V. Applied behavior analysis as treatment for autism spectrum disorders: topic modeling and linguistic analysis of reddit posts. Front Rehabil Sci. 2021;2:682533.
  46. Schwartz HA, Eichstaedt J, Kern ML, Park G, Sap M, Stillwell D, et al. Towards assessing changes in degree of depression through Facebook. Association for Computational Linguistics; 2014. Presented at: Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality; June 27, 2014:118-125; Baltimore, MD.
  47. Reece AG, Danforth CM. Instagram photos reveal predictive markers of depression. EPJ Data Sci. 2017;6(1):15.
  48. Hassanpour S, Tomita N, DeLise T, Crosier B, Marsch LA. Identifying substance use risk based on deep neural networks and Instagram social media data. Neuropsychopharmacology. 2019;44(3):487-494.
  49. Yang Y, Jia J, Zhang S, Wu B, Chen Q, Li J, et al. How do your friends on social media disclose your emotions? 2014. Presented at: The Twenty-Eighth AAAI Conference on Artificial Intelligence; July 27–31, 2014; Québec City, QC.
  50. Lin H, Jia J, Guo Q, Xue Y, Li Q, Huang J, et al. User-level psychological stress detection from social media using deep neural network. 2014. Presented at: MM '14: 2014 ACM Multimedia Conference; November 3-7, 2014; Orlando, FL.
  51. Guberman J. #ActuallyAutistic Twitter as a site for epistemic resistance and crip futurity. ACM Trans Comput-Hum Interact. 2023;30(3):1-34.
  52. Zolyomi A, Ridley J, Tomer K. #ActuallyAutistic sense-making on Twitter. 2020. Presented at: ASSETS '20: The 22nd International ACM SIGACCESS Conference on Computers and Accessibility; October 26-28, 2020:1-4; Virtual event, Greece.
  53. JustAnotherArchivist/snscrape. GitHub. URL: [accessed 2024-01-12]
  54. Brigadir I. Advanced search on Twitter. GitHub. URL: [accessed 2024-01-12]
  55. Nguyen S. snguyenthanh/better_profanity. GitHub. URL: [accessed 2024-01-12]
  56. Hutto C, Gilbert E. VADER: a parsimonious rule-based model for sentiment analysis of social media text. 2014. Presented at: Proceedings of the International AAAI Conference on Web and Social Media; June 5–8, 2023:216-225; Limassol, Cyprus.
  57. Angelov D. Top2Vec: distributed representations of topics. arXiv.
  58. Benton A, Coppersmith G, Dredze M. Ethical research protocols for social media health research. 2017. Presented at: Proceedings of the First Workshop on Ethics in Natural Language Processing; April 4, 2017:94-102; Valencia, Spain.
  59. Fiesler C, Proferes N. “Participant” perceptions of Twitter research ethics. Soc Media Soc. 2018;4(1):2056305118763366.
  60. Gold N. Using Twitter Data in Research Guidance for Researchers and Ethics Reviewers. 2020. URL: [accessed 2024-01-12]
  61. Williams ML, Burnap P, Sloan L. Towards an ethical framework for publishing Twitter data in social research: taking into account users' views, online context and algorithmic estimation. Sociology. 2017;51(6):1149-1168.
  62. Koteyko N, van Driel M, Vines J. Autistic sociality on Twitter: enacted affordances and affiliation strategies. Discourse Commun. 2022;16(4):385-402.
  63. Chatterjee R, Fears NE, Lichtenberg G, Tamplain PM, Miller HL. Identity and discourse among #ActuallyAutistic Twitter users with motor differences. J Mot Learn Dev. 2023;11(3):525-540.
  64. Carlisle GK, Johnson RA, Wang Z, Bibbo J, Cheak-Zamora N, Lyons LA. Exploratory study of cat adoption in families of children with autism: impact on children's social skills and anxiety. J Pediatr Nurs. 2021;58:28-35.
  65. Hart LA, Thigpen AP, Willits NH, Lyons LA, Hertz-Picciotto I, Hart BL. Affectionate interactions of cats with children having autism spectrum disorder. Front Vet Sci. 2018;5:39.
  66. Daniels J, Schwartz JN, Voss C, Haber N, Fazel A, Kline A, et al. Exploratory study examining the at-home feasibility of a wearable tool for social-affective learning in children with autism. NPJ Digit Med. 2018;1:32.
  67. Kalantarian H, Washington P, Schwartz J, Daniels J, Haber N, Wall DP. Guess what?: towards understanding autism from structured video using facial affect. J Healthc Inform Res. 2019;3(1):43-66.
  68. Kalantarian H, Jedoui K, Washington P, Wall DP. A mobile game for automatic emotion-labeling of images. IEEE Trans Games. 2020;12(2):213-218.
  69. Kalantarian H, Jedoui K, Washington P, Tariq Q, Dunlap K, Schwartz J, et al. Labeling images with facial emotion and the potential for pediatric healthcare. Artif Intell Med. 2019;98:77-86.
  70. Kalantarian H, Jedoui K, Dunlap K, Schwartz J, Washington P, Husic A, et al. The performance of emotion classifiers for children with parent-reported autism: quantitative feasibility study. JMIR Ment Health. 2020;7(4):e13174.
  71. Kalantarian H, Washington P, Schwartz J, Daniels J, Haber N, Wall D. A gamified mobile system for crowdsourcing video for autism research. 2018. Presented at: IEEE International Conference on Healthcare Informatics (ICHI); June 04-07, 2018; New York, NY.
  72. Kline A, Voss C, Washington P, Haber N, Schwartz H, Tariq Q, et al. Superpower glass. GetMobile: Mobile Comp Comm. 2019;23(2):35-38.
  73. Washington P, Voss C, Kline A, Haber N, Daniels J, Fazel A, et al. SuperpowerGlass: a wearable aid for the at-home therapy of children with autism. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2017;1(3):1-22.
  74. Voss C, Krüger A. Superpower glass. 2016. Presented at: UbiComp '16: The 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing; September 12-16, 2016; Heidelberg, Germany.
  75. Voss C, Schwartz J, Daniels J, Kline A, Haber N, Washington P, et al. Effect of wearable digital intervention for improving socialization in children with autism spectrum disorder: a randomized clinical trial. JAMA Pediatr. 2019;173(5):446-454.
  76. Drimalla H, Scheffer T, Landwehr N, Baskow I, Roepke S, Behnia B, et al. Towards the automatic detection of social biomarkers in autism spectrum disorder: introducing the Simulated Interaction Task (SIT). NPJ Digit Med. 2020;3:25.
  77. Ahmed IA, Senan EM, Rassem TH, Ali MAH, Shatnawi HSA, Alwazer SM, et al. Eye tracking-based diagnosis and early detection of autism spectrum disorder using machine learning and deep learning techniques. Electronics. 2022;11(4):530.
  78. Alcañiz M, Chicchi-Giglioli IA, Carrasco-Ribelles LA, Marín-Morales J, Minissi ME, Teruel-García G, et al. Eye gaze as a biomarker in the recognition of autism spectrum disorder using virtual reality and machine learning: a proof of concept for diagnosis. Autism Res. 2022;15(1):131-145.
  79. Notice of Special Interest (NOSI): computational and statistical methods to enhance discovery from health data. National Institutes of Health. URL: [accessed 2024-01-12]
  80. Department of health and human services. National Institutes of Health. URL: [accessed 2024-01-12]
  81. Sloan L. Who Tweets in the United Kingdom? Profiling the Twitter population using the British social attitudes survey 2015. Soc Media Soc. 2017;3(1):2056305117698981.
  82. Creating safe AGI that benefits all of humanity. OpenAI. URL: [accessed 2024-01-12]

Abbreviations

AUC: area under the receiver operating characteristic curve
Bi-LSTM: bidirectional long short-term memory
TF-IDF: term frequency–inverse document frequency
VADER: Valence Aware Dictionary for Sentiment Reasoning
XGBoost: extreme gradient boosting

Edited by A Mavragani; submitted 11.09.23; peer-reviewed by N Di Cara, M Chatzimina; comments to author 03.11.23; revised version received 19.11.23; accepted 10.12.23; published 14.02.24.


©Aditi Jaiswal, Peter Washington. Originally published in JMIR Formative Research, 14.02.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.