TY  - JOUR
AU  - Azizi, Mehrnoosh
AU  - Jamali, Ali Akbar
AU  - Spiteri, Raymond J
PY  - 2024
DA  - 2024/6/4
TI  - Identifying X (Formerly Twitter) Posts Relevant to Dementia and COVID-19: Machine Learning Approach
JO  - JMIR Form Res
SP  - e49562
VL  - 8
KW  - machine learning
KW  - dementia
KW  - Alzheimer disease
KW  - COVID-19
KW  - X (Twitter)
KW  - natural language processing
AB  - Background: During the pandemic, patients with dementia were identified as a vulnerable population. X (formerly Twitter) became an important source of information for people seeking updates on COVID-19, and, therefore, identifying posts (formerly tweets) relevant to dementia can be an important support for patients with dementia and their caregivers. However, mining and coding relevant posts can be daunting due to the sheer volume and high percentage of irrelevant posts. Objective: The objective of this study was to automate the identification of posts relevant to dementia and COVID-19 using natural language processing and machine learning (ML) algorithms. Methods: We used a combination of natural language processing and ML algorithms with manually annotated posts to identify posts relevant to dementia and COVID-19. We used 3 data sets containing more than 100,000 posts and assessed the capability of various algorithms in correctly identifying relevant posts. Results: Our results showed that (pretrained) transfer learning algorithms outperformed traditional ML algorithms in identifying posts relevant to dementia and COVID-19. Among the algorithms tested, the transfer learning algorithm A Lite Bidirectional Encoder Representations from Transformers (ALBERT) achieved an accuracy of 82.92% and an area under the curve of 83.53%. ALBERT substantially outperformed the other algorithms tested, further emphasizing the superior performance of transfer learning algorithms in the classification of posts. Conclusions: Transfer learning algorithms such as ALBERT are highly effective in identifying topic-specific posts, even when trained with limited or adjacent data, highlighting their superiority over other ML algorithms and applicability to other studies involving analysis of social media posts. Such an automated approach reduces the workload of manual coding of posts and facilitates their analysis for researchers and policy makers to support patients with dementia and their caregivers and other vulnerable populations.
SN  - 2561-326X
UR  - https://formative.jmir.org/2024/1/e49562
UR  - https://doi.org/10.2196/49562
UR  - http://www.ncbi.nlm.nih.gov/pubmed/38833288
DO  - 10.2196/49562
ID  - info:doi/10.2196/49562
ER  - 