TY - JOUR
AU - Pemu, Priscilla
AU - Prude, Michael
AU - McCaslin, Atuarra
AU - Ojemakinde, Elizabeth
AU - Awad, Christopher
AU - Igwe, Kelechi
AU - Rodriguez, Anny
AU - Foriest, Jasmine
AU - Idris, Muhammed
PY - 2025
DA - 2025/7/17
TI - Detecting Conversation Topics in Recruitment Calls of African American Participants to the All of Us Research Program Using Machine Learning: Model Development and Validation Study
JO - JMIR Form Res
SP - e65320
VL - 9
KW - recruitment
KW - clinical trials
KW - diversity
KW - trust
KW - precision medicine
AB - Background: Advancements in science and technology can exacerbate health disparities, particularly when there is a lack of diversity in clinical research, which limits the benefits of innovations for underrepresented communities. Programs like the All of Us Research Program (AoURP) are actively working to address this issue by ensuring that underrepresented populations are represented in biomedical research, promoting equitable participation, and advancing health outcomes for all. African American communities have been particularly underrepresented in clinical research, often due to historical instances of research misconduct, such as the Tuskegee Syphilis Study, which have deeply impacted trust and willingness to participate in research studies. With the US population becoming increasingly diverse, it is crucial that clinical research studies reflect this diversity to improve health outcomes. However, limited data and small sample sizes in qualitative studies on the inclusion of underrepresented groups hinder progress in this area. Objective: The goal of this paper is to analyze recruitment conversations between research assistants (RAs) and potential participants in the AoURP to identify key topics that influence enrollment. By examining these interactions, we aim to provide insights that can improve engagement strategies and recruitment practices for underrepresented groups in biomedical research.
Methods: We used an observational, retrospective study design with machine learning for content analysis. Specifically, we used structural topic modeling to identify and compare latent topics of conversation in recruitment calls by Morehouse School of Medicine RAs between February 2021 and April 2022 by estimating expected topic proportions in the corpus as a function of enrollment and participation in AoURP. Results: In total, our model estimated 45 topics, of which 12 were identified as coherent. Notable topics that were more likely to occur in conversations between RAs and participants who enrolled and participated include closing or following up to schedule an appointment, COVID-19 protocols for in-person visits, explaining precision medicine and the need for representation, and working through objections, including concerns about costs, insurance, care changes, and health fears. Topics among potential participants who did not enroll include technical challenges and describing physical measurement visits (eg, collection of basic physical data, such as height, weight, and blood pressure). Conclusions: Using an approach that leverages machine learning to identify topical structure and themes with limited human subjectivity is a promising strategy to identify gaps in, and opportunities to improve, the recruitment of underserved communities into clinical trials.
SN - 2561-326X
UR - https://formative.jmir.org/2025/1/e65320
UR - https://doi.org/10.2196/65320
DO - 10.2196/65320
ID - info:doi/10.2196/65320
ER -