TY  - JOUR
AU  - Chen, Donghao
AU  - Wang, Pengfei
AU  - Zhang, Xiaolong
AU  - Qiao, Runqi
AU  - Li, Nanxi
AU  - Zhang, Xiaodong
AU  - Zhang, Honggang
AU  - Wang, Gang
PY  - 2025
DA  - 2025/5/30
TI  - Comparative Efficacy of MultiModal AI Methods in Screening for Major Depressive Disorder: Machine Learning Model Development Predictive Pilot Study
JO  - JMIR Form Res
SP  - e56057
VL  - 9
KW  - major depressive disorder
KW  - artificial intelligence
KW  - computational psychiatry
KW  - facial action unit
KW  - multimodal analysis
KW  - multiparadigm analysis
KW  - MDD
AB  - Background: Conventional approaches for major depressive disorder (MDD) screening rely on two effective but subjective paradigms: self-rated scales and clinical interviews. Artificial intelligence (AI) can potentially contribute to psychiatry, especially through the use of objective data such as objective audiovisual signals. Objective: This study aimed to evaluate the efficacy of different paradigms using AI analysis on audiovisual signals. Methods: We recruited 89 participants (mean age, 37.1 years; male: 30/89, 33.7%; female: 59/89, 66.3%), including 41 patients with MDD and 48 asymptomatic participants. We developed AI models using facial movement, acoustic, and text features extracted from videos obtained via a tool, incorporating four paradigms: conventional scale (CS), question and answering (Q&A), mental imagery description (MID), and video watching (VW). Ablation experiments and 5-fold cross-validation were performed using two AI methods to ascertain the efficacy of paradigm combinations. Attention scores from the deep learning model were calculated and compared with correlation results to assess comprehensibility. Results: In video clip-based analyses, Q&A outperformed MID with a mean binary sensitivity of 79.06% (95% CI 77.06%‐83.35%; P=.03) and an effect size of 1.0. Among individuals, the combination of Q&A and MID outperformed MID alone with a mean extent accuracy of 80.00% (95% CI 65.88%‐88.24%; P=.01) and an effect size of 0.61. The mean binary accuracy exceeded 76.25% for video clip predictions and 74.12% for individual-level predictions across the two AI methods, with top individual binary accuracy of 94.12%. The features exhibiting high attention scores demonstrated a significant overlap with those that were statistically correlated, including 18 features (all Ps<.05), while also aligning with established nonverbal markers. Conclusions: The Q&A paradigm demonstrated higher efficacy than MID, both individually and in combination. Using AI to analyze audiovisual signals across multiple paradigms has the potential to be an effective tool for MDD screening.
SN  - 2561-326X
UR  - https://formative.jmir.org/2025/1/e56057
UR  - https://doi.org/10.2196/56057
DO  - 10.2196/56057
ID  - info:doi/10.2196/56057
ER  -