Search Results (1 to 10 of 155 Results)

Evaluation of the Accuracy, Usability, and User Perspectives of the Ecological Momentary Dietary Assessment App Traqq Among Dutch Adolescents: Protocol for a Mixed Methods Study

The questionnaire, adapted from previous studies, assessed ease-of-use, convenience, reporting burden, perceived accuracy, likelihood of future use, and overall experience [36,37]. It included a question about parental assistance in home-cooked meals and app use, with responses recorded on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree).

Lieke L E Kennes, Desiree A Lucassen, Anouk M M Vaes, Annemarie Wagemakers, Indre Kalinauskaite, Edith J M Feskens, Elske M Brouwer-Brolsma

JMIR Res Protoc 2025;14:e70194


Performance of the Large Language Models on the Chinese National Nurse Licensure Examination: Cross-Sectional Evaluation Study

The results are organized by the primary performance dimensions assessed: accuracy across different question types and exam sections, repeatability between prompting strategies, confidence calibration, and robustness under adversarial conditions. The accuracy rates of the 4 LLMs were evaluated through 2 independent attempts, with the results presented in Table 1. Overall, Gemini 2.0 Pro and DeepSeek V3 demonstrated significant advantages, with overall accuracy rates exceeding 83% in both attempts.

Longhui Xu, Xiao Cong, Renxiu Wang, Na Li, Xinru Liu, Ronghui Wang, Cuiping Xu

JMIR Med Inform 2025;13:e78279


Token Probabilities to Mitigate Large Language Models Overconfidence in Answering Medical Questions: Quantitative Study

True positive rate, true negative rate, and accuracy rates above and below the optimal discrimination threshold were estimated with 95% CIs and compared using McNemar tests.

Raphaël Bentegeac, Bastien Le Guellec, Grégory Kuchcinski, Philippe Amouyel, Aghiles Hamroun

J Med Internet Res 2025;27:e64348


Performance of Open-Source Large Language Models in Psychiatry: Usability Study Through Comparative Analysis of Non-English Records and English Translations

Diagnostic accuracy was also evaluated by comparing the ground truth with diagnostic impressions provided by the model. Top-1 and top-2 diagnostic accuracy were calculated for both the Korean and English versions of the psychiatric notes. To further examine whether translation errors affected diagnostic performance, we divided the 200 translated notes into two groups based on translation quality.

Min-Gyu Kim, Gyubeom Hwang, Junhyuk Chang, Seheon Chang, Hyun Woong Roh, Rae Woong Park

J Med Internet Res 2025;27:e69857


Automatic Image Recognition Meal Reporting Among Young Adults: Randomized Controlled Trial

While the existing design was shown to be positive in terms of accuracy and was generally well received by users, there were concerns regarding the accuracy and time-consuming nature of completing meal reporting for an entire meal. Furthermore, in authentic dietary intake scenarios, voice reporting during meal consumption was not always convenient. Consequently, we developed the latest version to enhance the existing design.

Prasan Kumar Sahoo, Sherry Yueh-Hsia Chiu, Yu-Sheng Lin, Chien-Hung Chen, Denisa Irianti, Hsin-Yun Chen, Mekhla Sarkar, Ying-Chieh Liu

JMIR Mhealth Uhealth 2025;13:e60070


Deep Learning Multi-Modal Melanoma Detection: Algorithm Development and Validation

The best-performing model was a combination of ResNet-50 and Inception V3, with an accuracy of 80%. Most of these approaches aim to optimize models through transfer learning and various preprocessing techniques in an attempt to increase accuracy.

Nithika Vivek, Karthik Ramesh

JMIR AI 2025;4:e66561


Proposal for Using AI to Assess Clinical Data Integrity and Generate Metadata: Algorithm Development and Validation

XGB achieved the highest overall performance, with an accuracy of 84.7% and an AUC-ROC score of 84.6%. Its F1-score of 84.0% and precision of 83.9% demonstrate its ability to consistently deliver high-accuracy predictions while minimizing false positives. The SVM achieved an accuracy of 73.0%, comparable to that of LR, but it demonstrated an improvement in the AUC-ROC score of 65.7%. Its F1-score of 67.1% reflects a slight enhancement in predictive balance.

Caroline Bönisch, Christian Schmidt, Dorothea Kesztyüs, Hans A Kestler, Tibor Kesztyüs

JMIR Med Inform 2025;13:e60204


Challenges in Implementing Artificial Intelligence in Breast Cancer Screening Programs: Systematic Review and Framework for Safe Adoption

Artificial intelligence (AI) presents a solution by automating and streamlining these processes, potentially augmenting both efficiency and accuracy. However, the adoption of AI in breast cancer screening is not without challenges. Although there are over 20 Food and Drug Administration (FDA)–approved AI applications for breast imaging, their adoption and utilization in clinical settings remain highly variable and generally low [6].

Serene Goh, Rachel Sze Jen Goh, Bryan Chong, Qin Xiang Ng, Gerald Choon Huat Koh, Kee Yuan Ngiam, Mikael Hartman

J Med Internet Res 2025;27:e62941