Accepted for/Published in: JMIR Medical Education

  • Rongxi P, Weiru W, Xiaoli Z, Zhong Y, Youcai D
  • Evaluating Large Language Models' Capabilities in Laboratory Hematology: A Case Study Based on the Chinese Medical Laboratory Technician Examination
  • JMIR Medical Education
  • DOI: 10.2196/11848
  • PMID: 30303485
  • PMCID: 6352016

Evaluating Large Language Models' Capabilities in Laboratory Hematology: A Case Study Based on the Chinese Medical Laboratory Technician Examination

Abstract

Background

Large language models (LLMs) have shown considerable promise in the medical field, yet their application in specialized areas such as laboratory hematology remains underexplored.

Objective

This study aims to evaluate the performance of two prominent LLMs, GPT-4o and Kimi, in laboratory hematology and explore their potential applications in Chinese medical education.

Methods

We selected 400 laboratory hematology questions from the Chinese Medical Laboratory Technician Examination (2015-2022), covering four subjects: basic knowledge, related professional knowledge, professional knowledge, and professional practice ability. GPT-4o and Kimi were tested on these questions combined with appropriate prompts, and each question was administered twice independently. Accuracy was assessed by comparing the models' responses with the standard answers, consistency was assessed across the two administrations, and the results were then analyzed statistically.
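The two metrics described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: accuracy is taken as the share of all responses (both administrations pooled) that match the standard answer, and consistency as the share of questions that received the same answer in both administrations. The names `QuestionResult`, `accuracy`, and `consistency` are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class QuestionResult:
    """One question: the standard answer and the model's two responses."""
    standard: str
    run1: str
    run2: str


def accuracy(results: list[QuestionResult]) -> float:
    """Assumed definition: correct responses / all responses, pooling both runs."""
    if not results:
        raise ValueError("results must not be empty")
    correct = sum(
        (r.run1 == r.standard) + (r.run2 == r.standard) for r in results
    )
    return correct / (2 * len(results))


def consistency(results: list[QuestionResult]) -> float:
    """Assumed definition: questions answered identically in both runs / all questions."""
    if not results:
        raise ValueError("results must not be empty")
    same = sum(r.run1 == r.run2 for r in results)
    return same / len(results)


# Toy data for illustration only; these are not results from the study.
toy = [
    QuestionResult("A", "A", "A"),  # correct twice, consistent
    QuestionResult("B", "B", "C"),  # correct once, inconsistent
    QuestionResult("C", "D", "D"),  # wrong twice, but consistent
    QuestionResult("D", "D", "D"),  # correct twice, consistent
]
print(f"accuracy={accuracy(toy):.3f}, consistency={consistency(toy):.3f}")
```

Under these definitions a model can be consistent and still wrong, as the third toy question shows, which is why the two figures are reported separately.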

Results

GPT-4o and Kimi achieved overall accuracy rates of 87.9% and 72.8%, respectively, with response consistencies of 93.0% and 83.5%. Both models demonstrated relatively weaker performance in the professional knowledge subject and in specific areas such as erythrocyte disorders and normal hematopoiesis. GPT-4o consistently outperformed Kimi across all evaluated aspects.

Conclusions

LLMs exhibit strong performance in laboratory hematology, despite certain limitations. These findings provide empirical evidence supporting the potential application of LLMs in Chinese medical education and highlight areas for future optimization and research.
