TY  - JOUR
AU  - Joranger, Pål
AU  - Rivenes Lafontan, Sara
AU  - Brevik, Asgeir
PY  - 2025
DA  - 2025/7/3
TI  - Evaluating a Large Language Model’s Ability to Synthesize a Health Science Master’s Thesis: Case Study
JO  - JMIR Form Res
SP  - e73248
VL  - 9
KW  - master’s thesis
KW  - large language model
KW  - LLM
KW  - ChatGPT
KW  - health science
KW  - qualitative
KW  - quantitative
AB  - Background: Large language models (LLMs) can help students master a new topic quickly, but for the educational institutions responsible for assessing and grading the academic level of students, it can be difficult to discern whether a text originated from a student’s own cognition or was synthesized by an LLM. Universities have traditionally relied on a submitted written thesis as proof of higher-level learning, on which to grant grades and diplomas. But what happens when LLMs are able to mimic the academic writing of subject matter experts? This is now a real dilemma. The ubiquitous availability of LLMs challenges trust in the master’s thesis as evidence of subject matter comprehension and academic competence. Objective: In this study, we aimed to assess the quality of rapidly machine-generated papers against the standards of the health science master’s program we are currently affiliated with. Methods: In an exploratory case study, we used ChatGPT (OpenAI) to generate 2 research papers as conceivable student submissions for graduation from a health science master’s program. One paper simulated a qualitative health science research project and the other simulated a quantitative health science research project. Results: Using a stepwise approach, we prompted ChatGPT to (1) synthesize 2 credible datasets and (2) generate 2 papers that, in our judgment, would have been able to pass as credible medium-quality graduation research papers at the health science master’s program the authors are currently affiliated with. It took 2.5 hours of iterative dialogue with ChatGPT to develop the qualitative paper and 3.5 hours to develop the quantitative paper. Creating the synthetic datasets that served as a starting point for our ChatGPT-driven paper development took 1.5 and 16 hours for the qualitative and quantitative datasets, respectively. This included learning and prompt optimization, and for the quantitative dataset, it included the time required to create tables, estimate relevant bivariate correlation coefficients, and prepare these coefficients to be read by ChatGPT. Conclusions: Our demonstration highlights the ease with which an LLM can synthesize research data, conduct scientific analyses, and produce the credible research papers required for graduation from a master’s program. A clear and well-written master’s thesis, citing subject matter authorities and true to the expectations of academic writing, can no longer be regarded as solid proof of either extensive study or subject matter mastery. To uphold the integrity of academic standards and the value of university diplomas, we recommend that master’s programs prioritize oral examinations and school exams. This shift is now crucial to ensure a fair and rigorous assessment of higher-order learning and abilities at the master’s level.
SN  - 2561-326X
UR  - https://formative.jmir.org/2025/1/e73248
UR  - https://doi.org/10.2196/73248
DO  - 10.2196/73248
ID  - info:doi/10.2196/73248
ER  - 