Search Articles

Search Results (1 to 9 of 9 Results)

Enhancing Bidirectional Encoder Representations From Transformers (BERT) With Frame Semantics to Extract Clinically Relevant Information From German Mammography Reports: Algorithm Development and Validation

The initialization phase resulted in an initial fact schema and corresponding annotation guideline, documented in a web-based, versioned collaboration platform (Confluence, Atlassian). For this phase, the annotation guideline development dataset was used. The goal of the next phase, quality improvement, was to iteratively improve the fact schema and systematically revise the annotation guideline.
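
A fact schema of this kind is essentially a structured template that each annotated report instantiates. As a purely illustrative sketch (the class, fact type, and attribute names below are hypothetical, not the schema from the study), in Python:

    # Hypothetical sketch of a fact-schema entry for a mammography report;
    # the fact type and attribute names are illustrative, not the study's schema.
    from dataclasses import dataclass, field

    @dataclass
    class Fact:
        fact_type: str                  # e.g., "Lesion"
        anchor_text: str                # report span the fact is anchored to
        attributes: dict = field(default_factory=dict)

    lesion = Fact("Lesion", "irregularly marginated mass",
                  {"laterality": "left", "quadrant": "upper outer"})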

Daniel Reichenpfader, Jonas Knupp, Sandro Urs von Däniken, Roberto Gaio, Fabio Dennstädt, Grazia Maria Cereghetti, André Sander, Hans Hiltbrunner, Knud Nairz, Kerstin Denecke

J Med Internet Res 2025;27:e68427

Is Boundary Annotation Necessary? Evaluating Boundary-Free Approaches to Improve Clinical Named Entity Annotation Efficiency: Case Study

Our goal is to evaluate whether relaxing the emphasis on entity boundaries improves annotation speed while maintaining the overall quality of the produced labels. Thus, we compared the traditional (boundary-strict) annotation method against 2 proposed boundary-free approaches: lenient span and point annotation. Figure 1 presents a comparative example of each annotation method. Traditional annotation requires precise annotation of each named entity's (NE's) exact start and end positions.
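
The three schemes differ only in how much positional information an annotator must supply. A minimal sketch of how each could be encoded (the sentence, offsets, and label are illustrative, not the study's data):

    # Illustrative encodings of the three annotation schemes for the sentence
    # below; offsets are character positions.
    text = "Patient reports severe chest pain."

    # Boundary-strict: exact start/end of the entity "severe chest pain"
    strict = {"start": 16, "end": 33, "label": "Symptom"}

    # Lenient span: any span overlapping the entity is accepted, e.g., "chest pain"
    lenient = {"start": 23, "end": 33, "label": "Symptom"}

    # Point annotation: a single offset anywhere inside the entity
    point = {"offset": 25, "label": "Symptom"}

    assert text[strict["start"]:strict["end"]] == "severe chest pain"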

Gabriel Herman Bernardim Andrade, Shuntaro Yada, Eiji Aramaki

JMIR Med Inform 2024;12:e59680

Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study

Indeed, one study of NER annotation speed found that experts can take between 10 and 30 seconds per sentence to annotate named entities [8]. The gold-standard annotated BioSemantics corpus comprises 163,219 sentences, which implies a best-case annotation time of over 11 weeks at 40 hours per week (453.39 h) [20].
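
The estimate follows directly from the lower bound of that range; a quick check of the arithmetic in Python:

    # Reproducing the back-of-envelope annotation-time estimate.
    sentences = 163_219
    secs_per_sentence = 10          # lower bound of the 10-30 s/sentence range
    hours = sentences * secs_per_sentence / 3600
    weeks = hours / 40              # 40-hour work weeks
    print(f"{hours:.2f} h ~= {weeks:.1f} weeks")  # 453.39 h ~= 11.3 weeks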

Zoltan P Majdik, S Scott Graham, Jade C Shiva Edward, Sabrina N Rodriguez, Martha S Karnes, Jared T Jensen, Joshua B Barbour, Justin F Rousseau

JMIR AI 2024;3:e52095

Using Large Language Models to Support Content Analysis: A Case Study of ChatGPT for Adverse Event Detection

Biomedical text analysis is commonly burdened by the need for manual data review and annotation, which is costly and time-consuming. Artificial intelligence (AI) tools, including large language models (LLMs) such as ChatGPT (OpenAI) [1], could reduce this burden by allowing scientists to leverage vast amounts of text data (including medical records and public data) with short written prompts as annotation instructions [2].
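
In practice this pattern amounts to sending each document together with a short labeling instruction to the model. A minimal sketch using the openai Python client, with a hypothetical prompt, label set, and model choice rather than the authors' exact protocol:

    # Sketch of prompt-based annotation with an LLM; the prompt, model,
    # and labels are illustrative, not the study's protocol.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def annotate(text: str) -> str:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": "Label the following excerpt as 'adverse event' "
                            "or 'no adverse event'. Answer with the label only."},
                {"role": "user", "content": text},
            ],
        )
        return response.choices[0].message.content.strip()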

Eric C Leas, John W Ayers, Nimit Desai, Mark Dredze, Michael Hogarth, Davey M Smith

J Med Internet Res 2024;26:e52499

Developing a Framework to Infer Opioid Use Disorder Severity From Clinical Notes to Inform Natural Language Processing Methods: Characterization Study

Although prior studies have used NLP to identify problematic opioid use from EHRs [21-26], few have described an annotation process and none have reported documentation patterns for OUD-relevant information within clinical notes.

Melissa N Poulsen, Philip J Freda, Vanessa Troiani, Danielle L Mowery

JMIR Ment Health 2024;11:e53366

Development of a Corpus Annotated With Mentions of Pain in Mental Health Records: Natural Language Processing Approach

This was used to initiate the development of annotation guidelines. These guidelines were drafted to ensure consistent annotation by multiple annotators. Upon extraction, the documents were preannotated with pain terms from the lexicon (labeled as a mention of pain, as seen in Table 3) and loaded into an annotation tool, MedCAT [19], for manual verification and annotation of these mentions of pain.
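
Lexicon-based preannotation of this kind can be approximated with simple term matching. A minimal sketch (the lexicon entries are hypothetical, and the study performed this step with MedCAT rather than raw regular expressions):

    # Mark every lexicon term found in a document as a candidate
    # "mention of pain" for later manual verification.
    import re

    lexicon = ["pain", "headache", "migraine"]   # hypothetical lexicon entries
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, lexicon)) + r")\b", re.I)

    def preannotate(text: str) -> list[dict]:
        return [{"start": m.start(), "end": m.end(),
                 "text": m.group(), "label": "mention of pain"}
                for m in pattern.finditer(text)]

    print(preannotate("Patient complains of chronic back pain and headache."))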

Jaya Chaturvedi, Natalia Chance, Luwaiza Mirza, Veshalee Vernugopan, Sumithra Velupillai, Robert Stewart, Angus Roberts

JMIR Form Res 2023;7:e45849

Development of a COVID-19–Related Anti-Asian Tweet Data Set: Quantitative Study

In the next sections, we will explain the details of data collection, data annotation, and content analytics. We will also present and compare the results of the experiment with the ML models, which are designed to detect stigmatizing language against people of Asian descent. As already discussed, creating such a data set is challenging for various reasons. We devised an iterative method for selecting a final set of tweets that we manually annotated.
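
As a point of reference for the kind of ML models such an experiment compares, a minimal baseline text classifier in Python (TF-IDF features with logistic regression; a generic baseline, not necessarily one of the models evaluated in the study):

    # Baseline tweet classifier: TF-IDF unigrams/bigrams + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    tweets = ["example tweet one", "example tweet two"]   # annotated tweets
    labels = [1, 0]                                       # 1 = stigmatizing

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(tweets, labels)
    print(clf.predict(["another tweet to score"]))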

Maryam Mokhberi, Ahana Biswas, Zarif Masud, Roula Kteily-Hawa, Abby Goldstein, Joseph Roy Gillis, Shebuti Rayana, Syed Ishtiaque Ahmed

JMIR Form Res 2023;7:e40403

Suicide Risk and Protective Factors in Online Support Forum Posts: Annotation Scheme Development and Validation Study

Our work provides a formalized approach for developing annotation data for suicide and social media data. We have discussed the implications of this research as they relate to the development of rigorous and validated frameworks for assessing mental health on the web.
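
Validation of an annotation scheme is typically quantified with inter-annotator agreement; a minimal sketch using Cohen's kappa (the labels are hypothetical, and the study's specific metric may differ):

    # Agreement between two annotators on the same items.
    from sklearn.metrics import cohen_kappa_score

    annotator_a = ["risk", "protective", "risk", "neither"]
    annotator_b = ["risk", "protective", "neither", "neither"]
    # kappa ranges over [-1, 1]; 1.0 = perfect agreement
    print(cohen_kappa_score(annotator_a, annotator_b))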

Stevie Chancellor, Steven A Sumner, Corinne David-Ferdon, Tahirah Ahmad, Munmun De Choudhury

JMIR Ment Health 2021;8(11):e24471

TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations

Similarly to the annotation of the ADE corpus, the Arizona Disease Corpus (AZDC) annotation guidelines [41] focused on the annotation of diseases, also covering syndromes, illnesses, and disorders.

Nestor Alvaro, Yusuke Miyao, Nigel Collier

JMIR Public Health Surveill 2017;3(2):e24