Psychiatry on Twitter: Content Analysis of the Use of Psychiatric Terms in French

doi:10.2196/18539

Viewpoint

¹Fédération Régionale de Recherche en Psychiatrie et santé mentale d’Occitanie, Toulouse, France

²Centre Hospitalier de Montauban, Montauban, France

³Institut de Recherche en Informatique de Toulouse, Université de Toulouse, Toulouse, France

Corresponding Author:

Sarah Delanys, MD

Fédération Régionale de Recherche en Psychiatrie et santé mentale d’Occitanie

134 route d'Espagne

Toulouse, 31000

France

Phone: 33 5 61 43 78 52

Email: s.delanys@ch-montauban.fr

Background: With the advent of digital technology and specifically user-generated content in social media, new ways have emerged for studying possible stigma toward people with mental health conditions. Several studies have examined the discourse about psychiatric pathologies on Twitter, mostly considering tweets in English and a limited number of psychiatric disorder terms. This paper proposes the first study to analyze the use of a wide range of psychiatric terms in tweets in French.

Objective: Our aim is to study how generic, nosographic, and therapeutic psychiatric terms are used on Twitter in French. More specifically, our study has 3 complementary goals: (1) to analyze the types of psychiatric word use (medical, misuse, or irrelevant), (2) to analyze the polarity conveyed in the tweets that use these terms (positive, negative, or neutral), and (3) to compare the frequency of these terms to those observed in related work (mainly in English).

Methods: Our study was conducted on a corpus of tweets in French posted from January 1, 2016, to December 31, 2018, and collected using dedicated keywords. The corpus was manually annotated by clinical psychiatrists following a multilayer annotation scheme that includes the type of word use and the opinion orientation of the tweet. A qualitative analysis was performed to measure the reliability of the produced manual annotation, and then a quantitative analysis was performed considering mainly term frequency in each layer and exploring the interactions between them.

Results: One of the first results is a resource in the form of an annotated dataset. The initial dataset is composed of 22,579 tweets in French containing at least one of the selected psychiatric terms. From this set, experts in psychiatry annotated 3040 randomly selected tweets, which constitute the resource resulting from our work. The second result is the analysis of the annotations, showing that terms are misused in 45.33% (1378/3040) of the tweets and that their associated polarity is negative in 86.21% (1188/1378) of these cases. When considering the 3 types of term use, 52.14% (1585/3040) of the tweets are associated with a negative polarity. Misused terms related to psychotic disorders (721/1300, 55.46%) were more frequent than those related to depression (15/280, 5.4%).

Conclusions: Some psychiatric terms are misused in the corpus we studied, which is consistent with the results reported in related work in other languages. Thanks to the great diversity of studied terms, this work highlighted a disparity in the representations and ways of using psychiatric terms. Moreover, our study helps psychiatrists become aware of how these terms are used in new communication media such as widely used social networks. This study has the advantage of being reproducible thanks to the framework and guidelines we produced, so that it can be repeated to analyze the evolution of term usage. While the newly built dataset is a valuable resource for other analytical studies, it could also serve to train machine learning algorithms to automatically identify stigma in social media.

JMIR Form Res 2022;6(2):e18539


Keywords

social media analysis; psychiatric term use; social stigma; Twitter; social media; mental health

Stigma of Mental Disorders

Mental health stigma finds its roots in the history of psychiatry, in its connection to madness representations. Throughout history, the mentally ill patient has been given a pejorative label that induces social rejection. The term “stigma” comes from the ancient Greek “stitzein,” which means “to tattoo or mark with a red iron.” Jean-Yves Giordana [Giordana J. La Stigmatisation en Psychiatrie et en Santé Mentale. Masson: Elsevier; 2010.1], psychiatrist at the Nice hospital, rightly defines stigmatization as “a general attitude, a prejudicial one induced by low knowledge or ignorance of a situation or a state.”

Stigma and discriminatory behaviors have multiple negative impacts. Stigma in mental health leads the individual away from society, which often causes social isolation. Indeed, stigmatized people confront difficulties in their daily life such as integration into the professional world [Giordana J. La Stigmatisation en Psychiatrie et en Santé Mentale. Masson: Elsevier; 2010.1,Crisp AH, Gelder MG, Rix S, Meltzer HI, Rowlands OJ. Stigmatisation of people with mental illnesses. Br J Psychiatry 2000 Jul;177:4-7. [CrossRef] [Medline]2], access to housing [Giordana J. La Stigmatisation en Psychiatrie et en Santé Mentale. Masson: Elsevier; 2010.1], and interpersonal relationships [Lampropoulos D, Fonte D, Apostolidis T. La stigmatisation sociale des personnes vivant avec la schizophrénie: une revue systématique de la littérature. L'Évolution Psychiatrique 2019 Apr;84(2):346-363. [CrossRef]3]. Difficulties also concern the treatment itself, including delay in initial medical consultation, difficulty in accepting the illness, and tenuous therapeutic alliance, etc [Giordana J. La Stigmatisation en Psychiatrie et en Santé Mentale. Masson: Elsevier; 2010.1].

Many studies analyzing newspaper articles report that psychiatric terms are widely diverted from their medical meaning [Athanasopoulou C, Välimäki M. 'Schizophrenia' as a metaphor in greek newspaper websites. Stud Health Technol Inform 2014;202:275-278. [Medline]4,Magliano L, Read J, Marassi R. Metaphoric and non-metaphoric use of the term "schizophrenia" in Italian newspapers. Soc Psychiatry Psychiatr Epidemiol 2011 Oct 18;46(10):1019-1025. [CrossRef] [Medline]5]. A French survey conducted by L’Observatoire Société et Consommation [L’image de la schizophrénie à travers son traitement médiatique. L’OBSoCo. 2015. URL: https://tinyurl.com/yckujj37 [accessed 2022-01-28] 6] found that the French terms “schizophrène” (schizophrenic) and “schizophrénie” (schizophrenia) are particularly used in the context of violent news items and are often used metaphorically to evoke dangerousness, contradictory behavior, or a negative connotation. Pignon et al [Pignon B, Tebeka S, Leboyer M, Geoffroy P. De « Psychose maniaco-dépressive » à « Troubles bipolaires » : une histoire des représentations sociales et de la stigmatisation en rapport avec la nosographie. Annales Médico-psychologiques, revue psychiatrique 2017 Jul;175(6):514-521. [CrossRef]7], on the other hand, focused on the impact of the change in the terminology of bipolar disorder. The authors observed that replacing the term “manic-depressive psychosis” with the term “bipolar disorder” reduces stigma by dissociating this disorder from the representation of madness and dangerousness that leads to the social exclusion classically associated with psychotic disorders.

With the rise of the internet and social media, it has become important to analyze how psychiatric terms are used by people in general to act effectively against stigmatization. Indeed, from the internet users’ point of view, Berry et al [Berry N, Lobban F, Belousov M, Emsley R, Nenadic G, Bucci S. #WhyWeTweetMH: understanding why people use Twitter to discuss mental health problems. J Med Internet Res 2017 Apr 05;19(4):e107 [FREE Full text] [CrossRef] [Medline]8] showed that tweeting about mental health helps reduce isolation, fight stigmatization, and raise awareness of mental health by improving knowledge, promoting free expression, and strengthening coping and empowerment strategies.

In this paper, we focus on tweets in French as Twitter is one of the most used social media platforms in France [Les 50 chiffres à connaître sur les médias sociaux en 2019. URL: https://www.blogdumoderateur.com/50-chiffres-medias-sociaux-2019/ [accessed 2022-01-27] 9], and the tweets are publicly available with some conditions.

Twitter as a Resource to Analyze the Usage of Psychiatric Terms

More than 500 million tweets are posted daily in more than 40 languages [Twitter. Wikipédia. 2019 Oct 31. URL: https://fr.wikipedia.org/w/index.php?title=Twitter&oldid=163661816 [accessed 2020-02-01] 10]. In March 2019, Twitter had 321 million active users worldwide (at least one use per month) among which 10.3 million were in France [Tauzin A. Classement des réseaux sociaux en France et dans le monde en 2019. Agence Tiz. 2019. URL: https://www.tiz.fr/utilisateurs-reseaux-sociaux-france-monde/ [accessed 2022-01-27] 11], making Twitter third in popularity behind Facebook and YouTube, the other two most popular social networks, with 35 million and 19 million active users, respectively. The sociodemographic profile of Twitter users in France is more male, younger, and more educated than the general population. They are mainly students, with some managers and intellectual professions [Infographie #WhoUsesTwitter. GlobalWebIndex. 2014. URL: https://guillaume-dardier.fr/utilisateurs-twitter-france.html [accessed 2020-02-01] 12-Depla MFIA, de Graaf R, van Weeghel J, Heeren TJ. The role of stigma in the quality of life of older adults with severe mental illness. Int J Geriatr Psychiatry 2005 Feb;20(2):146-153. [CrossRef] [Medline]14].

Twitter offers its users the opportunity to post short messages named “tweets” (140 characters maximum in our study although since we collected the data, the maximum length has doubled), making possible analyses of a large number of tweets in a short time. In addition, Twitter provides 1% of tweets posted each day worldwide, allowing free access to a large database accessible for various purposes including research.

Since 2014, many studies have addressed discourse content about psychiatry on Twitter, suggesting that social networks convey stigmatizing representations of mental health and people with mental health conditions. To our knowledge, existing studies deal only with English and Greek. Moreover, they focus on a limited number of psychiatric disorder terms such as depression, schizophrenia, and autism. Lachmar et al [Lachmar EM, Wittenborn AK, Bogen KW, McCauley HL. #MyDepressionLooksLike: examining public discourse about depression on twitter. JMIR Ment Health 2017 Oct 18;4(4):e43 [FREE Full text] [CrossRef] [Medline]15] created the hashtag #MDLL (#mydepressionlookslike) and analyzed 3225 tweets highlighting 7 topics when Twitter users talk about depression: dysfunctional thoughts, impact on daily life, social difficulties, hiding behind a mask, sadness and apathy, suicidal behaviors/ideas, and seeking support/help. Reavley et al [Reavley NJ, Pilkington PD. Use of Twitter to monitor attitudes toward depression and schizophrenia: an exploratory study. PeerJ 2014;2:e647 [FREE Full text] [CrossRef] [Medline]16] analyzed a corpus of tweets about schizophrenia and depression in English. This corpus was collected from the 1% database of Twitter using two keywords: #schizophrenia and #depression. They found that 5% of tweets related to schizophrenia convey stigmatizing remarks, compared with less than 1% of tweets related to depression. In addition, in their dataset, they found that the polarity is mostly positive (65% of the tweets analyzed) when writing about depression while it is rather neutral (43%) for schizophrenia. Joseph et al [Joseph AJ, Tandon N, Yang LH, Duckworth K, Torous J, Seidman LJ, et al. #Schizophrenia: use and misuse on Twitter. Schizophr Res 2015 Jul;165(2-3):111-115. [CrossRef] [Medline]17] found that tweets containing the hashtag #schizophrenia convey a negative sentiment more frequently than tweets containing #diabetes (21% vs 12.6%, respectively).
Similarly, Athanasopoulou et al [Athanasopoulou C, Sakellari E. 'Schizophrenia' on Twitter: content analysis of greek language tweets. Stud Health Technol Inform 2016;226:271-274. [Medline]18] showed that tweets about schizophrenia in Greek tend to be more negative, medically inappropriate, sarcastic, and used in a nonmedical way than tweets about diabetes. Robinson et al [Robinson J, Bailey E, Hetrick S, Paix S, O'Donnell M, Cox G, et al. Developing social media-based suicide prevention messages in partnership with young people: exploratory study. JMIR Ment Health 2017 Oct 04;4(4):e40 [FREE Full text] [CrossRef] [Medline]19] analyzed and compared messages about 5 psychiatric disorders (autism, depression, eating disorders, obsessive-compulsive disorder, and schizophrenia) and 5 physical diseases (AIDS, asthma, cancer, diabetes, and epilepsy). In their corpus, schizophrenia and HIV were the most stigmatized diseases. These diseases are perceived as dangerous and with an uncontrollable and unpredictable nature. The authors found more than 40% of stigmatizing tweets were about schizophrenia compared to less than 5% of those about depression. Finally, Alvarez-Mon et al [Alvarez-Mon MA, Llavero-Valero M, Sánchez-Bayona R, Pereira-Sanchez V, Vallejo-Valdivielso M, Monserrat J, et al. Areas of interest and stigmatic attitudes of the general public in five relevant medical conditions: thematic and quantitative analysis using Twitter. J Med Internet Res 2019 May 28;21(5):e14110 [FREE Full text] [CrossRef] [Medline]20] recently studied the use of the term “psychosis” and compared it to some medical terms from the field of somatic medicine (diabetes, HIV, Alzheimer disease, and breast cancer). The results showed a predominance of nonmedical content (33.3%) with a high frequency of misuse and pejorative opinion tone (36.2%) in the tweets related to psychosis compared to the tweets related to the physical diseases studied.

Toward the First Analysis of Psychiatric Terms in French Tweets

As far as we know, this is the first study that proposes an in-depth analysis of psychiatric term usage in tweets in French. In particular, we propose the following:

Analysis of a wide range of psychiatric terms going beyond a small set of nosographic terms. Our study considers a wide range of nosographic terms but also generic and therapeutic psychiatric terms.
Multilayer annotation scheme that includes the type of word use (medical usage, misuse, or irrelevant usage) and the opinion orientation of the tweet (positive, negative, neutral, or mixed).
New dataset of 22,579 tweets containing the selected terms, among which 3040 are manually annotated by clinical psychiatrists. The dataset will be made available to the research community.
Qualitative analysis of the annotated data in terms of interannotator agreement along with quantitative analysis considering mainly term frequency in each layer and exploring the interactions between them.
Comparison of our results to those obtained by analyzing tweets in English. Our results constitute a first important step toward an automatic detection of stigma in social media.

Objectives

The multidisciplinary study reported in this paper was conducted by clinical psychiatrists and computer scientists with expertise in natural language processing and information extraction from social media. The main objective of the study is to analyze how psychiatric terms are used on Twitter, in particular whether they are used in their medical sense. The second goal is to analyze the opinion polarity of the tweets containing these terms and thus highlight the main stereotypes they convey. Our assumption is that psychiatric terms are often misused and that these misuses tend to carry a negative polarity.

Tweet Collection

Our corpus is new and composed of tweets in French that contain selected terms relative to psychiatry. To guarantee a wide lexical coverage of the extracted tweets, we grouped terms according to 3 dimensions:

Generic terms indicating different morphological variations of the French stem “psychiatr” (psychiatr) such as “psychiatrie” (psychiatry), “psychiatrique” (psychiatric) and “psychiatre” (psychiatrist)
Nosographic terms relative to psychiatric disorders. Following the Diagnostic and Statistical Manual of Mental Disorders taxonomy [DSM-5 : Manuel diagnostique et statistique des troubles mentaux. Masson: Elsevier; 2015.21] that classifies mental disorders in order to improve diagnoses, treatment, and research, we grouped terms in 5 main categories: schizophrenia spectrum and other psychotic disorders, bipolar and depressive disorders, autism spectrum disorder, anxiety disorders, and other disorders.
Therapeutic terms relative to the most used drugs in the psychiatry field.

In each dimension, we selected a set of representative terms that the experts considered the most important for this study. For each term, we also considered its slang versions, such as schizo for schizophrenia. We selected a total of 120 psychiatric terms (see Table 1 for examples and frequencies and Multimedia Appendix 1 for the detailed list of terms grouped in three dimensions: generic, diagnostic, and therapeutic).

Our dataset is composed of tweets collected via the OSIRIM platform, which has hosted a Twitter stream representing 1% of global tweets since 2015, with a total of 73,345,245 tweets. From this collection, we selected tweets in French (using the language tag provided by Twitter on tweets) that were posted from January 1, 2016, to December 31, 2018, and contained at least one psychiatric term from our list. After removing retweets and duplicates, we obtained a total of 22,579 tweets (Table 2).
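The selection steps described above can be sketched as a small filter. This is only an illustration under stated assumptions, not the pipeline actually run on OSIRIM: the dictionary fields (`text`, `lang`, `created_at`, `is_retweet`), the short excerpt of the term list, and the choice to treat tweets with identical normalized text as duplicates are all hypothetical.

```python
import re
from datetime import datetime, timezone

# Hypothetical excerpt of the term list; the study itself uses 120 terms.
TERM_PATTERN = re.compile(
    r"\b(?:psychiatr\w*|schizo\w*|bipolaires?|autis(?:me|te)s?|phobies?|xanax)\b",
    re.IGNORECASE,
)
START = datetime(2016, 1, 1, tzinfo=timezone.utc)
END = datetime(2019, 1, 1, tzinfo=timezone.utc)  # exclusive upper bound


def select_tweets(tweets):
    """Keep French, in-period, non-retweet, non-duplicate tweets with a term."""
    seen_texts = set()
    kept = []
    for tweet in tweets:
        text = tweet["text"]
        if tweet["lang"] != "fr":
            continue  # rely on the language tag attached to the tweet
        if not (START <= tweet["created_at"] < END):
            continue  # outside January 1, 2016, to December 31, 2018
        if tweet.get("is_retweet") or text.startswith("RT @"):
            continue  # drop retweets
        if not TERM_PATTERN.search(text):
            continue  # no psychiatric term from the (excerpted) list
        key = text.strip().lower()  # assumed notion of "duplicate"
        if key in seen_texts:
            continue
        seen_texts.add(key)
        kept.append(tweet)
    return kept
```

Matching on a stem followed by `\w*` mirrors the "Psychiat-*" entry of Table 1, so morphological variants such as "psychiatrie" and "psychiatrique" are caught by a single pattern.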

Table 1. Examples of terms for each dimension along with their frequencies and English translation (n=120).

Psychiatric terms			Example terms		terms, n
Generic			Psychiat-*		1
Diagnostic					31
	Schizophrenia spectrum	Psychose (psychosis), Psychotique (psychotic), Schizophrène (schizophrenic), Schizo (schizo)		6
	Bipolar and depressive disorders	Maniaque (manic), Bipolaire (bipolar), Hypomaniaque (hypomanic)		7
	Autism spectrum	Autisme (autism), Autiste (autistic)		2
	Anxiety disorders	Phobie (phobia), TOC^a (obsessive compulsive disorder)		6
	Other disorders	Hyperactif (hyperactive), borderline		10
Therapeutic			Neuroleptique (neuroleptic), Xanax (alprazolam), Theralite (lithium)		88

^aTOC: trouble obsessionnel compulsif.

Table 2. Number of tweets containing the selected terms (a tweet may contain several keywords).

Psychiatric terms		tweets, n
Generic		6993
Diagnostic		12,149
	Schizophrenia spectrum	1304
	Bipolar and depressive disorders	3500
	Autism spectrum	4389
	Anxiety disorders	5855
	Other disorders	101
Therapeutic		1853

Annotation Guidelines

We designed an annotation guideline to analyze the use of the 120 selected psychiatric terms in tweets in French. To this end, two clinical psychiatrists first analyzed a small subset of 157 tweets randomly selected in order to define the annotation guidelines. These tweets were then removed from the initial collection and never used again in the study.

Our annotation scheme is multilayered and aims at answering 2 main questions: Do the psychiatric terms used in the tweet convey a medical use or not? What is the overall opinion given in the tweet? We detail each layer and illustrate them by example tweets extracted from our corpus. In these examples, psychiatric terms are in bold font. All examples are given in French along with their English translation. Note that translations may not perfectly reflect the initial writing, as tweets often use slang, abbreviations, and contain grammatical errors.

Types of Term Use

We define three possible types of term use: medical use, misuse or irrelevant use.

Medical use corresponds to the medical definition of the term. The term is used to refer to a medical pathology or to the domain of psychiatry, as in Textbox 1.

Misuse occurs when a psychiatric term is used in a figurative or metaphoric way. These misuses often convey prejudices, stereotypes, or humor and thus make psychiatry commonplace and strengthen the stigma of psychiatry and people suffering from psychiatric disorders, as in Textbox 2.

Irrelevant use occurs when the tweet is not understandable (lack of context, link to a URL, advertising, etc) or not relevant to psychiatry (use of synonyms), as in Textbox 3.

Textbox 1. Examples of medical uses (psychiatric terms are in bold font).

Tellement dégueulasse le valium en gouttes (Oral valium is so disgusting)
Tout à l'heure g écouter une vidéo des voix qu'les schizo entendent dans leurs têtes g pas pu tenir + de 30sec g cru devenir folle (I listened to a video of voices heard by schizophrenic people I couldn’t hold more than 30 sec I thought I was going insane)

Textbox 2. Examples of misuse (psychiatric terms are in bold font).

Là j'suis en colère tu changes toutes les minutes, à croire que t'es bipolaire. (I’m angry you’re changing your mind every minute, I’d think you’re bipolar)
Tu viens d faire quoi sale autiste (What have you just done, you f*** autistic)

Textbox 3. Examples of irrelevant use (psychiatric terms are in bold font).

qd t une schizo https://t.co/SB3Z1DR7cX (when a schizophrenic https://t.co/SB3Z1DR7cX)
Psychose, C'est un peu vieux mais c'est trop cool (Psycho, it’s a little bit old but it’s so cool)

Overall Opinion of the Tweet

As usually defined in sentiment analysis [Benamara F, Taboada M, Mathieu Y. Evaluative language beyond bags of words: linguistic insights and computational applications. Computational Linguistics 2017 Apr;43(1):201-264. [CrossRef]22], polarity or orientation indicates whether the opinion is positive, negative, or neutral. We consider these 3 possible values and also include mixed opinion to account for cases where the opinion can be positive and negative at the same time. We consider opinion orientation of the author at the tweet level regardless of whether the expressed opinion is related to a psychiatric term.

A tweet is annotated as having positive polarity when (1) the writer expresses a positive personal opinion on facts, events, or a quote; (2) the general idea of the tweet is in favor of psychiatry; (3) the writer defends the proper medical use of psychiatric terms regardless of their valence; or (4) the tweet contains positive terms or smileys, as in Textbox 4.

A tweet has a negative polarity when (1) the writer expresses a negative personal opinion on facts, events, or a quote; (2) the tweet contains terms that are inherently negative; (3) the tweet includes ironic or sarcastic comments; (4) the tweet reports negative facts connected to psychiatry; (5) the tweet contains a positive smiley linked to negative content; (6) the tweet takes a derogatory or insulting stance; or (7) the psychiatric term is used to refer to an inconvenient situation or to a topic that arouses a negative emotion, as in Textbox 5.

Mixed or neutral polarity mainly covers cases where (1) the opinion of the writer is not clearly expressed or (2) the writer’s opinion is mixed, both positive and negative, as in Textbox 6.

See Multimedia Appendix 2 for other examples of types of term use and polarity orientation.
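The two annotation layers described above can be represented as a small data structure, which also makes it easy to explore the interactions between layers. The sketch below is illustrative only: the class and function names are hypothetical, and it keeps "neutral" and "mixed" as separate values even though the guidelines discuss them together.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class TermUse(Enum):
    MEDICAL = "medical"
    MISUSE = "misuse"
    IRRELEVANT = "irrelevant"


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"


@dataclass(frozen=True)
class Annotation:
    """One annotated tweet: a term-use layer and a tweet-level polarity layer."""
    tweet_id: str
    term_use: TermUse
    polarity: Polarity


def cross_tabulate(annotations):
    """Count tweets for every (term use, polarity) pair."""
    return Counter((a.term_use, a.polarity) for a in annotations)


def negative_share(annotations, term_use):
    """Fraction of tweets with the given term use that are negative."""
    subset = [a for a in annotations if a.term_use is term_use]
    if not subset:
        return 0.0
    return sum(a.polarity is Polarity.NEGATIVE for a in subset) / len(subset)
```

With the annotated corpus loaded into `Annotation` objects, `negative_share(annotations, TermUse.MISUSE)` would give the kind of figure reported in the abstract for misused terms.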

Textbox 4. Examples of positive polarity (psychiatric terms are in bold font).

C'est trop top la psychiatrie tu vas t'éclater! (Psychiatry is so great, you’ll have so much fun!)
Mon Rdv psychiatrie de demain tombe à la perfection. Pour une fois je l'avoue, j'en ai grandement besoin. (Tomorrow is the perfect timing for my psychiatric appointment. To be honest, for once, I really need it)
Bipolaire c'est un vrai trouble psychiatrique, mesdames arrêtez de le mettre en TN vous n'êtes pas bipolaires vous êtes juste mal éduquées. (Bipolar disorder is a real mental health condition. Ladies, stop using this term as tweet name. You are not bipolar, you are just poorly-educated)
ça va mieux t'inquiète pas merci, j'ai pris 3 Xanax et ils commencent à faire effet (Feeling better, thanks, don’t worry. I took 3 Xanax tablets and it has started to work)

Textbox 5. Examples of negative uses (psychiatric terms are in bold font).

La psychiatrie ça brise encore plus les gens. (Psychiatry breaks people down even more)
Il vend sa mère au diable se marie avec une chetana et Il finit en psychiatrie. Le pacte 666 l'a détruit. (He sells his mother to the devil, he gets married to a she-devil and he ends up in psychiatric hospital. He has been wiped out by pact 666)
La France est une terre d'asile... psychiatrique ! (France is a land of asylum… psychiatric asylum!)
Paris: la psychiatre vendait de faux certificats médicaux aux envahisseurs sans-papiers (Paris: the psychiatrist was selling fake medical certificates to undocumented invaders)
Les artistes finissent presque tous en hôpital psychiatrique (Almost all artists end up in psychiatric hospital)
Selon une grosse conne psychiatrique le harcèlement d'activité est une loi de France (According to a dumb lunatic woman, harassing is a custom in France)
Et franchement les garçons radins c'est grave ma phobie (Sincerely, stingy boys are basically my greatest phobia)

Textbox 6. Examples of mixed or neutral use (psychiatric terms are in bold font).

Croyez-vous qu'un psychiatre prendrait les médicaments qu'il prescrit (Do you think a psychiatrist would take the medicine he prescribes?)
La psychiatrie c'est cool, Faire ça dans un lieu de stage où ils te harcèlent jusqu'à la dernière heure de tout tout ton stage par contre moins. (Psychiatry is fun but throughout the internship they badger you, is less fun)

Annotation Procedure and Ethics

Our data were manually annotated by two French native speakers, both clinical psychiatrists, using the Brat tool. We performed a 3-step annotation where an intermediate analysis of agreement and disagreement between annotators was completed. Annotators were first trained on 157 tweets that helped them better understand the task and adjust the annotation guidelines. Annotators were then asked to separately annotate 319 tweets (around 10% of the annotated corpus) so that interannotator agreements could be computed. Before moving to the real annotation, annotators were asked to discuss main cases of disagreement, which resulted in a set of 269 tweets. After adjudication, a total of 2771 tweets were manually annotated by one expert. In the end, our dataset is composed of 3040 tweets (269 + 2771) annotated according to our multilevel annotation scheme (Figure 1).
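The three annotation steps rely on random, non-overlapping subsets of the corpus. A minimal sketch of how such subsets could be drawn reproducibly is given below; the function name, the subset labels, and the seed are assumptions made for illustration, and the text does not say how the random selection was actually implemented.

```python
import random


def draw_disjoint_samples(tweet_ids, sizes, seed=0):
    """Shuffle once with a fixed seed, then cut consecutive disjoint slices.

    `sizes` maps a subset label to the number of tweets wanted for it.
    """
    if sum(sizes.values()) > len(tweet_ids):
        raise ValueError("requested more tweets than available")
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    shuffled = list(tweet_ids)
    rng.shuffle(shuffled)
    samples = {}
    start = 0
    for label, size in sizes.items():
        samples[label] = shuffled[start:start + size]
        start += size
    return samples


# Subset sizes taken from the procedure above; the labels are hypothetical.
STEP_SIZES = {"guidelines": 157, "agreement": 319, "single_annotator": 2771}
```

Cutting slices from a single shuffle guarantees that the training tweets never reappear in the agreement or final subsets, which matches the requirement that they not be reused.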

Regarding ethics, we did not request validation from the research ethics board since this study does not involve patients and does not use personal digital data. In addition, our data are composed of textual content from the public domain. Finally, we will make the dataset publicly available to the research community in conformance with the Twitter Developer Agreement and Policy, which allows unlimited distribution of the numeric identification number of each tweet.

Interannotator Agreement

Interannotator agreement assesses the amount of agreement between annotators beyond chance and provides a measure of the reliability of the annotation guidelines. We used the Cohen kappa statistical measure, defined as follows [Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 2016 Jul 02;20(1):37-46. [CrossRef]23]:

$$K = \frac{p_o - p_e}{1 - p_e}$$

where p_o and p_e are probabilities that correspond to the observed and the expected agreements, respectively. The latter probability measures the possible agreement by chance when each annotator randomly selects a given category. The kappa measure ranges from −1 to +1, where K ≤ 0 indicates no agreement, 0.6 ≤ K ≤ 0.8 a high agreement, and K = 1 a perfect agreement. We used Microsoft Excel to compute Cohen kappa from the contingency table of frequencies, with the rows and columns indicating the categories (agreement frequencies are in the diagonal cells whereas disagreements are in the off-diagonal cells).
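The same computation can be sketched in a few lines of Python instead of a spreadsheet. The function below applies the standard Cohen definition, K = (p_o − p_e) / (1 − p_e), to a square contingency table; the function name and the example table are hypothetical and are not taken from the study's annotations.

```python
def cohen_kappa(table):
    """Cohen kappa from a square contingency table of frequencies.

    table[i][j] is the number of items that annotator A put in category i
    and annotator B put in category j (agreements lie on the diagonal).
    """
    total = sum(sum(row) for row in table)
    # Observed agreement: share of items on the diagonal.
    p_o = sum(table[i][i] for i in range(len(table))) / total
    # Expected chance agreement: product of marginal proportions per category.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    # Undefined when p_e == 1 (both annotators always use one same category).
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 2-category table, for illustration only.
example = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(example)  # p_o = 0.7, p_e = 0.5
```

For this example table, the observed agreement is 35/50 and the chance agreement is 0.5, so the resulting kappa is 0.4.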