This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
Usability tests can be either formative (where the aim is to detect usability problems) or summative (where the aim is to benchmark usability). There are ample formative methods that consider user characteristics and contexts (eg, cognitive walkthroughs, interviews, and verbal protocols). This is especially valuable for eHealth applications, as health conditions can influence user-system interactions. However, most summative usability tests do not consider eHealth-specific factors that could potentially affect the usability of a system. One reason for this is the lack of fine-grained frameworks or models of usability factors that are unique to the eHealth domain.
In this study, we aim to develop an ontology of usability problems, specifically for eHealth applications, with patients as primary end users.
We analyzed 8 data sets containing the results of 8 formative usability tests for eHealth applications. These data sets contained 400 usability problems that could be used for analysis. Both inductive and deductive coding were used to create an ontology from 6 data sets, and 2 data sets were used to validate the framework by assessing the intercoder agreement.
We identified 8 main categories of usability factors: basic system performance, task-technology fit, accessibility, interface design, navigation and structure, information and terminology, guidance and support, and satisfaction. These 8 categories contained a total of 21 factors: 14 general usability factors and 7 eHealth-specific factors. Cohen κ was calculated for 2 data sets on both the category and factor levels, and all Cohen κ values were between 0.62 and 0.67, which is acceptable. Descriptive analysis revealed that 69.5% (278/400) of the usability problems could be attributed to general usability factors and 30.5% (122/400) to eHealth-specific usability factors.
Our ontology provides a detailed overview of the usability factors for eHealth applications. Current usability benchmarking instruments include only a subset of the factors that emerged from our study and are therefore not fully suited for summative evaluations of eHealth applications. Our findings support the development of new usability benchmarking tools for the eHealth domain.
Usability tests of eHealth applications can be either formative (where the aim is to detect usability problems) or summative (where the aim is to benchmark usability). Formative usability tests use qualitative methods,
It has been argued that usability should be considered from the perspective of the system domain [
The problems with current usability benchmarking tools for the eHealth context stem from a general lack of understanding of usability within the eHealth context [
By analyzing multiple data sets of usability problems found in contemporary eHealth applications, we propose a conceptualization of usability for the eHealth domain from the patient’s perspective. An overview of eHealth-specific usability factors helps usability practitioners to link usability problems to an overarching classification that is tailored to the specific medical context in which these applications are embedded.
Data sets of usability tests were collected to conduct a content analysis of usability problems found in eHealth usability tests.
We analyzed 8 data sets from different usability tests conducted at institutions affiliated with the researchers. The data sets were strategically chosen to reflect a wide range of eHealth applications with different end-user groups, devices, and health goals. A data set was included if the eHealth application was recently developed; usability problems were elicited via at least one qualitative data collection method (eg, thinking aloud, interviews, and observations); and the participants of the usability tests were patients.
The following eHealth applications were included in this study: (1) Stranded, a web-based gamification application in which users can progress in the game by regularly performing physiotherapeutic exercises that are scheduled by a physiotherapist [
The data sets had a total of 486 usability problems. We excluded usability problems that had unclear formulation, were duplicated, or were unrelated to usability (eg, user experience and motivation). For example, the problem
A total of 86 usability problems were eliminated from the data set, resulting in 400 usability issues that were suitable for the analyses. Each usability problem was assigned to a severity category. Most data sets included severity ratings based on the severity index of Duh et al [
Overview of data sets (N=8).
| Data set | eHealth application | Description of app | Main health goal | Device platform | Target end-user group | Participants, n | Evaluation method | Length of session (minutes) | Usability problems, n |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Stranded | Web-based gamified app | Offers fall prevention training via video instructions in a gamified environment | Computer | Prefrail^a and frail older adults (aged ≥65 years) | 19 | Concurrent think aloud and screen capture recordings | 60 | 66 |
| 2 | N/A^b | Web-based screening module | Identifies frailty levels among older adults and supports physical exercising | Tablet and social robot | Prefrail and frail older adults (aged ≥70 years) | 20 | Video observation | 50 | 64 |
| 3 | cVitals | Home-monitoring tool | Allows self-management of health by providing and supporting health measurements at home | Smartphone | Patients with heart failure or COPD^c (aged ≥65 years) | 10 | Concurrent think aloud and observations | 60 | 39 |
| 4 | Council of Coaches | Web-based coaching platform with virtual coaches | Supports a healthy lifestyle for older adults | Computer | Older adults (aged ≥55 years) | 18 | Think aloud and observations | 60 | 60 |
| 5 | Pandit | Web-based application | Allows self-management of health by providing insulin dosing advice | Computer | Patients with type 2 diabetes (aged 40-64 years) | 5 | Concurrent think aloud and observations | 15 | 28 |
| 6 | Pregnancy and Work | Informational application | Provides information on health risks and regulations during pregnancy | Smartphone | Pregnant women (aged 25-40 years) | 12 | Concurrent think aloud and observations | 45 | 84 |
| 7 | FatSecret | Calorie counter application | Provides nutritional information | Smartphone | Older adults with type 2 diabetes (aged ≥55 years) | 10 | Concurrent think aloud and observations | 15 | 41 |
| 8 | Hospitality app | Patient hospitality app | Provides information on how to prepare for a visit to medical facilities | Smartphone | Prefrail and frail older adults (aged ≥65 years) | 8 | Concurrent think aloud and observations | 30 | 18 |
^a Prefrail refers to the initial state of a health condition called frailty.
^b N/A: not applicable.
^c COPD: chronic obstructive pulmonary disease.
A content analysis was conducted according to the methods of Bengtsson [
First, in the decontextualization phase, 2 researchers (MB and MH) familiarized themselves with the data sets. They then independently started an inductive coding process in which each usability problem was assigned a code representing the usability factor. On the basis of data sets 1, 2, and 3, each researcher developed their own codebook. These two codebooks were discussed and merged into one mutually agreed-upon codebook, consisting of 9 main categories and 32 factors.

Second, in the recontextualization phase, the 2 researchers (MB and MH) independently recoded data sets 1-3 using the new codebook. If they found a usability problem that they could not classify using the codebook, a new code was added to the codebook. The resulting codebooks were then compared and discussed, leading to an updated codebook. These steps were repeated until no new codes emerged.

Third, in the categorization phase, definitions were formulated for each factor in the updated codebook, which now consisted of 10 categories and 28 factors. A third, independent researcher (LVV) then familiarized himself with the data, codebook, and definitions. On the basis of the triangulated findings, alterations were made to the codebook, resulting in 9 categories and 24 factors.

Finally, in the compilation phase, data sets 4, 5, and 6 were independently recoded by 2 researchers (MB and LVV) using the codebook (deductive coding). Discussions revealed that, although no new categories or factors emerged, overlap in the definitions of some categories and factors caused confusion about which factor to assign to a usability problem. The codebook and definitions were therefore adjusted. The final codebook consisted of 8 categories and 22 factors. The intercoder agreement between researchers MB and LVV was determined by coding data sets 7 and 8 and calculating Cohen κ values at both the category and factor levels.
Cohen κ is the most widely used measure of intercoder agreement. However, it has its limitations, especially for nondichotomous variables, as it is a measure of relative rather than absolute agreement [
Validation of the analysis was performed by calculating Cohen κ values for both category and factor levels (
Intercoder agreement expressed as Cohen κ and percent agreement for usability categories and factors.
| Data set | Measure | Usability category | Usability factor |
|---|---|---|---|
| Data set 8 | Usability problems, n | 18 | 18 |
| | Percent agreement (%) | 72 | 67 |
| | Cohen κ | 0.62 | 0.63 |
| Data set 7 | Usability problems, n | 41 | 41 |
| | Percent agreement (%) | 76 | 66 |
| | Cohen κ | 0.67 | 0.62 |
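Percent agreement and Cohen κ, as reported above, can be computed from two coders' label assignments with a few lines of code. The following is a generic sketch, not the study's analysis script; the coders' category labels at the bottom are hypothetical:

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Proportion of items to which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    # Expected chance agreement from each coder's marginal code frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(
        (freq_a[code] / n) * (freq_b[code] / n)
        for code in freq_a.keys() | freq_b.keys()
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical category assignments for 4 usability problems
coder_1 = ["navigation", "navigation", "interface", "interface"]
coder_2 = ["navigation", "navigation", "interface", "navigation"]
```

For these hypothetical assignments, percent agreement is 0.75, expected chance agreement is 0.50, and κ is therefore 0.50, which illustrates why κ is always lower than raw percent agreement whenever the coders disagree at all.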
The ontology for usability problems for eHealth applications, which resulted from the coding process, consists of 8 overarching usability categories and 21 factors (
Ontology for usability problems in eHealth applications.
| Category of usability problem and usability factor | Type of usability factor |
|---|---|
| **Basic system performance** | |
| Technical performance | General |
| General system interaction | General |
| **Task-technology fit** | |
| Fit between system and context of use | General |
| Fit between system and user | General |
| Fit between system and health goals | eHealth-specific |
| **Accessibility** | |
| Accommodation to perceptual impairments or limitations | eHealth-specific |
| Accommodation to physical impairments or limitations | eHealth-specific |
| Accommodation to cognitive impairments or limitations | eHealth-specific |
| **Interface design** | |
| Design clarity | General |
| Symbols, icons, and buttons | General |
| Interface organization | General |
| Readability of texts | General |
| **Navigation and structure** | |
| Navigation | General |
| Structure | General |
| **Information and terminology** | |
| System information | General |
| Health-related information | eHealth-specific |
| **Guidance and support** | |
| Error management | General |
| Procedural system information | General |
| Procedural health-related information | eHealth-specific |
| **Satisfaction** | |
| Satisfaction with system | General |
| Satisfaction with system’s ability to support health goals | eHealth-specific |
This category includes usability problems related to the system’s technical stability and the user-system interaction. The factor
Technical problems, such as nonresponsive buttons, can negatively affect efficient system interaction and perceived ease of use. These system errors can seriously hinder task completion and influence users’ opinions of other usability aspects. For example, if page load time takes too long (data set 1, usability problem number 19), a user can also give low ratings to the system’s ease of use, navigation, or satisfaction. Good technical performance of the system is essential to facilitate smooth and easy user-system interaction.
Usability problems found in this category address the match between the system on the one hand, and the user, their context, and health goals, on the other hand. As such, this category is related to the model of Goodhue and Thompson [
The category
We were aware that the category
The fourth category,
This category describes usability problems related to the simplicity and intuitiveness with which a user can move between different system components and a general understanding of the different system components. The factor
This category consists of explanatory, nonaction-related system information and terminology in the app. Usability problems can include issues with understanding labels or terminology, the level of language, or the use of a foreign language. In this category, we made a distinction between system and health-related information. The first type includes information about the understandability of explanatory, nonaction-related information and terminology about the system, such as the use of nonnative language (eg,
The
This final category concerns the user’s satisfaction with the system and addresses the subjective opinion of the user on, or likeability of, an eHealth application. System satisfaction is one of the standard usability variables according to the ISO (International Organization for Standardization) definition [
The eHealth usability ontology includes a total of 21 usability factors, of which 7 are eHealth-specific and 14 are context-independent.
Next, we determined the number of minor, serious, and critical usability problems across the 8 categories (
Number of basic and health usability problems according to severity category.
| Factor type | Usability problems (n=400), n (%) | Minor (n=186), n (%) | Serious (n=147), n (%) | Critical (n=67), n (%) |
|---|---|---|---|---|
| Basic | 278 (69.5) | 130 (69.9) | 101 (68.7) | 47 (70.1) |
| Health | 122 (30.5) | 56 (30.1) | 46 (31.3) | 20 (29.9) |
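As a quick arithmetic check, the percentages in the table above follow directly from the raw counts and the severity column totals. A minimal sketch, with the counts transcribed from the table:

```python
# Usability problem counts per factor type and severity (from the table above)
counts = {
    "Basic":  {"minor": 130, "serious": 101, "critical": 47},
    "Health": {"minor": 56,  "serious": 46,  "critical": 20},
}

# Column totals across both factor types: 186 minor, 147 serious, 67 critical
totals = {
    severity: sum(row[severity] for row in counts.values())
    for severity in ("minor", "serious", "critical")
}

def share(count, total):
    """Column percentage, rounded to one decimal as in the table."""
    return round(100 * count / total, 1)

basic_shares = {sev: share(counts["Basic"][sev], totals[sev]) for sev in totals}
# basic_shares == {"minor": 69.9, "serious": 68.7, "critical": 70.1}
```

Each severity column reproduces the roughly 70:30 split between basic and health factors reported in the text.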
Number of usability problems of usability categories according to severity level.
| Usability category | Minor (n=186), n (%) | Serious (n=147), n (%) | Critical (n=67), n (%) | Total (n=400), n (%) |
|---|---|---|---|---|
| Basic system performance | 32 (17.2) | 10 (6.8) | 14 (20.9) | 56 (14) |
| Task-technology fit | 16 (8.6) | 7 (4.8) | 5 (7.5) | 28 (7) |
| Accessibility | 2 (1.1) | 5 (3.4) | 1 (1.5) | 8 (2) |
| Interface design | 51 (27.4) | 38 (25.9) | 7 (10.4) | 96 (24) |
| Navigation and structure | 12 (6.4) | 18 (12.2) | 12 (17.9) | 42 (10.5) |
| Information and terminology | 13 (6.9) | 13 (8.8) | 1 (1.5) | 27 (6.7) |
| Guidance and support | 56 (30.1) | 55 (37.4) | 25 (37.3) | 136 (34) |
| Satisfaction | 4 (2.1) | 1 (0.7) | 2 (3) | 7 (1.7) |
On the basis of the results of this study, we can reconceptualize the traditional concept of usability in the eHealth context. Our analysis of usability problems in eHealth applications identified 8 main categories for eHealth usability: (1) basic system performance, (2) task-technology fit, (3) accessibility, (4) interface design, (5) navigation and structure, (6) information and terminology, (7) guidance and support, and (8) satisfaction. In each usability category, we distinguished between factors related to general usability (basic usability factors) and those related to the health goals of the system, the medical context, or the characteristics of the intended patient group (health usability factors). We identified 14 general factors and 7 eHealth-specific factors from the analysis. Further examination of the distribution of usability problems across general and eHealth-specific usability factors revealed that 69.5% (278/400) of all usability problems were related to general factors and 30.5% (122/400) to eHealth-specific factors. When looking at the severity categories (minor, serious, and critical), we observed the same distribution (70:30) between these two types of factors. This implies that when one applies a general usability benchmarking instrument for evaluating eHealth applications, such as the SUS [
The finding that the context, be it eGovernment, eCommerce, or eHealth, affects usability is, of course, not surprising. Context has been a prominent factor in the definition of usability since the emergence of this construct [
First, Voncken-Brewster et al [
With regard to the similarities between, on the one hand, our conceptualization of usability for eHealth and, on the other hand, usability questionnaires, such as the PSSUQ [
This study had two main limitations. First, we intended to include data sets from a wide variety of eHealth applications designed for different end-user groups. This was deemed necessary, as we wanted to develop a framework for eHealth applications in general. However, the eHealth applications that we included were, although quite diverse in nature, largely intended for middle-aged or older adults (aged ≥40 years). eHealth applications for other age groups, such as adolescents, could have specific usability problems that are underrepresented in this framework. Future research should determine whether these other target groups have other common usability problems that need to be included in the eHealth usability ontology. Second, the Cohen κ values for intercoder agreement were sufficient but not strong. One reason for the low Cohen κ scores is that usability problems were often ambiguously formulated. Although we excluded many of these problems beforehand, during coding it became apparent that the researchers had different opinions about the origins of some problems. This is not completely avoidable in qualitative research but does highlight a common problem in usability evaluation studies: the evaluator effect [
The current set of usability benchmarking instruments only provides a limited overview of the usability of eHealth applications, as they do not consider eHealth-specific factors. Our reconceptualization of usability in the eHealth context will help practitioners and researchers better understand the usability problems they encounter in their evaluations and develop suitable benchmarking tools.
general user interface
Health-ITUES: Health Information Technology Usability Evaluation Scale
ISO: International Organization for Standardization
Mental Health App Usability Questionnaire
PSSUQ: Poststudy System Usability Questionnaire
SUS: System Usability Scale
None declared.