This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
The most common dermatological complication of insulin therapy is lipohypertrophy.
As a proof of concept, we built and tested an automated model using a convolutional neural network (CNN) to detect the presence of lipohypertrophy in ultrasound images.
Ultrasound images were obtained in a blinded fashion using a portable GE LOGIQ
The DenseNet CNN architecture was found to have the highest accuracy (76%) and recall (76%) in detecting lipohypertrophy in ultrasound images compared to other CNN architectures. Additional work showed that the YOLOv5m object detection model could be used to help detect the approximate location of lipohypertrophy in ultrasound images identified as containing lipohypertrophy by the DenseNet CNN.
We were able to demonstrate the ability of machine learning approaches to automate the process of detecting and locating lipohypertrophy.
The most common dermatological complication of insulin therapy for glycemic control in diabetes is lipohypertrophy, which has a prevalence ranging from approximately 25% to 65% in the literature [
The development of machine learning techniques to predict masses in ultrasound images has been an ongoing effort in clinical practice for the past few decades. To assist physicians in diagnosing disease, many scholars have implemented techniques such as regression, decision trees, Naive Bayesian classifiers, and neural networks on patients’ ultrasound imaging data [
In an effort to improve the accessibility and efficiency of this method of detection, we have, as a proof of concept, developed a supervised machine learning algorithm that uses a CNN to detect lipohypertrophy in ultrasound images, together with a web-based application to deploy the trained models and make predictions on the presence or absence of lipohypertrophy in ultrasound images.
All images were obtained from research participants who were enrolled in a diabetes education program at an academic center and who had an unknown lipohypertrophy status between July 2015 and March 2017 as part of a previous study of this condition [
All research participants gave written consent, and our study protocol received approval by the Human Subjects Committee of the University of British Columbia (H20-03979).
Before any model training, the data were split into train, validation, and test sets of 70%, 15%, and 15%, respectively, followed by a preprocessing step of manually removing borders from the nonannotated versions of the images. We treated all types of diabetes as 1 set and did not differentiate between patients when splitting, as the histology of these lesions has been found to be independent of the source of insulin or mode of administration [
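The split described above can be sketched as follows. The `split_indices` helper is hypothetical (the paper does not state which tooling performed the split); this minimal standard-library version simply shuffles image indices and partitions them 70/15/15:

```python
import random

def split_indices(n_images, train_frac=0.70, valid_frac=0.15, seed=42):
    """Shuffle image indices and partition them into train/validation/test
    sets of 70%, 15%, and 15%. Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    indices = list(range(n_images))
    rng.shuffle(indices)
    n_train = int(n_images * train_frac)
    n_valid = int(n_images * valid_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_valid],
            indices[n_train + n_valid:])
```

Fixing the seed makes the split reproducible across experiments, which matters when several architectures are compared on the same holdout sample.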
Given the small size of the data set, image augmentation techniques were used to expand the size of the training set and improve the model’s generalizability. A variety of classic transformations [
Final image transformations included random vertical and horizontal flipping and random brightness and contrast adjustment.
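These final transformations can be sketched in NumPy as a minimal illustration. The actual pipeline likely used an augmentation library, and the jitter ranges below are illustrative assumptions, not the study's parameters:

```python
import numpy as np

def augment(image, rng):
    """Apply the final augmentations to one image (H x W array in [0, 1]):
    random vertical/horizontal flips plus random brightness and contrast
    jitter. NumPy sketch only; the jitter ranges are assumed values."""
    if rng.random() < 0.5:
        image = np.flipud(image)   # random vertical flip
    if rng.random() < 0.5:
        image = np.fliplr(image)   # random horizontal flip
    brightness = rng.uniform(-0.1, 0.1)  # additive brightness shift
    contrast = rng.uniform(0.9, 1.1)     # multiplicative contrast factor
    return np.clip(contrast * image + brightness, 0.0, 1.0)
```

Because each training image passes through the random pipeline anew on every epoch, the model effectively sees a larger and more varied data set than the raw image count suggests.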
In addition, we wanted to incorporate object detection into our pipeline, allowing users to visually identify the location of the lipohypertrophy detected by our model. To implement object detection using a popular framework called YOLOv5 [
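YOLOv5 expects bounding-box annotations in a normalized text format, one `class x_center y_center width height` line per box. A small sketch of converting a pixel-space annotation to that format, assuming the lipohypertrophy annotations are axis-aligned pixel boxes (the paper does not describe its annotation tooling):

```python
def to_yolo_label(box, img_w, img_h, class_id=0):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into the
    normalized YOLO label line (class x_center y_center width height)
    that YOLOv5 reads from its .txt annotation files."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w   # box center, normalized to [0, 1]
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w         # box size, normalized to [0, 1]
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"
```

With a single lipohypertrophy class, every label line starts with class `0`.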
Our images were obtained from a total of 103 participants, of whom 8% were diagnosed with type 1 and 92% were diagnosed with type 2 diabetes (
Each of the potential models (VGG16, ResNet50, DenseNet169, and InceptionV3) was investigated by training it in a separate experiment on our augmented data set.
Research participant characteristics (N=103).
Characteristics | Values |
Age (years), mean (SE) | 75.0 (11.8) |
BMI (kg/m2), mean (SE) | 28.3 (6.1) |
Participants with type 1 diabetes, n | 8 |
Number of years on insulin, mean (SE) | 9.4 (11.5) |
Duration of diabetes (years), mean (SE) | 20.7 (6.1) |
Glycated hemoglobin (%), mean (SE) | 8.0 (1.1) |
Total daily dose (units), mean (SE) | 48.6 (42.9) |
Daily doses, n (range) | 2 (1-6) |
Some examples of images found in our data set. The top row displays negative images (no lipohypertrophy present) and the bottom row displays positive images (lipohypertrophy present) where the yellow annotations indicate the exact area of the mass. The yellow annotations are only for the reader; the images that the model was trained on were unmarked with no yellow annotations.
As shown in
With respect to object detection implementation, the YOLOv5m model was able to identify the specific location of lipohypertrophy in test cases, as demonstrated in
All 4 models (ResNet, VGG16, Inception, and DenseNet) were tested on a holdout sample to produce these accuracy, recall or sensitivity, and specificity results.
Model accuracy scores, recall or sensitivity scores, and specificity scores.
Model | Accuracy scores | Recall or sensitivity scores | Specificity scores |
DenseNet | 0.76 | 0.76 | 0.49 |
Inception | 0.74 | 0.52 | 0.33 |
VGG16 | 0.65 | 0.19 | 0.12 |
ResNet | 0.61 | 0 | 0 |
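The metrics in the table follow the standard confusion-matrix definitions; a minimal sketch (the counts in the usage below are illustrative, not the study's):

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall (sensitivity), and specificity from
    confusion-matrix counts of true/false positives and negatives."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, recall, specificity
```

For example, `classification_metrics(tp=8, fp=1, tn=9, fn=2)` yields an accuracy of 0.85, recall of 0.80, and specificity of 0.90 on those hypothetical counts.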
Our final object detection model results on a test sample reveal promising outcomes. The top row indicates the true location of lipohypertrophy, and the bottom row indicates where the model predicts the lipohypertrophy to be. The number on the red box indicates the model's confidence.
Our results from the YOLOv5m object detection model showcase a successful initial attempt, as shown by our precision (a). Our best F1 score (b) is around 0.78 at a confidence threshold of about 0.4109. Any higher confidence threshold causes our recall (c), which was the focus of our optimization, to drop dramatically.
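The F1 score referenced here is the harmonic mean of precision and recall, which is why sweeping the confidence threshold trades one against the other:

```python
def f1(precision, recall):
    """F1 score: the harmonic mean of precision and recall. Raising the
    confidence threshold typically raises precision but lowers recall,
    so F1 is maximized at an intermediate threshold (~0.41 here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```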
As a proof of concept, we were able to demonstrate the ability of a supervised machine learning algorithm to detect lipohypertrophy on ultrasound images using a CNN, and we were able to deploy this algorithm through a web-based application to make accurate predictions on the presence or absence of lipohypertrophy in ultrasound images obtained at the point of care. The DenseNet transfer learning architecture outperformed the other architectures tested, suggesting it would be the most appropriate choice to automate the process of detecting and locating lipohypertrophy, a common dermatological complication of insulin injections.
Prediction of masses in ultrasound images using machine learning techniques has been an ongoing effort in clinical practice for the past few decades. To assist physicians in diagnosing disease, many scholars have implemented techniques such as regression, decision trees, Naive Bayesian classifiers, and neural networks on patients’ ultrasound imaging data [
Recent research has delved into various complex image augmentation techniques to generate images [
Although our project has demonstrated in principle that machine learning can be used to detect lipohypertrophy, there are some key limitations that should be addressed before it can be used in a clinical setting. Given the small size of our data set, more images need to be incorporated into the model before it can be used to direct patient care. Moreover, even after the addition of new images, an auditing process should be developed to ensure that our machine learning model does not propagate any biases that could cause harm to specific patient populations.
Previous clinical studies of lipohypertrophy have demonstrated a high prevalence of this condition (greater than 50%). More importantly, they have demonstrated a significant burden of subclinical lesions in patients with diabetes [
CNN: convolutional neural network
This work was supported by the Allan M McGavin Foundation. The funder had no role in the production of the manuscript.
JK collected the data. EB, TB, LH, JR, and XY analyzed the data and wrote the manuscript. GM and KM designed the study and wrote the manuscript. KM takes responsibility for the contents of this paper.
None declared.