Clin Ophthalmol. 2021 Mar 8:15:1023-1039.
doi: 10.2147/OPTH.S289425. eCollection 2021.

Evaluating the Viability of a Smartphone-Based Annotation Tool for Faster and Accurate Image Labelling for Artificial Intelligence in Diabetic Retinopathy

Arvind Kumar Morya et al. Clin Ophthalmol. 2021.

Abstract

Introduction: Deep Learning (DL) and Artificial Intelligence (AI) have become widespread due to advanced technologies and the availability of digital data. Supervised learning algorithms have shown human-level or better performance and are better feature extractors and quantifiers than unsupervised learning algorithms. Building a huge dataset with good quality control requires an annotation tool with a customizable feature set. This paper evaluates the viability of an in-house annotation tool that works on a smartphone and can be used in a healthcare setting.

Methods: We developed a smartphone-based grading system to help researchers grade multiple retinal fundus images. The process consisted of designing the user interface (UI) flow based on feedback from experts. Quantitative and qualitative analyses were performed on the change in grader speed over time and on feature usage statistics. The dataset comprised approximately 16,000 images, with labels adjudicated by a minimum of 2 doctors. Results were prepared for an AI model trained on the images graded using this tool and validated on several public datasets.
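
As a rough illustration of the adjudication rule described above (a minimal sketch with hypothetical names, not the authors' implementation), an image keeps a final label only when at least two graders agree on it:

    # Minimal sketch of label adjudication (hypothetical names, not the authors' code):
    # an image receives a final label only when at least two graders agree;
    # otherwise it is queued for another review cycle.
    from collections import Counter
    from typing import List, Optional

    def adjudicate(grades: List[str], min_agreement: int = 2) -> Optional[str]:
        """Return the consensus label if at least `min_agreement` graders agree, else None."""
        if not grades:
            return None
        label, count = Counter(grades).most_common(1)[0]
        return label if count >= min_agreement else None

    # Example: two of three graders call the image referrable DR.
    print(adjudicate(["referrable DR", "referrable DR", "no referrable DR"]))  # -> "referrable DR"
    print(adjudicate(["referrable DR", "no referrable DR"]))                   # -> None (needs re-review)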

Results: We created a DL model and analysed its performance on a binary referrable DR classification task: whether a retinal image shows referrable DR or not. A total of 32 doctors used the tool for a minimum of 20 images each. Data analytics suggested significant portability and flexibility of the tool. Inter-grader variability favoured agreement on the annotated images: 550 images were used to assess agreement, with a mean agreement of 75.9%.
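
As a sketch of how a mean agreement figure like the 75.9% above can be computed (illustrative only; the paper does not publish its exact agreement formula), per-image agreement can be taken as the fraction of grader pairs assigning the same label, averaged over all images:

    # Sketch: mean pairwise agreement over a set of images
    # (illustrative only; not the authors' analysis code).
    from itertools import combinations
    from typing import List

    def image_agreement(grades: List[str]) -> float:
        """Fraction of grader pairs that gave the same label for one image."""
        pairs = list(combinations(grades, 2))
        if not pairs:
            return 1.0
        return sum(a == b for a, b in pairs) / len(pairs)

    def mean_agreement(all_grades: List[List[str]]) -> float:
        """Average per-image pairwise agreement across the dataset."""
        return sum(image_agreement(g) for g in all_grades) / len(all_grades)

    # Example with three images graded by two or three doctors each.
    grades = [["RDR", "RDR"], ["RDR", "no RDR", "RDR"], ["no RDR", "no RDR"]]
    print(f"{mean_agreement(grades):.1%}")  # 77.8%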

Conclusion: Our aim was to make annotation of medical imaging easier and to minimize the time taken for annotations without degrading quality. User feedback and feature usage statistics confirm our hypotheses that brightness and contrast variations, green-channel views and zooming add-ons are useful in correlation with certain disease types. Simulating multiple review cycles and establishing quality control can boost the accuracy of AI models even further. Although our study aims at developing an annotation tool for diagnosing and classifying diabetic retinopathy fundus images, the same concept can be used for fundus images of other ocular diseases as well as other streams of medical science, such as radiology, where image-based diagnostic applications are utilised.

Keywords: artificial intelligence; deep learning; referrable diabetic retinopathy.


Conflict of interest statement

The authors report no conflicts of interest in this work.

Figures

Figure 1
Flow diagram of the user interface.
Figure 2
Zooming at 3 levels. (A) Default view of the image spanning 512 pixels in the largest dimension. (B) Zoomed view at 768 pixels in the largest dimension. (C) Zoomed view at 1024 pixels in the largest dimension.
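
A minimal sketch of how the three fixed zoom levels in Figure 2 could be generated, assuming Pillow is used (the paper does not describe its implementation): resize the fundus image so that its largest dimension matches each _target size.

    # Sketch: produce the three zoom levels (512/768/1024 px in the largest dimension).
    # Assumes Pillow; illustrative only, not the tool's actual code.
    from typing import List, Tuple
    from PIL import Image

    def zoom_levels(path: str, _targets: Tuple[int, ...] = (512, 768, 1024)) -> List[Image.Image]:
        img = Image.open(path)
        levels = []
        for _target in _targets:
            scale = _target / max(img.size)
            size = (round(img.width * scale), round(img.height * scale))
            levels.append(img.resize(size, Image.LANCZOS))
        return levels
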
Figure 3
Color image (A) and its corresponding red-free image (B).
Figure 4
Effect of brightness modification and the green channel. (A) Difficulty in locating the fovea due to a dark macular region. (B) Easier fovea and macula localization. (C) Distinguishing arteries and veins is easier in the green channel. (D and E) Contrast change in the green channel makes it very easy to assess the optic cup and optic disc for glaucoma verification.
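
A minimal sketch of the red-free (green-channel) view and a simple linear brightness/contrast adjustment of the kind illustrated in Figures 3 and 4, assuming Pillow and NumPy (illustrative only, not the tool's actual code):

    # Sketch: green-channel (red-free) view and a linear brightness/contrast change.
    # Assumes Pillow and NumPy; illustrative only, not the tool's actual code.
    import numpy as np
    from PIL import Image

    def green_channel(img: Image.Image) -> Image.Image:
        """Return the red-free (green-channel-only) view of an RGB fundus image."""
        return img.split()[1]  # channel order is R, G, B

    def adjust(img: Image.Image, brightness: float = 0.0, contrast: float = 1.0) -> Image.Image:
        """Apply out = contrast * pixel + brightness, clipped to the 8-bit range."""
        arr = np.asarray(img, dtype=np.float32)
        arr = np.clip(contrast * arr + brightness, 0, 255)
        return Image.fromarray(arr.astype(np.uint8))
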
Figure 5
Each completed annotation triggers a call to load the next set of 3 images.
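
The behaviour in Figure 5 can be sketched as a small client-side prefetch queue (hypothetical names; the tool's actual client code is not published): each submitted annotation tops the local queue back up to 3 images.

    # Sketch of the prefetch idea in Figure 5 (hypothetical names, not the tool's code):
    # after every submitted annotation the client refills its local queue to 3 images.
    from collections import deque
    from typing import Callable, List, Optional

    class Prefetcher:
        def __init__(self, fetch_batch: Callable[[int], List[str]], batch_size: int = 3):
            self.fetch_batch = fetch_batch      # returns up to n unannotated image ids
            self.batch_size = batch_size
            self.queue = deque(fetch_batch(batch_size))

        def current(self) -> Optional[str]:
            return self.queue[0] if self.queue else None

        def on_annotation_completed(self) -> None:
            """Call after each submitted annotation: drop it and top the queue back up."""
            if self.queue:
                self.queue.popleft()
            needed = self.batch_size - len(self.queue)
            if needed > 0:
                self.queue.extend(self.fetch_batch(needed))
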
Figure 6
Break-up of the whole dataset into small, mutually exclusive chunks of 1000 images each.
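
The chunking in Figure 6 amounts to splitting the image list into fixed-size, non-overlapping pieces; a minimal sketch (illustrative only; chunk-assignment details are not published):

    # Sketch: split the full image list into mutually exclusive chunks of 1000 images.
    # Illustrative only; not the authors' code.
    from typing import List

    def make_chunks(image_ids: List[str], size: int = 1000) -> List[List[str]]:
        return [image_ids[i:i + size] for i in range(0, len(image_ids), size)]

    # ~16,000 images -> 16 chunks of 1000 images each
    chunks = make_chunks([f"img_{i:05d}" for i in range(16000)])
    print(len(chunks), len(chunks[0]))  # 16 1000
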
Figure 7
Daily progress graph.
Figure 8
Graph displaying the average total hourly annotations over 10 months, bucketed by hour of the day on the x-axis, with the number of annotations on the y-axis. The yellow line represents the daily _target assigned as per the grader's choice, ie the number of images to be graded on that day.
Figure 9
Feature usage statistics: how often the red-free image was accessed, how often brightness was changed and by how much, and the correlation of these with the overall verdict for an image.
Figure 10
Graph showing that the green-channel image was accessed at least 5% of the time by the graders; grouped by overall verdict, the green channel was used most often for unhealthy cases.
Figure 11
Verifying the stickiness of the tool by plotting the average time taken per annotation (in seconds) by some of the active grader pool for the first 100 images versus the last 100 images annotated.
Figure 12
Top 15 signs and diseases by frequency.

