TY - JOUR
T1 - Validation of a candidate instrument to assess image quality in digital mammography using ROC analysis
AU - Boita, Joana
AU - van Engen, Ruben E
AU - Mackenzie, Alistair
AU - Tingberg, Anders
AU - Bosmans, Hilde
AU - Bolejko, Anetta
AU - Zackrisson, Sophia
AU - Wallis, Matthew G
AU - Ikeda, Debra M
AU - van Ongeval, Chantal
AU - Pijnappel, Ruud
AU - Broeders, Mireille
AU - Sechopoulos, Ioannis
N1 - Funding Information:
Alistair Mackenzie was funded as part of the OPTIMAM2 project and is supported by Cancer Research UK (grant number: C30682/A17321).
Funding Information:
The VISUAL group, which includes all the observers who participated in the project of which this study is a part: F. Jansen, L. Duijm, H. de Bruin, A. Bluekens, I. Andersson, C. Behmer, E. Johansson, K. Rönnow, K. Taylor, F. Kilburn-Toppin, P. Moyle, M. van Goethem, R. Prevos, A. van Steen, N. Salem, S. Pal, E. Rosen, H. Lelivelt, K. Michielsen, L. Cockmartin, N. Phelan, P. Baldelli, S. Schopphoven. The authors thank the Medical Physics Department, Royal Surrey NHS Foundation Trust, for the use of mammograms from the OPTIMAM Mammography Image Database funded by Cancer Research UK (C30682/A28396), Sander van Woudenberg for his help with the image processing, and Gustav Hellgren for his help with the setup of the observer studies performed by the Swedish readers.
Publisher Copyright:
© 2021 The Author(s)
PY - 2021/6
Y1 - 2021/6
AB - PURPOSE: To validate a candidate instrument, to be used by different professionals to assess image quality in digital mammography (DM), against detection performance results. METHODS: A receiver operating characteristics (ROC) study was conducted to assess the detection performance in DM images with four different image quality levels due to different quality issues. Fourteen expert breast radiologists from five countries assessed a set of 80 DM cases, containing 60 lesions (40 cancers, 20 benign findings) and 20 normal cases. A visual grading analysis (VGA) study using a previously-described candidate instrument was conducted to evaluate a subset of 25 of the images used in the ROC study. Eight radiologists that had participated in the ROC study, and seven expert breast-imaging physicists, evaluated this subset. The VGA score (VGAS) and the ROC and visual grading characteristics (VGC) areas under the curve (AUCROC and AUCVGC) were compared. RESULTS: No large differences in image quality among the four levels were detected by either ROC or VGA studies. However, the ranking of the four levels was consistent: level 1 (partial AUCROC: 0.070, VGAS: 6.77) performed better than levels 2 (0.066, 6.15), 3 (0.061, 5.82), and 4 (0.062, 5.37). Similarity between radiologists' and physicists' assessments was found (average VGAS difference of 10%). CONCLUSIONS: The results from the candidate instrument were found to correlate with those from ROC analysis, when used by either observer group. Therefore, it may be used by different professionals, such as radiologists, radiographers, and physicists, to assess clinically-relevant image quality variations in DM.
KW - Digital mammography
KW - Image quality evaluation
KW - Receiver operating characteristic
KW - Type testing
KW - Validation study
KW - Visual grading analysis
UR - http://www.scopus.com/inward/record.url?scp=85103594392&partnerID=8YFLogxK
U2 - 10.1016/j.ejrad.2021.109686
DO - 10.1016/j.ejrad.2021.109686
M3 - Article
C2 - 33819803
SN - 0720-048X
VL - 139
JO - European Journal of Radiology
JF - European Journal of Radiology
M1 - 109686
ER -