TY - JOUR
T1 - International Interobserver Variability of Breast Density Assessment
AU - Portnow, Leah H
AU - Choridah, Lina
AU - Kardinah, Kardinah
AU - Handarini, Triwulan
AU - Pijnappel, Ruud
AU - Bluekens, Adriana M J
AU - Duijm, Lucien E M
AU - Schoub, Peter K
AU - Smilg, Pamela S
AU - Malek, Liat
AU - Leung, Jessica W T
AU - Raza, Sughra
N1 - Funding Information:
The authors would like to deeply thank statistician Camden P. Bay, PhD, formerly of the Department of Radiology, Brigham and Women's Hospital, Boston, Massachusetts, for invaluable contributions.
Publisher Copyright:
© 2023 American College of Radiology
PY - 2023/7
Y1 - 2023/7
N2 - Purpose: The aim of this study was to determine variability in visually assessed mammographic breast density categorization among radiologists practicing in Indonesia, the Netherlands, South Africa, and the United States. Methods: Two hundred consecutive 2-D full-field digital screening mammograms obtained from September to December 2017 were selected and retrospectively reviewed from four global locations, for a total of 800 mammograms. Three breast radiologists in each location (team) provided consensus density assessments of all 800 mammograms using BI-RADS® density categorization. Interreader agreement was compared using Gwet's AC2 with quadratic weighting across all four density categories and Gwet's AC1 for binary comparison of combined not dense versus dense categories. Variability of distribution among teams was calculated using the Stuart-Maxwell test of marginal homogeneity across all four categories and using the McNemar test for not dense versus dense categories. To compare readers from a particular country on their own 200 mammograms versus the other three teams, density distribution was calculated using conditional logistic regression. Results: For all 800 mammograms, interreader weighted agreement for distribution among four density categories was 0.86 (Gwet's AC2 with quadratic weighting; 95% confidence interval, 0.85-0.88), and for not dense versus dense categories, it was 0.66 (Gwet's AC1; 95% confidence interval, 0.63-0.70). Density distribution across four density categories was significantly different when teams were compared with one another and one team versus the other three teams combined (P < .001). Overall, all readers placed the largest number of mammograms in the scattered and heterogeneous categories. Conclusions: Although reader teams from four different global locations had almost perfect interreader agreement in BI-RADS density categorization, variability in density distribution across four categories remained statistically significant.
AB - Purpose: The aim of this study was to determine variability in visually assessed mammographic breast density categorization among radiologists practicing in Indonesia, the Netherlands, South Africa, and the United States. Methods: Two hundred consecutive 2-D full-field digital screening mammograms obtained from September to December 2017 were selected and retrospectively reviewed from four global locations, for a total of 800 mammograms. Three breast radiologists in each location (team) provided consensus density assessments of all 800 mammograms using BI-RADS® density categorization. Interreader agreement was compared using Gwet's AC2 with quadratic weighting across all four density categories and Gwet's AC1 for binary comparison of combined not dense versus dense categories. Variability of distribution among teams was calculated using the Stuart-Maxwell test of marginal homogeneity across all four categories and using the McNemar test for not dense versus dense categories. To compare readers from a particular country on their own 200 mammograms versus the other three teams, density distribution was calculated using conditional logistic regression. Results: For all 800 mammograms, interreader weighted agreement for distribution among four density categories was 0.86 (Gwet's AC2 with quadratic weighting; 95% confidence interval, 0.85-0.88), and for not dense versus dense categories, it was 0.66 (Gwet's AC1; 95% confidence interval, 0.63-0.70). Density distribution across four density categories was significantly different when teams were compared with one another and one team versus the other three teams combined (P < .001). Overall, all readers placed the largest number of mammograms in the scattered and heterogeneous categories. Conclusions: Although reader teams from four different global locations had almost perfect interreader agreement in BI-RADS density categorization, variability in density distribution across four categories remained statistically significant.
KW - Breast density
KW - global radiology
KW - mammography
UR - http://www.scopus.com/inward/record.url?scp=85160339113&partnerID=8YFLogxK
U2 - 10.1016/j.jacr.2023.03.010
DO - 10.1016/j.jacr.2023.03.010
M3 - Article
C2 - 37127220
SN - 1546-1440
VL - 20
SP - 671
EP - 684
JO - Journal of the American College of Radiology
JF - Journal of the American College of Radiology
IS - 7
ER -