TY - JOUR
T1 - Impact of deep learning model uncertainty on manual corrections to MRI-based auto-segmentation in prostate cancer radiotherapy
AU - Rogowski, Viktor
AU - Svalkvist, Angelica
AU - Maspero, Matteo
AU - Janssen, Tomas
AU - Maruccio, Federica Carmen
AU - Gorgisyan, Jenny
AU - Scherman, Jonas
AU - Häggström, Ida
AU - Wåhlstrand, Victor
AU - Gunnlaugsson, Adalsteinn
AU - Nilsson, Martin P
AU - Moreau, Mathieu
AU - Vass, Nándor
AU - Pettersson, Niclas
AU - Gustafsson, Christian Jamtheim
N1 - Publisher Copyright: © 2025 The Author(s). Journal of Applied Clinical Medical Physics published by Wiley Periodicals, LLC on behalf of The American Association of Physicists in Medicine.
PY - 2025/9
Y1 - 2025/9
AB - BACKGROUND: Deep learning (DL)-based organ segmentation is increasingly used in radiotherapy. While methods exist to generate voxel-wise uncertainty maps from DL-based auto-segmentation models, these maps are rarely presented to clinicians. PURPOSE: This study aimed to evaluate the impact of DL-generated uncertainty maps on experienced radiation oncologists during the manual correction of DL-based auto-segmentation for prostate radiotherapy. METHODS: Two nnUNet DL models were trained with 10-fold cross-validation on a dataset of 434 patient cases undergoing ultra-hypofractionated MRI-only radiotherapy for prostate cancer. The models performed prostate clinical target volume (CTV) and rectum segmentation. Each cross-validation model was evaluated on an independent test set of 35 patient cases. Segmentation uncertainty was calculated voxel-wise as the SoftMax standard deviation (0-0.5, n = 10) and visualized as a fixed-scale color-coded map. Four experienced oncologists were asked to: Step 1: Rate the quality of and confidence in the DL segmentations using a four- and five-point Likert scale, respectively, and edit the segmentations without access to the uncertainty map. Step 2: Repeat step 1 after at least 4 weeks, but this time with the color-coded uncertainty map available. Oncologists were asked to blend the uncertainty map with the DL segmentation and MRI volume. Segmentation edit time was recorded for both steps. In step 2, oncologists also provided free-text feedback on the benefits and drawbacks of using the uncertainty map during segmentation. A histogram analysis was performed to compare the number of voxels edited between step 1 and step 2 for different uncertainty levels (bins with 0.1 intervals). RESULTS: The DL models achieved high-quality segmentations with a mean Dice coefficient per oncologist of 0.97-0.99, calculated between edited and unedited segmentation in step 1 for the prostate CTV and rectum. While the overall quality rating for rectum segmentations decreased slightly on a group level in step 2 compared to step 1, individual responses varied. Some oncologists rated the quality higher for the prostate CTV segmentation with the uncertainty map present, while others rated it lower. Similarly, confidence ratings varied across oncologists for prostate CTV and rectum. Decreased segmentation time was recorded for three oncologists using uncertainty maps, saving 1-2 min per patient case, corresponding to a 14%-33% time reduction. Three oncologists found the uncertainty maps helpful, and one reported benefit was the ability to identify regions of interest more quickly. The histogram analysis showed fewer voxel edits in regions of low uncertainty in step 2 compared to step 1. Specifically, 50% fewer voxel edits were recorded for the uncertainty region 0.0-0.1, suggesting increased trust in the DL model's prediction in these areas. CONCLUSIONS: Presenting DL uncertainty information to experienced radiation oncologists influences their decision-making, quality perception, and confidence in the DL segmentations. Regions with low uncertainty were less likely to be edited, indicating increased reliance on the model's predictions. Additionally, uncertainty maps can improve efficiency by reducing segmentation time. DL-based segmentation uncertainty can be a valuable tool in clinical practice, enhancing the efficiency of radiotherapy planning.
KW - Algorithms
KW - Deep Learning
KW - Humans
KW - Image Processing, Computer-Assisted/methods
KW - Magnetic Resonance Imaging/methods
KW - Male
KW - Organs at Risk/radiation effects
KW - Prostatic Neoplasms/radiotherapy
KW - Radiotherapy Dosage
KW - Radiotherapy Planning, Computer-Assisted/methods
KW - Radiotherapy, Intensity-Modulated/methods
KW - Uncertainty
U2 - 10.1002/acm2.70221
DO - 10.1002/acm2.70221
M3 - Article
C2 - 40849835
SN - 1526-9914
VL - 26
JO - Journal of Applied Clinical Medical Physics
JF - Journal of Applied Clinical Medical Physics
IS - 9
M1 - e70221
ER -