Radiomic-Based Machine Learning Classifiers for HPV Status Prediction in Oropharyngeal Cancer: A Systematic Review and Meta-Analysis

  • Anna Luíza Damaceno Araújo*
  • Luiz Paulo Kowalski
  • Alan Roger Santos-Silva
  • Brendo Vinícius Rodrigues Louredo
  • Cristina Saldivia-Siracusa
  • Otávio Augusto A M de Melo
  • Deivid Cabral
  • Andrés Coca-Pelaz
  • Orlando Guntinas-Lichius
  • Remco de Bree
  • Pawel Golusinski
  • Karthik N Rao
  • Robert P Takes
  • Nabil F Saba
  • Alfio Ferlito

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

Abstract

Background: The aim of the present systematic review (SR) is to compile evidence regarding the use of radiomic-based machine learning (ML) models for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC) patients and to assess their reliability, methodological frameworks, and clinical applicability. The SR was conducted following PRISMA 2020 guidelines and registered in PROSPERO (CRD42025640065).

Methods: Using the PICOS framework, the review question was defined as follows: "Can radiomic-based ML models accurately predict HPV status in OPSCC?" Electronic databases (Cochrane, Embase, IEEE Xplore, BVS, PubMed, Scopus, Web of Science) and gray literature (arXiv, Google Scholar, and ProQuest) were searched. Retrospective cohort studies assessing radiomics for HPV prediction were included. Risk of bias (RoB) was evaluated using the Prediction model Risk Of Bias ASsessment Tool (PROBAST), and data were synthesized based on imaging modality, architecture type/learning modalities, and the presence of external validation. Meta-analysis was performed for externally validated models using MetaBayesDTA and RStudio.

Results: Twenty-four studies including 8627 patients were analyzed. Imaging modalities included computed tomography (CT), magnetic resonance imaging (MRI), contrast-enhanced computed tomography (CE-CT), and 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET). Logistic regression, random forest, eXtreme Gradient Boosting (XGBoost), and convolutional neural networks (CNNs) were commonly used. Most datasets were imbalanced, with a predominance of HPV+ cases. Only eight studies reported external validation results. AUROC values ranged from 0.59 to 0.87 in internal validation and from 0.48 to 0.91 in external validation. RoB was high in most studies, mainly due to reliance on p16-only HPV testing, insufficient events, or inadequate handling of class imbalance. Deep learning (DL) models achieved moderate performance with considerable heterogeneity (sensitivity: 0.61; specificity: 0.65). In contrast, traditional models provided higher, more consistent performance (sensitivity: 0.72; specificity: 0.77).

Conclusions: Radiomic-based ML models show potential for HPV status prediction in OPSCC, but methodological heterogeneity and a high RoB limit current clinical applicability.
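The sensitivity and specificity values reported above are computed from 2×2 confusion-matrix counts, and the note on class imbalance matters because plain accuracy can look acceptable on an HPV+-dominated dataset even when a model never identifies an HPV- case. The following is a minimal sketch of that effect, not the review's analysis pipeline; the function name and the counts are hypothetical and are not taken from any included study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute basic diagnostic metrics from 2x2 counts.

    HPV+ is treated as the positive class (an assumption for this sketch).
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical imbalanced cohort: 100 HPV+ and 30 HPV- patients.
# A trivial model that labels every patient HPV+ yields these counts.
always_positive = diagnostic_metrics(tp=100, fn=0, tn=0, fp=30)

for name, value in always_positive.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case accuracy is about 0.77 while specificity is 0 and balanced accuracy is 0.5, which is why sensitivity and specificity are reported separately rather than accuracy alone.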

Original language: English
Article number: 68
Journal: Diagnostics
Volume: 16
Issue number: 1
Publication status: Published - Jan 2026

Keywords

  • HPV
  • imaging
  • machine learning
  • oropharyngeal cancer
  • radiomics
