ReQuant: improved base modification calling by k-mer value imputation

Roy Straver, Carlo Vermeulen, Joe R Verity-Legg, Marc Pagès-Gallego, Dieter G G Stoker, Alexander van Oudenaarden, Jeroen de Ridder*

*Corresponding author for this work

Research output: Contribution to journal > Article > Academic > peer-review


Abstract

Nanopore sequencing allows identification of base modifications, such as methylation, directly from raw current data. Prevailing approaches, including deep learning (DL) methods, require training data covering all possible sequence contexts. These data can be prohibitively expensive or impossible to obtain for some modifications. Hence, research into DNA modifications focuses on the most prevalent modification in human DNA: 5mC in a CpG context. Improved generalization is required to reach the technology's full potential: calling any modification from raw current values. We developed ReQuant, an algorithm to impute full, k-mer based, modification models from limited k-mer context training data. ReQuant is highly accurate for calling modifications (CpG/GpC methylation and CpG glucosylation) in Lambda Phage R9 data when fitting on ≤25% of all possible 6-mers with a modification and extends to human R10 data. The success of our approach shows that DNA modifications have a consistent and therefore predictable effect on Nanopore current levels, suggesting that interpretable rule-based imputation in unseen contexts is possible. Our approach circumvents the need for modification-specific DL tools and enables modification calling when not all sequence contexts can be obtained, opening a vast field of biological base modification research.
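The core idea, imputing current levels for unseen modified k-mers from a limited set of observed ones, can be illustrated with a deliberately simple sketch. This is not ReQuant's algorithm, which the abstract does not specify. It assumes, purely for illustration, that a modification shifts the expected current by an amount that depends only on the position of the modified base within the k-mer. The function names, the 3-mer table, and all current values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_position_shifts(canonical, modified_seen):
    """Estimate the mean current shift per position of the modified base.

    canonical:     dict mapping k-mer -> expected current for unmodified DNA
    modified_seen: dict mapping (k-mer, position of modified base) -> observed
                   current for the modified k-mer (the limited training set)
    """
    shifts_by_pos = defaultdict(list)
    for (kmer, pos), level in modified_seen.items():
        shifts_by_pos[pos].append(level - canonical[kmer])
    return {pos: mean(vals) for pos, vals in shifts_by_pos.items()}


def impute_modified_level(canonical, shifts, kmer, pos):
    """Impute the current for an unseen modified k-mer (illustrative rule only)."""
    return canonical[kmer] + shifts[pos]


# Hypothetical toy values, not taken from any real pore model.
canonical = {"ACG": 80.0, "TCG": 90.0, "GCG": 85.0, "CGA": 100.0, "CGT": 110.0}
modified_seen = {("ACG", 1): 83.0, ("TCG", 1): 92.0, ("CGA", 0): 99.0}

shifts = fit_position_shifts(canonical, modified_seen)
imputed_gcg = impute_modified_level(canonical, shifts, "GCG", 1)  # 87.5
imputed_cgt = impute_modified_level(canonical, shifts, "CGT", 0)  # 109.0
```

Here the modified levels for "GCG" and "CGT" were never observed, yet both receive a value from the shifts fitted on the three seen contexts. A realistic model would presumably also condition on neighbouring bases rather than on position alone.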

Original language: English
Article number: gkaf323
Journal: Nucleic Acids Research
Volume: 53
Issue number: 9
Publication status: Published - 22 May 2025

Keywords

  • Algorithms
  • Bacteriophage lambda/genetics
  • CpG Islands
  • DNA Methylation
  • DNA/chemistry
  • Deep Learning
  • Humans
  • Nanopore Sequencing/methods
  • Sequence Analysis, DNA/methods
  • Software
