svMIL: Predicting the pathogenic effect of TAD boundary-disrupting somatic structural variants through multiple instance learning

Marleen M. Nieboer, Jeroen de Ridder*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Motivation: Despite the fact that structural variants (SVs) play an important role in cancer, methods to predict their effect, especially for SVs in non-coding regions, are lacking, leaving them often overlooked in the clinic. Non-coding SVs may disrupt the boundaries of Topologically Associated Domains (TADs), thereby affecting interactions between genes and regulatory elements such as enhancers. However, it is not known when such alterations are pathogenic. Although machine learning techniques are a promising solution to answer this question, representing the large number of interactions that an SV can disrupt in a single feature matrix is not trivial.

Results: We introduce svMIL: A method to predict pathogenic TAD boundary-disrupting SV effects based on multiple instance learning, which circumvents the need for a traditional feature matrix by grouping SVs into bags that can contain any number of disruptions. We demonstrate that svMIL can predict SV pathogenicity, measured through same-sample gene expression aberration, for various cancer types. In addition, our approach reveals that somatic pathogenic SVs alter different regulatory interactions than somatic non-pathogenic SVs and germline SVs.
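
The bag idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the toy values, and the mean/max pooling step are all assumptions chosen only to show how a variable number of disruptions per SV can still yield one fixed-length vector per SV.

```python
from statistics import mean

# Hypothetical toy data (not from the paper). Each SV is a "bag"; each instance
# is a feature vector describing one disrupted gene/regulatory-element pair.
# Assumed feature order: [enhancer_lost, enhancer_gained, promoter_lost].
bags = {
    "sv_1": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]],  # 3 disruptions
    "sv_2": [[0.0, 0.0, 1.0]],                                    # 1 disruption
}


def pool_bag(instances):
    """Collapse a variable-size bag into one fixed-length vector.

    Uses per-feature mean followed by per-feature max. This pooling is an
    illustrative stand-in, not the bag representation used by svMIL.
    """
    columns = list(zip(*instances))  # transpose: one tuple per feature
    return [mean(col) for col in columns] + [max(col) for col in columns]


# One vector of length 2 * n_features per SV, regardless of bag size.
pooled = {sv: pool_bag(instances) for sv, instances in bags.items()}
```

Each pooled vector has the same length however many disruptions the SV causes, so the result could be passed to any standard classifier. Which bag-level representation and classifier svMIL actually uses is described in the article itself.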

Original language: English
Pages (from-to): I692-I699
Journal: Bioinformatics
Volume: 36
Publication status: Published - 1 Dec 2020
