TY - JOUR
T1 - A multi-platform reference for somatic structural variation detection
AU - Espejo Valle-Inclan, Jose
AU - Besselink, Nicolle J M
AU - de Bruijn, Ewart
AU - Cameron, Daniel L
AU - Ebler, Jana
AU - Kutzera, Joachim
AU - van Lieshout, Stef
AU - Marschall, Tobias
AU - Nelen, Marcel
AU - Priestley, Peter
AU - Renkens, Ivo
AU - Roemer, Margaretha G M
AU - van Roosmalen, Markus J
AU - Wenger, Aaron M
AU - Ylstra, Bauke
AU - Fijneman, Remond J A
AU - Kloosterman, Wigard P
AU - Cuppen, Edwin
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2022/6/8
Y1 - 2022/6/8
AB - Accurate detection of somatic structural variation (SV) in cancer genomes remains a challenging problem. This is in part due to the lack of high-quality, gold-standard datasets that enable the benchmarking of experimental approaches and bioinformatic analysis pipelines. Here, we performed somatic SV analysis of the paired melanoma and normal lymphoblastoid COLO829 cell lines using four different sequencing technologies. Based on the evidence from multiple technologies combined with extensive experimental validation, we compiled a comprehensive set of carefully curated and validated somatic SVs, comprising all SV types. We demonstrate the utility of this resource by determining the SV detection performance as a function of tumor purity and sequence depth, highlighting the importance of assessing these parameters in cancer genomics projects. The truth somatic SV dataset as well as the underlying raw multi-platform sequencing data are freely available and are an important resource for community somatic benchmarking efforts.
KW - benchmarking
KW - cancer
KW - long sequencing read
KW - short sequencing read
KW - structural variant
KW - truth set
KW - whole-genome sequencing
UR - http://www.scopus.com/inward/record.url?scp=85136914557&partnerID=8YFLogxK
U2 - 10.1016/j.xgen.2022.100139
DO - 10.1016/j.xgen.2022.100139
M3 - Article
C2 - 36778136
SN - 2666-979X
VL - 2
JO - Cell Genomics
JF - Cell Genomics
IS - 6
M1 - 100139
ER -