TY - JOUR
T1 - Pangenome-spanning epistasis and coselection analysis via de Bruijn graphs
AU - Kuronen, Juri
AU - Horsfield, Samuel T.
AU - Pöntinen, Anna K.
AU - Mallawaarachchi, Sudaraka
AU - Arredondo-Alonso, Sergio
AU - Thorpe, Harry
AU - Gladstone, Rebecca A.
AU - Willems, Rob J.L.
AU - Bentley, Stephen D.
AU - Croucher, Nicholas J.
AU - Pensar, Johan
AU - Lees, John A.
AU - Tonkin-Hill, Gerry
AU - Corander, Jukka
N1 - Publisher Copyright: © 2024 Kuronen et al.
PY - 2024/7
Y1 - 2024/7
AB - Studies of bacterial adaptation and evolution are hampered by the difficulty of measuring traits such as virulence, drug resistance, and transmissibility in large populations. In contrast, it is now feasible to obtain high-quality complete assemblies of many bacterial genomes thanks to scalable high-accuracy long-read sequencing technologies. To exploit this opportunity, we introduce a phenotype- and alignment-free method for discovering coselected and epistatically interacting genomic variation from genome assemblies covering both core and accessory parts of genomes. Our approach uses a compact colored de Bruijn graph to approximate the intragenome distances between pairs of loci for a collection of bacterial genomes to account for the impacts of linkage disequilibrium (LD). We demonstrate the versatility of our approach to efficiently identify associations between loci linked with drug resistance and adaptation to the hospital niche in the major human bacterial pathogens Streptococcus pneumoniae and Enterococcus faecalis.
UR - http://www.scopus.com/inward/record.url?scp=85201851864&partnerID=8YFLogxK
U2 - 10.1101/gr.278485.123
DO - 10.1101/gr.278485.123
M3 - Article
C2 - 39134411
AN - SCOPUS:85201851864
SN - 1088-9051
VL - 34
SP - 1081
EP - 1088
JO - Genome Research
JF - Genome Research
IS - 7
ER -