TY - JOUR
T1 - Detecting co-selection through excess linkage disequilibrium in bacterial genomes
AU - Mallawaarachchi, Sudaraka
AU - Tonkin-Hill, Gerry
AU - Pöntinen, Anna K
AU - Calland, Jessica K
AU - Gladstone, Rebecca A
AU - Arredondo-Alonso, Sergio
AU - MacAlasdair, Neil
AU - Thorpe, Harry A
AU - Top, Janetta
AU - Sheppard, Samuel K
AU - Balding, David
AU - Croucher, Nicholas J
AU - Corander, Jukka
N1 - Publisher Copyright: © 2024 The Author(s). Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
PY - 2024/6
Y1 - 2024/6
N2 - Population genomics has revolutionized our ability to study bacterial evolution by enabling data-driven discovery of the genetic architecture of trait variation. Genome-wide association studies (GWAS) have more recently been accompanied by genome-wide epistasis and co-selection (GWES) analysis, which offers a phenotype-free approach to generating hypotheses about selective processes that simultaneously impact multiple loci across the genome. However, existing GWES methods only consider associations between distant pairs of loci within the genome due to the strong impact of linkage disequilibrium (LD) over short distances. Based on the general functional organisation of genomes, it is nevertheless expected that the majority of co-selection and epistasis will act within relatively short genomic proximity, on co-variation occurring within genes and their promoter regions, and within operons. Here, we introduce LDWeaver, which enables an exhaustive GWES across both short- and long-range LD, to disentangle likely neutral co-variation from selection. We demonstrate the ability of LDWeaver to efficiently generate hypotheses about co-selection using large genomic surveys of multiple major human bacterial pathogen species and validate several findings using functional annotation and phenotypic measurements. Our approach will facilitate the study of bacterial evolution in the light of rapidly expanding population genomic data.
UR - http://www.scopus.com/inward/record.url?scp=85195606628&partnerID=8YFLogxK
U2 - 10.1093/nargab/lqae061
DO - 10.1093/nargab/lqae061
M3 - Article
C2 - 38846349
SN - 2631-9268
VL - 6
JO - NAR Genomics and Bioinformatics
JF - NAR Genomics and Bioinformatics
IS - 2
M1 - lqae061
ER -