TY - JOUR
T1 - Recovering Escherichia coli plasmids in the absence of long-read sequencing data
AU - Paganini, Julian A.
AU - Plantinga, Nienke L.
AU - Arredondo-Alonso, Sergio
AU - Willems, Rob J.L.
AU - Schürch, Anita C.
N1 - Funding Information:
Funding: S.A.‐A. was supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska‐Curie program (grant number 801133).
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/7/28
Y1 - 2021/7/28
N2 - The incidence of infections caused by multidrug-resistant E. coli strains has risen in the past years. Antibiotic resistance in E. coli is often mediated by acquisition and maintenance of plasmids. The study of E. coli plasmid epidemiology and genomics often requires long-read sequencing information, but recently a number of tools that allow plasmid prediction from short-read data have been developed. Here, we reviewed 25 available plasmid prediction tools and categorized them into binary plasmid/chromosome classification tools and plasmid reconstruction tools. We benchmarked six tools (MOB-suite, plasmidSPAdes, gplas, FishingForPlasmids, HyAsP and SCAPP) that aim to reliably reconstruct distinct plasmids, with a special focus on plasmids carrying antibiotic resistance genes (ARGs) such as extended-spectrum beta-lactamase genes. We found that two thirds (n = 425, 66.3%) of all plasmids were correctly reconstructed by at least one of the six tools, with a range of 92 (14.58%) to 317 (50.23%) correctly predicted plasmids. However, the majority of plasmids that carried antibiotic resistance genes (n = 85, 57.8%) could not be completely recovered as distinct plasmids by any of the tools. MOB-suite was the only tool that was able to correctly reconstruct the majority of plasmids (n = 317, 50.23%), and performed best at reconstructing large plasmids (n = 166, 46.37%) and ARG-plasmids (n = 41, 27.9%), but predictions frequently contained chromosome contamination (40%). In contrast, plasmidSPAdes reconstructed the highest fraction of plasmids smaller than 18 kbp (n = 168, 61.54%). Large ARG-plasmids, however, were frequently merged with sequences derived from distinct replicons. Available bioinformatic tools can provide valuable insight into E. coli plasmids, but also have important limitations. This work will serve as a guideline for selecting the most appropriate plasmid reconstruction tool for studies focusing on E. coli plasmids in the absence of long-read sequencing data.
KW - Antibiotic resistance
KW - Bioinformatics
KW - Escherichia coli
KW - Plasmids
KW - WGS
UR - http://www.scopus.com/inward/record.url?scp=85111253595&partnerID=8YFLogxK
U2 - 10.3390/microorganisms9081613
DO - 10.3390/microorganisms9081613
M3 - Article
C2 - 34442692
AN - SCOPUS:85111253595
SN - 2076-2607
VL - 9
SP - 1
EP - 20
JO - Microorganisms
JF - Microorganisms
IS - 8
M1 - 1613
ER -