TY - JOUR
T1 - Assembly of viral genomes from metagenomes
AU - Smits, Saskia L.
AU - Bodewes, Rogier
AU - Ruiz-Gonzalez, Aritz
AU - Baumgärtner, Wolfgang
AU - Koopmans, Marion P.
AU - Osterhaus, Albert D.M.E.
AU - Schürch, Anita
N1 - Publisher Copyright: © 2014 Smits, Bodewes, Ruiz-Gonzalez, Baumgärtner, Koopmans, Osterhaus and Schürch.
PY - 2014
Y1 - 2014
N2 - Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used not only to detect novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered; instead, assembly yields several distinct contigs derived from a single entity, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with relatively large genomes that were divergent from previously known viruses of the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near-complete viral genomes.
KW - Assembly
KW - Metagenome
KW - Pathogen
KW - Viral metagenomics
KW - Virome
KW - Virus
KW - Virus discovery
UR - http://www.scopus.com/inward/record.url?scp=84920678673&partnerID=8YFLogxK
U2 - 10.3389/fmicb.2014.00714
DO - 10.3389/fmicb.2014.00714
M3 - Article
AN - SCOPUS:84920678673
SN - 1664-302X
VL - 5
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
IS - DEC
M1 - 714
ER -