Figure 1 | GigaScience

Figure 1

From: VirAmp: a galaxy-based viral genome assembly pipeline

VirAmp pipeline overview. The diagram illustrates the progression of the VirAmp pipeline. A) First, we perform quality trimming of the raw data, then reduce extremely high-coverage data (top trace, red) to a reasonable depth and even out the coverage variation (bottom trace, blue; usually to ~100x). B) Next, a multi-step semi-de novo strategy is applied for core assembly: (I) a de novo assembler is run multiple times using different k-mer sizes, to assemble the short sequence reads into a set of long contigs; (II) contigs from the different k-mer sets are oriented by aligning them to the reference genome and are then connected into scaffolds based on the pairwise alignment. C) Data from the spacing of paired-end reads are used to extend the contigs, potentially closing gaps and/or joining contigs into larger scaffolds. D) Multiple tools are implemented for assembly evaluation and analysis of variation. These include basic assembly statistics, comparison of the new assembly to a reference genome, and identification of SNPs and repeats.
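The coverage reduction in step A can be illustrated with a minimal sketch. The caption does not say which algorithm VirAmp uses, so the code below assumes a median k-mer count filter in the style of digital normalization: a read is kept only while the k-mers it contains are still under-represented among the reads already kept. The function names, the default `k`, and the `target` parameter are hypothetical and chosen for illustration only; reverse complements and sequencing errors are ignored.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_coverage(reads, k=5, target=100):
    """Keep a read only if its median k-mer count is still below `target`.

    Illustrative sketch, not VirAmp's implementation: k-mer counts are
    taken over the reads kept so far, so regions with very deep coverage
    stop accepting reads once they reach roughly `target`x, while
    low-coverage regions keep all of their reads.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            # Reads shorter than k carry no k-mers and are dropped.
            continue
        if median(counts[km] for km in read_kmers) < target:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Toy input: one locus sequenced 500 times, another sequenced once.
deep = ["ACGTTGCAAG"] * 500
rare = ["TTTTTCCCCC"]
kept = normalize_coverage(deep + rare, k=5, target=100)
```

In this toy input the over-sequenced read is capped at the target depth while the single rare read survives, which is the flattening effect the red and blue traces in panel A depict.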
