Fig. 2 | GigaScience

Fig. 2

From: Tentacle: distributed quantification of genes in metagenomes

Overview of the data analysis pipeline executed on each worker node. Each worker transfers compressed data files from the distributed file system (DFS), pre-processes them (for example, FASTQ quality filtering/trimming or removal of human sequences using Bowtie 2), maps the reads, and calculates counts/coverage. Note that ambiguously mapped reads are not depicted in this figure; the user can choose whether to keep all of the mapped reads or only the best match. Worker nodes fetch their data independently of the master process, thus minimizing the risk of data-transfer bottlenecks. Tentacle can be instructed to retrieve mapping results from worker nodes after mapping is completed. Counts/coverage calculations can be disabled, which, when combined with retrieval of mapping results, effectively transforms Tentacle into a parallel mapping framework.
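
The last two worker steps in the caption (choosing between all mapped reads and the best match only, then computing counts/coverage) can be illustrated with a minimal Python sketch. This is not Tentacle's code or API: the `Hit` record, the functions `select_hits` and `count_and_coverage`, the `best_only` flag, and the definition of coverage as mapped bases divided by reference length are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One read-to-reference alignment (hypothetical record layout)."""
    read_id: str
    ref_id: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    score: float  # higher is better


def select_hits(hits: Iterable[Hit], best_only: bool) -> List[Hit]:
    """Keep every hit, or only the highest-scoring hit per read."""
    if not best_only:
        return list(hits)
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.read_id)
        if current is None or hit.score > current.score:
            best[hit.read_id] = hit
    return list(best.values())


def count_and_coverage(
    hits: Iterable[Hit],
    ref_lengths: Dict[str, int],
    best_only: bool = True,
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Return per-reference read counts and average coverage.

    Coverage here is assumed to be mapped bases / reference length.
    """
    counts = {ref: 0 for ref in ref_lengths}
    mapped_bases = {ref: 0 for ref in ref_lengths}
    for hit in select_hits(hits, best_only):
        counts[hit.ref_id] += 1
        mapped_bases[hit.ref_id] += hit.end - hit.start
    coverage = {ref: mapped_bases[ref] / ref_lengths[ref] for ref in ref_lengths}
    return counts, coverage


if __name__ == "__main__":
    example_hits = [
        Hit("r1", "geneA", 0, 50, 40.0),
        Hit("r1", "geneB", 10, 60, 30.0),  # ambiguous second hit for r1
        Hit("r2", "geneA", 50, 100, 42.0),
    ]
    lengths = {"geneA": 100, "geneB": 200}
    print(count_and_coverage(example_hits, lengths, best_only=True))
    print(count_and_coverage(example_hits, lengths, best_only=False))
```

With `best_only=True`, the ambiguous read `r1` contributes only to `geneA`; with `best_only=False`, it is counted against both references, which is the trade-off the caption describes.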
