Figure 1From: A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing dataScaling relations for different stages of the pipeline executed on the HPC node. Good data partitioning in the manifest file yields a linearscaling for the preprocessing stage. However, the SNP calling stage does not scale linearly, which is the main reason of the sub-optimal load balancing implemented in Crossbow.Back to article page