Skip to main content


Springer Nature is making SARS-CoV-2 and COVID-19 research free. View research | View latest news | Sign up for updates

Figure 4 | GigaScience

Figure 4

From: A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data

Figure 4

The ratio of the F Hadoop /F HPC as a function of the reciprocal dataset size in Gb. The pipelines were run on the Hadoop I and II clusters, as well as a 16 core HPC node. The analytical curve f(x)=(a 1 x+b 1)/(a 2 x+b 2) was used to fit the data for the stretches of linear scaling of calculation time on the HPC platform. The outliers are marked with crossed symbols.

Back to article page