Table 1 Datasets used in the comparison

From: A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data

Dataset  Organism                                                             Size in Gb
I        A. thaliana                                                          1.4
S        A. thaliana (artificial dataset created using the Samtools package)  100.0