Table 2 Running time and worst-case complexity for various glocal map-to-sequence aligners

From: OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis

Algorithm           Time complexity       Space complexity          Running time (Drosophila)  Running time (human)
OPTIMA              O((mc) δ^3 #seeds)    O((mc)^2 + c n)           54 min                     36 days
Gentig v.2 (d)      O(#it m δ^3 #hashes)  O(m^2 + n + |HashTable|)  1.32 h                     75 days
Gentig v.2 (tp)                                                     1.85 h                     174 days
SOMA v.2 (v)        O(m^2 n^2)            O(m n)                    1.28 years                 1,067 years
Likelihood (d+a)    O(m n δ^2)            O(m n)                    22.22 h                    2.72 years
Likelihood (d+a+t)                                                  19.62 h                    2.38 years
Likelihood (p+a+t)                                                  41.73 h                    5.53 years
  1. Running times are estimated from 2100 maps and extrapolated to the full datasets (82,000 Drosophila maps and 2.1 million human maps, for 100× coverage; single-core computation on Intel x86 64-bit Linux workstations with 16 GB RAM). The best column-wise running times, both achieved by OPTIMA, are reported in bold. Note that including the permutation-based statistical tests for SOMA and the likelihood method would increase their runtime by a factor of more than 100. The complexity analysis refers to map-to-sequence glocal alignment per map, where n is the total length of the in silico maps (approximately 500,000 fragments for the human genome); m is the length of the experimental map in fragments (17 fragments on average); #seeds, c (default of two) and δ are as defined in the “Methods” section; and #it (number of iterations), #hashes (geometric hashes found to match) and |HashTable| are as specified in [17, 24].
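
The extrapolation mentioned in the footnote can be sketched as a simple linear scaling. This is a minimal illustration, assuming that running time grows in proportion to the number of maps, which is consistent with a per-map complexity but is not confirmed as the authors' exact procedure. The function names are hypothetical. The only figures taken from the table are OPTIMA's running times (54 min and 36 days), the 2100-map sample size and the full dataset sizes.

```python
# Sketch only: linear scaling in the number of maps is an assumption,
# not the authors' documented extrapolation procedure.

SAMPLE_MAPS = 2100            # maps actually timed (from the footnote)
DROSOPHILA_MAPS = 82_000      # full Drosophila dataset (from the footnote)
HUMAN_MAPS = 2_100_000        # full human dataset (from the footnote)


def extrapolate_runtime(sample_seconds, sample_maps, total_maps):
    """Scale a runtime measured on a sample linearly to the full dataset."""
    return sample_seconds * total_maps / sample_maps


def implied_sample_seconds(total_seconds, sample_maps, total_maps):
    """Invert the scaling: sample runtime implied by a reported full runtime."""
    return total_seconds * sample_maps / total_maps


# OPTIMA's reported (extrapolated) running times, converted to seconds.
optima_drosophila_s = 54 * 60
optima_human_s = 36 * 24 * 3600

if __name__ == "__main__":
    d = implied_sample_seconds(optima_drosophila_s, SAMPLE_MAPS, DROSOPHILA_MAPS)
    h = implied_sample_seconds(optima_human_s, SAMPLE_MAPS, HUMAN_MAPS)
    print(f"Implied time on 2100 Drosophila maps: {d:.1f} s")
    print(f"Implied time on 2100 human maps: {h:.1f} s")
```

Under this assumption, the reported OPTIMA figures correspond to roughly 83 s for the 2100 Drosophila maps and roughly 3110 s for the 2100 human maps. Calling `extrapolate_runtime` on those values recovers the tabulated 54 min and 36 days.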