
Table 2 Running time and worst-case complexity for various glocal map-to-sequence aligners

From: OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis

| Algorithm | Time complexity | Space complexity | Running time (Drosophila) | Running time (Human) |
|---|---|---|---|---|
| OPTIMA | O((mc) δ³ #seeds) | O((mc)² + c n) | **54 min** | **36 days** |
| Gentig v.2 (d) | O(#it m δ³ #hashes) | O(m² + n + \|HashTable\|) | 1.32 h | 75 days |
| Gentig v.2 (tp) | | | 1.85 h | 174 days |
| SOMA v.2 (v) | O(m² n²) | O(m n) | 1.28 years | 1,067 years |
| Likelihood (d+a) | O(m n δ²) | O(m n) | 22.22 h | 2.72 years |
| Likelihood (d+a+t) | | | 19.62 h | 2.38 years |
| Likelihood (p+a+t) | | | 41.73 h | 5.53 years |

  1. Running times are estimated from 2,100 maps and extrapolated to the full datasets (82,000 Drosophila maps and 2.1 million human maps, for 100× coverage; single-core computation on Intel x86 64-bit Linux workstations with 16 GB RAM). The best column-wise running times are reported in bold. Note that including the permutation-based statistical tests for SOMA and the likelihood method would increase their running time by a factor of more than 100. The complexity analysis refers to map-to-sequence glocal alignment per map, where n is the total length of the in silico maps (~500,000 fragments for the human genome); m is the length of the experimental map in fragments (17 fragments on average); #seeds, c (default of two) and δ are as defined in the “Methods” section; and #it (number of iterations), #hashes (geometric hashes found to match) and |HashTable| are as specified in [17, 24].
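
The extrapolation described in the note can be illustrated with a short calculation. The sketch below assumes that running time scales linearly with the number of maps, which the note implies but does not state explicitly. The function names are hypothetical and are not taken from the OPTIMA software.

```python
# Minimal sketch: linear extrapolation of running time from a sample of maps.
# Assumption (not stated in the source): time scales linearly with map count.

SAMPLE_MAPS = 2_100
DATASET_MAPS = {"Drosophila": 82_000, "Human": 2_100_000}


def extrapolate(sample_time, dataset_maps, sample_maps=SAMPLE_MAPS):
    """Scale a running time measured on the sample up to the full dataset."""
    return sample_time * dataset_maps / sample_maps


def implied_sample_time(full_time, dataset_maps, sample_maps=SAMPLE_MAPS):
    """Invert the extrapolation: sample time implied by a full-dataset estimate."""
    return full_time * sample_maps / dataset_maps


# Scale factor for the human dataset: 2.1 million maps / 2,100 maps = 1000.
human_scale = DATASET_MAPS["Human"] / SAMPLE_MAPS

# OPTIMA's 36-day human estimate, expressed in minutes, maps back to the
# time the 2,100-map sample would have taken under the linear assumption.
optima_human_full_min = 36 * 24 * 60
optima_human_sample_min = implied_sample_time(
    optima_human_full_min, DATASET_MAPS["Human"]
)

print(f"human scale factor: {human_scale:.0f}")
print(f"implied OPTIMA sample time: {optima_human_sample_min:.2f} min")
```

Under this assumption, every human running time in the table is 1,000 times the time measured on the 2,100-map sample, so differences of a few minutes on the sample translate into days on the full dataset.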