| Algorithm | Time complexity | Space complexity | Running time (Drosophila) | Running time (Human) |
|---|---|---|---|---|
| OPTIMA | \(O((m-c)\,\delta^{3}\,\#\text{seeds})\) | \(O((m-c)^{2} + c\,n)\) | **54 min** | **36 days** |
| Gentig v.2 (d) | \(O(\#\text{it}\; m\,\delta^{3}\,\#\text{hashes})\) | \(O(m^{2} + n + \lvert\text{HashTable}\rvert)\) | 1.32 h | 75 days |
| Gentig v.2 (tp) | | | 1.85 h | 174 days |
| SOMA v.2 (v) | \(O(m^{2}\,n^{2})\) | \(O(m\,n)\) | 1.28 years | 1,067 years |
| Likelihood (d+a) | \(O(m\,n\,\delta^{2})\) | \(O(m\,n)\) | 22.22 h | 2.72 years |
| Likelihood (d+a+t) | | | 19.62 h | 2.38 years |
| Likelihood (p+a+t) | | | 41.73 h | 5.53 years |
- Running times are estimated from 2,100 maps and extrapolated to the full datasets (82,000 Drosophila maps and 2.1 million human maps, for 100× coverage; single-core computation on Intel x86 64-bit Linux workstations with 16 GB RAM). The best column-wise running times are reported in bold. Note that including the permutation-based statistical tests for SOMA and the likelihood method would increase their runtime by a factor of more than 100. The complexity analysis refers to map-to-sequence glocal alignment per map, where n is the total length of the in silico maps (\(\sim\)500,000 fragments for the human genome), m ≪ n is the length of the experimental map in fragments (17 fragments on average), #seeds, c (default of two) and δ are as defined in the “Methods” section, and #it (number of iterations), #hashes (geometric hashes found to match) and |HashTable| are as specified in [17, 24].
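
The extrapolation described in the note can be illustrated with a minimal sketch. It assumes that runtime scales linearly with the number of maps, which the text does not state explicitly. The function name `extrapolate_runtime` and the 60-second sample measurement are hypothetical, chosen only for illustration; the map counts are those given in the note.

```python
def extrapolate_runtime(sample_seconds: float, sample_maps: int, total_maps: int) -> float:
    """Scale a runtime measured on a sample of maps to a full dataset.

    Assumption (not confirmed by the source): cost is linear in the
    number of maps, so the per-map time of the sample carries over.
    """
    if sample_maps <= 0:
        raise ValueError("sample_maps must be positive")
    return sample_seconds * total_maps / sample_maps


# Dataset sizes taken from the table note.
SAMPLE_MAPS = 2_100
DROSOPHILA_MAPS = 82_000
HUMAN_MAPS = 2_100_000

# Hypothetical measurement: 60 s to align the 2,100-map sample.
hypothetical_sample_seconds = 60.0

for name, total in (("Drosophila", DROSOPHILA_MAPS), ("Human", HUMAN_MAPS)):
    seconds = extrapolate_runtime(hypothetical_sample_seconds, SAMPLE_MAPS, total)
    print(f"{name}: scale factor {total / SAMPLE_MAPS:.2f}, "
          f"estimated {seconds / 3600:.2f} h")
```

Under this assumption the human dataset is exactly 1,000 times the sample size, so any per-sample measurement is multiplied by 1,000. A per-map cost that also depends on the reference size n would need to be measured separately for each genome.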