OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis

Table 2 Running time and worst-case complexity for various glocal map-to-sequence aligners

Running times are estimated from 2100 maps and extrapolated to the full datasets (82,000 Drosophila maps and 2.1 million human maps, for 100× coverage; single-core computation on Intel x86 64-bit Linux workstations with 16 GB RAM). The best running time in each column is shown in bold. Note that including the permutation-based statistical tests for SOMA and the likelihood method would increase their running times by a factor of more than 100. The complexity analysis refers to map-to-sequence glocal alignment per map, where n is the total length of the in silico maps (approximately 500,000 fragments for the human genome), m ≪ n is the length of the experimental map in fragments (17 fragments on average), #seeds, c (default of two) and δ are as defined in the “Methods” section, and #it (number of iterations), #hashes (number of geometric hashes found to match) and |HashTable| are as specified in [17, 24].
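The extrapolation described in the caption can be illustrated with a minimal sketch. It assumes that running time scales linearly with the number of maps, which is consistent with the per-map complexity analysis but is not stated explicitly in the caption. The function name `extrapolate_runtime` and the measured time of 60 seconds are hypothetical and are not values from Table 2; only the map counts come from the caption.

```python
def extrapolate_runtime(measured_seconds, sample_maps, total_maps):
    """Scale a runtime measured on a sample of maps to a full dataset.

    Assumes cost grows linearly with the number of maps (an assumption,
    not a statement from the paper).
    """
    if sample_maps <= 0:
        raise ValueError("sample_maps must be positive")
    return measured_seconds * (total_maps / sample_maps)


# Map counts taken from the caption.
SAMPLE_MAPS = 2100
DROSOPHILA_MAPS = 82_000
HUMAN_MAPS = 2_100_000

# Hypothetical measurement on the 2100-map sample; NOT a value from Table 2.
measured = 60.0

drosophila_estimate = extrapolate_runtime(measured, SAMPLE_MAPS, DROSOPHILA_MAPS)
human_estimate = extrapolate_runtime(measured, SAMPLE_MAPS, HUMAN_MAPS)

print(f"Drosophila estimate: {drosophila_estimate / 3600:.2f} h")
print(f"Human estimate: {human_estimate / 3600:.2f} h")
```

Under this assumption the human dataset is exactly 1000 times larger than the sample, so any per-sample measurement is multiplied by 1000; the Drosophila dataset is roughly 39 times larger.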

ISSN: 2047-217X