Improving the ostrich genome assembly using optical mapping data
© Zhang et al.; licensee BioMed Central. 2015
Received: 22 January 2015
Accepted: 19 April 2015
Published: 12 May 2015
The ostrich (Struthio camelus) is the tallest and heaviest living bird. Ostrich meat is considered a healthy red meat, with an annual worldwide production ranging from 12,000 to 15,000 tons. As part of the avian phylogenomics project, we sequenced the ostrich genome for phylogenetic and comparative genomics analyses. The initial Illumina-based assembly of this genome had a scaffold N50 of 3.59 Mb and a total size of 1.23 Gb. Since longer scaffolds are critical for many genomic analyses, particularly for chromosome-level comparative analysis, we generated optical mapping (OM) data to obtain an improved assembly. The OM technique is a non-PCR-based method to generate genome-wide restriction enzyme maps, which improves the quality of de novo genome assembly.
To generate OM data, we digested the ostrich genome with KpnI, which yielded 1.99 million DNA molecules (>250 kb) covering the genome at least 500×. The restriction patterns of these molecules were then assembled and aligned to the Illumina-based assembly to extend its scaffolds. This resulted in an OM assembly with a scaffold N50 of 17.71 Mb, about 5 times that of the initial assembly. The number of scaffolds covering 90% of the genome was reduced from 414 to 75, an average of ~3 super-scaffolds per chromosome. By integrating the OM data with previously published FISH (fluorescence in situ hybridization) markers, we recovered the full PAR (pseudoautosomal region) on the ostrich Z chromosome in 4 super-scaffolds, as well as most of the degenerated regions.
The OM data significantly improved the assembled scaffolds of the ostrich genome and facilitated chromosome evolution studies in birds. Similar strategies can be applied to other genome sequencing projects to obtain better assemblies.
Keywords: Ostrich, Optical mapping, Genome assembly
The advent of next-generation sequencing (NGS) technologies (e.g. Illumina HiSeq, SOLiD, 454 FLX) has facilitated new genome sequencing projects. However, the short reads produced by NGS limit the ability of de novo assembly to span repeat-rich or highly heterozygous regions, and therefore to produce long scaffolds. Without long scaffolds, it is difficult or impossible to conduct some downstream analyses, such as chromosomal rearrangement analysis. One effective method for extending scaffolds is optical mapping (OM), which estimates the gap length between scaffolds and merges them into much longer sequences without introducing new bases.
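The gap-estimation idea can be illustrated with a minimal sketch. It assumes each scaffold has already been aligned to an optical map contig, giving start and end coordinates in map space. The `Placement` type, its field names, and `estimate_gaps` are hypothetical and do not reflect the software used in this study.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical record: where one scaffold aligns on an optical map contig."""
    name: str
    map_start: int  # first aligned position on the optical map (bp)
    map_end: int    # last aligned position on the optical map (bp)


def estimate_gaps(placements):
    """Order scaffolds along the map and estimate the gap between neighbours.

    The gap is the unaligned map distance between the end of one scaffold
    and the start of the next; overlaps are clamped to zero.
    """
    ordered = sorted(placements, key=lambda p: p.map_start)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        gap = max(0, right.map_start - left.map_end)
        gaps.append((left.name, right.name, gap))
    return gaps


if __name__ == "__main__":
    demo = [
        Placement("scaffold_B", 1500, 3000),
        Placement("scaffold_A", 0, 1000),
        Placement("scaffold_C", 3200, 4000),
    ]
    for left, right, gap in estimate_gaps(demo):
        print(f"{left} -> {right}: estimated gap {gap} bp")
```

In a super-scaffold, each estimated gap would typically be represented by a run of unknown bases of that length, which is consistent with joining scaffolds without introducing new sequence.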
The flightless ostrich (Struthio camelus) is the tallest and heaviest living bird. It is the only member of the family Struthionidae, which is the basal extant member of Palaeognathae. Ostrich meat is considered healthy due to its high polyunsaturated fatty acid content, low saturated fatty acid content, and low cholesterol level. The worldwide production of ostrich meat is around 12,000 to 15,000 tons per year. Due to this bird’s biological and agricultural importance, the avian phylogenomics project sequenced the ostrich genome for phylogenetic and comparative genomics analyses. Because the ostrich is an important species for avian chromosome evolution analysis [5,6], we generated OM data to help improve the assembly.
Table: Restriction enzymes evaluated for compatibility with the ostrich genome. Columns: Usable % 5-20 kb; Usable % 6-12 kb; Usable % 6-15 kb; # Frags >100 kb; Avg. frag. size (kb); Max. frag. size (kb).
All work done in this project followed the guidelines and protocols for research on animals and had the necessary permits and authorization. High molecular weight genomic DNA was extracted from a blood sample collected from a male ostrich in the Kunming Zoo of China. The DNA was then transferred to OpGen, Inc. for collection of single molecule restriction maps (SMRMs) on the Argus® Whole Genome Mapping System. The average size of the digested molecules was ~282 kb, which was determined to be sufficient. To further confirm the enzyme compatibility and performance, 3 MapCards were run to examine the average fragment size, the results of which were consistent with the expected outcome.
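Enzyme compatibility of the kind described above can be explored with an in silico digest of the draft assembly. The following is a minimal sketch and not the vendor's procedure. It assumes the KpnI recognition sequence GGTACC with the cut after the fifth base, and it interprets "usable %" as the share of total sequence length falling in fragments within a size window. That interpretation and the function names `digest` and `summarize` are assumptions made for illustration.

```python
import re

KPNI_SITE = "GGTACC"  # KpnI recognition sequence (cuts GGTAC^C)


def digest(sequence, site=KPNI_SITE, cut_offset=5):
    """Return fragment lengths (bp) after cutting at every occurrence of `site`."""
    seq = sequence.upper()
    # Lookahead so that every site start is reported, even for overlapping sites.
    pattern = re.compile(f"(?={re.escape(site)})")
    cuts = [m.start() + cut_offset for m in pattern.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


def summarize(fragments, lo=5_000, hi=20_000, large=100_000):
    """Summarize a digest; 'usable_pct' is an assumed definition (see lead-in)."""
    total = sum(fragments)
    usable = sum(f for f in fragments if lo <= f <= hi)
    return {
        "usable_pct": 100.0 * usable / total,
        "n_large": sum(1 for f in fragments if f > large),
        "avg_kb": total / len(fragments) / 1000.0,
        "max_kb": max(fragments) / 1000.0,
    }


if __name__ == "__main__":
    toy = "AAAGGTACCTTTT"
    print(digest(toy))  # fragment lengths for a toy sequence
    print(summarize([10_000, 150_000, 2_000]))
```

Running `summarize` with different `lo` and `hi` windows (for example 6-12 kb or 6-15 kb) gives one row of a comparison across candidate enzymes.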
Table: Summary of SMRM data (maps of >250 kb). Fields: Number of molecules; Average molecule size; Minimum molecule size; Average fragment size.
Summary of assemblies
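Assembly summaries of this kind usually report the scaffold N50 and the number of scaffolds needed to cover a given fraction of the assembly. The sketch below shows one common way to compute both from a list of scaffold lengths. The function name `nx` is hypothetical, and the fraction is taken relative to the total assembled length, which is an assumption about how "90% of the genome" is measured.

```python
def nx(lengths, fraction=0.5):
    """Return (length, count) for the given fraction of total assembly length.

    `length` is the size of the scaffold at which the cumulative sum of
    scaffolds, sorted from longest to shortest, first reaches `fraction`
    of the total (N50 when fraction=0.5). `count` is how many scaffolds
    were needed to reach it.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # Only reachable if fraction > 1.
    raise ValueError("fraction must be between 0 and 1")


if __name__ == "__main__":
    toy_lengths = [10, 8, 6, 4, 2]
    n50, n50_count = nx(toy_lengths, 0.5)
    n90, n90_count = nx(toy_lengths, 0.9)
    print(f"N50 = {n50} (reached with {n50_count} scaffolds)")
    print(f"N90 = {n90} (reached with {n90_count} scaffolds)")
```

Applied to real scaffold lengths before and after OM scaffolding, the `count` value at `fraction=0.9` corresponds to the kind of figure quoted for the reduction from 414 to 75 scaffolds.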
In conclusion, the OM data generated in this study improved the ostrich assembly and facilitated a comparative analysis at the chromosome level. The improved assembly can be used for future genomic studies, especially those requiring long scaffolds. Furthermore, these data can be used for future development of OM software tools.
Availability of supporting data
The data files presented in this Data Note are available in the GigaScience repository, GigaDB. Raw sequencing data are also available from the SRA [SRP028745].
SMRM: Single molecule restriction map
FISH: Fluorescence in situ hybridization
We thank Danqing Mao for performing the DNA extraction and Qiumei Zheng for arranging the sample delivery.
- Neely RK, Deen J, Hofkens J. Optical mapping of DNA: single-molecule-based methods for mapping genomes. Biopolymers. 2011;95:298–311.
- Medina FX, Aguilar A. Ostrich meat: nutritional, breeding, and consumption aspects. The Case of Spain. J Food Nutr Res. 2014;2:301–5.
- Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science. 2014;346:1320–31.
- Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, et al. Comparative genomics reveals insights into avian genome evolution and adaptation. Science. 2014;346:1311–20.
- Romanov MN, Farre M, Lithgow PE, Fowler KE, Skinner BM, O’Connor R, et al. Reconstruction of gross avian genome structure, organization and evolution suggests that the chicken lineage most closely resembles the dinosaur avian ancestor. BMC Genomics. 2014;15:1060.
- Nishida-Umehara C, Tsuda Y, Ishijima J, Ando J, Fujiwara A, Matsuda Y, et al. The molecular basis of chromosome orthologies and sex chromosomal differentiation in palaeognathous birds. Chromosome Res. 2007;15:721–34.
- Tsuda Y, Nishida-Umehara C, Ishijima J, Yamada K, Matsuda Y. Comparison of the Z and W sex chromosomal architectures in elegant crested tinamou (Eudromia elegans) and ostrich (Struthio camelus) and the process of sex chromosome differentiation in palaeognathous birds. Chromosoma. 2007;116:159–73.
- Zhou Q, Zhang J, Bachtrog D, An N, Huang Q, Jarvis ED, et al. Complex evolutionary trajectories of sex chromosomes across bird taxa. Science. 2014;346:1246338.
- Wang Z, Zhang J, Yang W, An N, Zhang P, Zhang G, et al. Temporal genomic evolution of bird sex chromosomes. BMC Evol Biol. 2014;14:250.
- Zhang G, Li B, Li C, Gilbert MTP, Ryder O, Jarvis ED, et al. Genomic data of the Ostrich (Struthio camelus australis). GigaScience Database. 2014. http://dx.doi.org/10.5524/101013
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.