A genome draft of the legless anguid lizard, Ophisaurus gracilis
© Song et al.; licensee BioMed Central. 2015
Received: 25 December 2014
Accepted: 24 March 2015
Published: 9 April 2015
Transition from a lizard-like to a snake-like body form is one of the most important transformations in reptilian evolution. The increasing number of sequenced reptilian genomes is enabling a deeper understanding of vertebrate evolution, although the genetic basis of the loss of limbs in reptiles remains enigmatic. Here we report genome sequencing, assembly, and annotation for the Asian glass lizard Ophisaurus gracilis, a limbless lizard species with an elongated snake-like body form. Addition of this species to the genome repository will provide an excellent resource for studying the genetic basis of limb loss and trunk elongation.
O. gracilis genome sequencing using the Illumina HiSeq2000 platform resulted in 274.20 Gbp of raw data that was filtered and assembled to a final size of 1.78 Gbp, comprising 6,717 scaffolds with N50 = 1.27 Mbp. Based on the k-mer estimated genome size of 1.71 Gbp, the assembly appears to be nearly 100% complete. A total of 19,513 protein-coding genes were predicted, and 884.06 Mbp of repeat sequences (approximately half of the genome) were annotated. The draft genome of O. gracilis has similar characteristics to both lizard and snake genomes.
We report the first genome of a lizard from the family Anguidae, O. gracilis. This supplements currently available genetic and genomic resources for amniote vertebrates, representing a major increase in comparative genome data available for squamate reptiles in particular.
Keywords: Lizard genome; Anguidae; Squamate reptiles; Limblessness
Ophisaurus gracilis genomic DNA was extracted from the tail of a single male lizard collected from the Tibetan Plateau and used to construct seven paired-end Illumina libraries with insert sizes ranging from 180 bp to 20 kbp. To construct the small-insert libraries (180, 500, and 800 bp), DNA was sheared to the target size range using a Covaris S2 (Covaris, Woburn, MA, USA) and ligated to adaptors. For the long-insert libraries (2, 5, 10, and 20 kbp), DNA was fragmented using a Hydroshear system (Digilab, Marlborough, MA, USA). Sheared fragments were biotin labelled at the ends and fragments of the desired size were gel purified. A second round of fragmentation was then conducted before adaptor ligation.
All libraries were sequenced on an Illumina HiSeq2000 Genome Analyzer (Illumina, San Diego, CA, USA), with 100 bp reads for the short-insert libraries (180–800 bp) and 90 bp reads for the long-insert libraries (2–20 kbp). A total of 274.20 Gbp of raw data was generated, from which 147.08 Gbp of ‘clean’ data was obtained after removal of duplicates, contaminated reads (reads with adaptor sequences), low-quality reads (those in which more than 60% of bases for short-insert libraries, or more than 80% of bases for long-insert libraries, had Solexa quality scores (Phred64) below 7), and reads with more than 10% ‘N’ bases.
The O. gracilis genome size was estimated to be approximately 1.71 Gbp using a k-mer-based approach. Based on this estimate, the clean data correspond to approximately 86-fold coverage of the O. gracilis genome. Genome assembly from the high-quality reads (contig and scaffold construction) and gap closure were performed using the SOAPdenovo package with default parameters, except that the k-mer size was set to 63. The final assembly had a total length of 1.78 Gbp, comprising 6,715 scaffolds assembled from 135,863 contigs, with the longest scaffold being 6.68 Mbp. The N50 sizes for contigs and scaffolds were 23.41 kbp and 1.27 Mbp, respectively.
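The read-filtering thresholds, the fold-coverage figure, and the N50 statistic described above can be illustrated with a minimal Python sketch. This is not the pipeline used for the assembly: the function names (keep_read, fold_coverage, n50) are hypothetical, duplicate and adaptor removal are omitted, and the sketch assumes that quality strings use an ASCII offset of 64 (Phred64).

```python
def low_quality_fraction(qual, offset=64, min_q=7):
    """Fraction of bases whose quality score (ASCII code minus offset) is below min_q."""
    if not qual:
        return 0.0
    return sum((ord(c) - offset) < min_q for c in qual) / len(qual)


def keep_read(seq, qual, max_low_q_frac, max_n_frac=0.10):
    """Return True if a read passes the quality and 'N' content thresholds.

    max_low_q_frac is 0.60 for short-insert and 0.80 for long-insert
    libraries, following the thresholds stated in the text.
    """
    if not seq:
        return False  # assumption: empty reads are discarded
    n_frac = seq.upper().count("N") / len(seq)
    return n_frac <= max_n_frac and low_quality_fraction(qual) <= max_low_q_frac


def fold_coverage(clean_bases, genome_size):
    """Sequencing depth: total clean bases divided by the estimated genome size."""
    return clean_bases / genome_size


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 requires at least one sequence length")


if __name__ == "__main__":
    # 147.08 Gbp of clean data over an estimated 1.71 Gbp genome
    print(f"{fold_coverage(147.08e9, 1.71e9):.1f}-fold coverage")
    # Toy scaffold lengths, for illustration only
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

With the values reported here, fold_coverage gives roughly 86, matching the stated coverage. The n50 function applies equally to contig and scaffold lengths.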
Given the genome size estimate of 1.71 Gbp, the 1.78 Gbp final assembly probably covers the complete genome, although its total length is probably a slight overestimate due to possible overlaps between some of the scaffolds and/or misassembly of some heterozygous alleles. Completeness of the assembly was confirmed by the successful mapping of up to 97% of reads from the short-insert libraries. Collectively, these data indicate that almost complete O. gracilis genome coverage was obtained.
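The relationship between assembly length and the k-mer-based estimate can be made explicit with a short calculation. The sketch below uses only the two sizes reported in the text; the variable and function names are illustrative.

```python
ASSEMBLY_GBP = 1.78   # total length of the final assembly
ESTIMATED_GBP = 1.71  # k-mer-based genome size estimate


def assembly_to_estimate_ratio(assembly_gbp, estimated_gbp):
    """Assembly length expressed as a fraction of the estimated genome size."""
    return assembly_gbp / estimated_gbp


ratio = assembly_to_estimate_ratio(ASSEMBLY_GBP, ESTIMATED_GBP)
excess_mbp = (ASSEMBLY_GBP - ESTIMATED_GBP) * 1000

print(f"assembly / estimate = {ratio:.1%}")
print(f"excess over estimate = {excess_mbp:.0f} Mbp")
```

A ratio slightly above 100% is consistent with the interpretation that the excess reflects scaffold overlaps or heterozygous alleles assembled separately, rather than additional genomic sequence.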
Table: Global statistics of the O. gracilis genome (reported fields include scaffold N50 (Mb), average gene length (bp), average intron number, average intron length (bp), and average exon length (bp))
Table: Summary of mobile element types (reported as percentage of genome (%))
In summary, we report the first annotated anguid lizard genome sequence assembly, which supplements the existing amniote genome resources in which squamate reptile sequences are sparsely represented. Despite the distant phylogenetic relationship, the morphology of the Asian glass lizard O. gracilis is highly convergent with that of snakes, including the lack of limbs and an elongated body. We therefore expect the genome of this species to be particularly useful for future comparative genomic analyses to identify the molecular basis of limb loss and body form evolution in squamate reptiles, and in vertebrates in general.
Availability of supporting data
Supporting data are available in the GigaScience repository, GigaDB, and raw data are available in the Sequence Read Archive under accession SRP052050.
Abbreviations
EST: Expressed sequence tag
LINE: Long interspersed elements
LTR: Long terminal repeat
SINE: Short interspersed elements
This work was supported by grants from the Strategic Priority Research Program (B) (XDB13020200).
- Li R, Fan W, Tian G, Zhu H, He L, Cai J, et al. The sequence and de novo assembly of the giant panda genome. Nature. 2010;463:311–7.
- Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience. 2012;1:36–41.
- Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(2):ii215–25.
- Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protocol Bioinform. 2009;4:11–4.
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110:462–7.
- Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.
- Smit A, Hubley R. RepeatModeler-1.0.5. Institute for Systems Biology. 2012. http://www.repeatmasker.org/RepeatModeler.html. [Accessed]
- Alföldi J, Di Palma F, Grabherr M, Williams C, Kong L, Mauceli E, et al. The genome of the green anole lizard and a comparative analysis with birds and mammals. Nature. 2011;477:587–91.
- Castoe TA, De Koning AP, Hall KT, Card DC, Schield DR, Fujita MK, et al. The Burmese python genome reveals the molecular basis for extreme adaptation in snakes. Proc Natl Acad Sci U S A. 2013;110:20645–50.
- Vonk FJ, Casewell NR, Henkel CV, Heimberg AM, Jansen HJ, McCleary RJ, et al. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system. Proc Natl Acad Sci U S A. 2013;110:20651–6.
- Pyron RA, Burbrink FT, Wiens JJ. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol Biol. 2013;13:93.
- Song, B; Cheng, S; Sun, Y; Zhong, X; Jin, J; Guan, R; Murphy, RW; Che, J; Zhang, Y; Liu, X. (2015): Anguidae lizard (Ophisaurus gracilis) genome assembly data. GigaScience Database. http://doi.org/10.5524/100119
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.