High-coverage sequencing and annotated assembly of the genome of the Australian dragon lizard Pogona vitticeps
- Arthur Georges†1,
- Qiye Li†2, 3,
- Jinmin Lian2,
- Denis O’Meally1,
- Janine Deakin1,
- Zongji Wang2, 4,
- Pei Zhang2,
- Matthew Fujita7,
- Hardip R. Patel6,
- Clare E. Holleley1,
- Yang Zhou2,
- Xiuwen Zhang1,
- Kazumi Matsubara1,
- Paul Waters5,
- Jennifer A. Marshall Graves1, 8,
- Stephen D. Sarre1 and
- Guojie Zhang2, 9
© Georges et al. 2015
Received: 20 June 2015
Accepted: 14 September 2015
Published: 28 September 2015
The lizards of the family Agamidae are one of the most prominent elements of the Australian reptile fauna. Here, we present a genomic resource built on the basis of a wild-caught male ZZ central bearded dragon Pogona vitticeps.
The genomic sequence for P. vitticeps, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (179X raw read depth) from 13 insert libraries ranging from 250 bp to 40 kbp. After filtering for low-quality and duplicated reads, 146 Gbp of data (83X) was available for assembly. Exceptionally high levels of heterozygosity (0.85 % of single nucleotide polymorphisms plus sequence insertions or deletions) complicated assembly; nevertheless, 96.4 % of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. Length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp. N50 was 2.29 Mbp. Genes were annotated on the basis of de novo prediction, similarity to the green anole Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and P. vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63 % of which had intact open reading frames. Our assembly captured 99 % (246 of 248) of core CEGMA genes, with 93 % (231) being complete.
The quality of the P. vitticeps assembly is comparable or superior to that of other published squamate genomes, and the annotated P. vitticeps genome can be accessed through a genome browser available at https://genomics.canberra.edu.au.
Keywords: Pogona vitticeps, Dragon lizard, Central bearded dragon, Agamidae, Squamata, Next-generation sequencing
The central bearded dragon, Pogona vitticeps, is widespread through the arid and semi-arid regions of eastern central Australia. This lizard adapts readily to captivity, lays large clutches of eggs several times per season, and is kept as a favoured pet species in Europe, Asia and North America. The karyotype of P. vitticeps is typical of most Australian agamids, consisting of six pairs of macrochromosomes and ten pairs of microchromosomes (2n = 32). The sex determining mechanism is one of female heterogamety (ZZ/ZW) and the sex chromosomes are a pair of microchromosomes. Sex determination, a primary driver for our interest in generating this genome sequence, is complex in this species, involving an interaction between the influences of incubation environment and the ZZ/ZW genotype [3, 4].
Samples and sequencing
Summary of sequencing data derived from paired-end sequencing of 13 insert libraries using an Illumina HiSeq 2000 platform
Columns: Insert size (bp); Number of libraries; Raw data: Read length (bp), Raw data (Gbp), Average read depth (X), Physical coverage (X); Filtered data: Read length (bp), Filtered data (Gbp), Average read depth (X), Physical coverage (X)
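As a rough consistency check on the depth figures quoted in the abstract (317 Gbp at 179X raw, 146 Gbp at 83X after filtering), the sketch below assumes that average read depth is simply total sequenced bases divided by genome size, and solves for the genome size each pair of numbers implies. The function name is illustrative and is not taken from the authors' pipeline.

```python
def implied_genome_size_gbp(data_gbp: float, depth_x: float) -> float:
    """Genome size implied by depth = sequenced bases / genome size."""
    return data_gbp / depth_x


# Figures quoted in the abstract of this article.
raw = implied_genome_size_gbp(317, 179)       # raw reads
filtered = implied_genome_size_gbp(146, 83)   # reads kept after filtering

print(f"raw: {raw:.2f} Gbp, filtered: {filtered:.2f} Gbp")
```

Both values come out close to the 1.8 Gbp assembly length reported in the abstract, which suggests the depth figures were computed against a genome size estimate of roughly that magnitude.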
Statistics for the assembly contigs and scaffolds (after gap filling)
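The N50 statistic used for contigs and scaffolds can be computed as in the minimal sketch below, which follows the definition given in the abbreviations of this article (half of the assembled sequence lies in contigs or scaffolds of at least this length). The function name and the toy lengths are illustrative only; they are not the authors' code or data.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example: total is 20, and the two longest sequences (6 + 5) pass 10.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

Because N50 depends on the total length, filtering out short scaffolds before computing it (for example, keeping only those longer than 300 bp) can change the result slightly.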
Reads from small-insert libraries that satisfied our filtering criteria were aligned to the assembly using the Burrows-Wheeler Aligner (BWA, version 0.5.9-R16), allowing for eight mismatches and one indel. Of the total number of reads (797.4 M), 96.4 % could be mapped back to the assembled genome and they covered 98.4 % of the assembly excluding gaps. Bases in the assembled scaffolds had, on average, reads mapped with 55X read depth. These data suggest that we have assembled most of the P. vitticeps genome. In addition, we used the CEGMA package (version 2.4) to map 248 core eukaryotic genes to our P. vitticeps assembly. Our assembly captured 99 % (246 of 248) of the core CEGMA genes, with 93 % (231) being complete. This is a higher capture rate than that estimated for the green anole Anolis carolinensis assembly (AnoCar2.0), which captured 93.6 % (232) of the core genes, with 85.9 % (213) being complete.
Number of predicted genes with RNA-seq signals
Column: Specimen ID (tissue ID)
The statistics for repeats in the P. vitticeps genome annotated by different methods
Columns: Total repeat length (bp); Percentage of genome
Methods listed: Tandem Repeats Finder
Breakdown of repeat content of the Pogona vitticeps genome derived from RepeatMasker analysis
Column: % of genome
We combined homology-based, de novo and transcriptome-based methods to predict gene content of the assembly. In the homology-based prediction, the assembly was annotated by generating reference sets of A. carolinensis, Gallus gallus and Homo sapiens proteins, and aligning the reference sets to the assembly using TBLASTN (version 2.2.23; E-value ≤ 1 × 10⁻⁵). The resultant homologous genome sequences were then aligned against matching proteins using Genewise (version wise2-2-0) to define gene models. In the de novo prediction, we randomly selected 1,000 genes with intact open reading frames (ORFs) as predicted by the homology-based approach to train the Augustus gene prediction tool (version 2.5.5) with the parameters appropriate to P. vitticeps. The de novo gene prediction was then performed with Augustus applied to the genome after repeat sequences were masked as described above. In the transcriptome-based approach, we mapped transcriptome reads to the assembly using TopHat (version 1.3.1), which can align reads across splice junctions. These mapped reads were assembled into transcripts using Cufflinks (version 1.3.0) and then merged across samples (n = 7, Table 3) into a single transcriptome annotation using the Cuffmerge option.
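The step of randomly selecting genes with intact ORFs as a training set could be sketched as follows. The article does not state its exact criteria for an intact ORF, so this example assumes a common working definition: the coding sequence starts with ATG, ends with a stop codon, has a length divisible by three, and contains no in-frame internal stop. The function names, the seed and the dictionary input format are hypothetical.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_intact_orf(cds: str) -> bool:
    """Assumed definition: ATG start, terminal stop, no internal in-frame stop."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )


def pick_training_set(models: dict, n: int = 1000, seed: int = 0) -> list:
    """Randomly choose up to n gene IDs whose coding sequence has an intact ORF."""
    intact = sorted(gene for gene, cds in models.items() if has_intact_orf(cds))
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(intact, min(n, len(intact)))


# Toy gene models: only "g1" satisfies the assumed criteria.
toy_models = {"g1": "ATGAAATAA", "g2": "ATGTAAAAATAA", "g3": "ATGAAA"}
print(pick_training_set(toy_models, n=2))
```

Sorting the candidate IDs before sampling keeps the selection independent of dictionary ordering, so the same seed gives the same training set on repeated runs.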
Characteristics of predicted protein-coding genes in the Pogona vitticeps assembly and comparison with Anolis carolinensis, Gallus gallus and Homo sapiens
Columns: Single exon gene; Gene length (bp); mRNA length (bp); Exons per gene; Exon length (bp); Intron length (bp)
Methods listed: De novo (Augustus)
To assign gene names to each predicted protein-coding locus, we mapped the 19,406 genes to an Ensembl library collated from A. carolinensis, chicken G. gallus, human H. sapiens, western clawed frog Xenopus tropicalis and zebrafish Danio rerio. The name associated with the best hit for each P. vitticeps gene was assigned to each of 19,083 genes. Most of these genes (16,510) mapped to a homolog even at high stringency (>80 % of protein length aligned).
Bacterial artificial chromosome library
A large-insert genomic DNA bacterial artificial chromosome (BAC) library was constructed from DNA from a wild-caught female dragon lizard (TC1542) confirmed to have the ZW genotype using sex-linked PCR markers [3, 4] and cytologically. The library is estimated to represent 6.3× genome coverage, and comprises 92,160 clones with an average insert size of 120 kbp. This resource is commercially available through Amplicon Express (Pullman, WA, USA; http://ampliconexpress.com).
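The library coverage figure can be cross-checked with simple arithmetic. The sketch below assumes that coverage equals the number of clones multiplied by the mean insert size, divided by the genome size; the variable names are illustrative.

```python
clones = 92_160            # number of BAC clones in the library
mean_insert_bp = 120_000   # average insert size (120 kbp)
coverage = 6.3             # reported genome coverage of the library

# Total cloned sequence, and the genome size this coverage implies.
cloned_bp = clones * mean_insert_bp
implied_genome_gbp = cloned_bp / coverage / 1e9

print(f"cloned: {cloned_bp / 1e9:.2f} Gbp, implied genome: {implied_genome_gbp:.2f} Gbp")
```

The implied genome size is close to the 1.8 Gbp assembly length, so the library coverage and the sequencing depth figures appear to rest on a similar genome size estimate.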
Anchoring sequences to chromosomes
Our previously published cytogenetic map of P. vitticeps consisted of 87 BACs that were mapped to the macrochromosomes (64 BACs) and microchromosomes (23 BACs). We mapped an additional 80 BACs, extending the set to 125 markers on macrochromosomes and 42 on microchromosomes. Sequence scaffolds were anchored to chromosomes by 52 loci, contained in the BACs, that are conserved in homologous syntenic blocks across amniotes (A. carolinensis, G. gallus, H. sapiens). Using gene synteny information, 37.9 % (670 Mbp) of the sequenced genome has been assigned to chromosomes (Deakin et al., unpublished data).
Sex chromosome sequences
The sex of P. vitticeps is determined by a combination of chromosomal constitution and influence of environmental temperature on the developing embryo. P. vitticeps has female heterogamety (with ZZ male and ZW female individuals), and the Z and W chromosomes are among the ten pairs of microchromosomes. Sex chromosome heteromorphy is evident by C-banding, but the degree of differentiation of the Z and W chromosomes is slight. The sex chromosomes of P. vitticeps are not homologous to the sex chromosomes of chicken (G. gallus) or other reptiles so far examined. The ZZ genotype is reversed to a female phenotype at high incubation temperatures [3, 4].
Our laboratory has previously identified a sex-linked sequence using amplified fragment length polymorphism screening and genome walking [4, 23]. Five contiguous BAC clones containing sex-linked markers that map to the sex chromosome pair were sequenced to reveal 352 kbp of P. vitticeps sex chromosome sequence. This region contained five protein-coding genes (oprd1, rcc1, znf91, znf131 and znf180) and several major families of repetitive sequences (long terminal repeat [LTR] and non-LTR retrotransposons, including chicken repeat 1 [CR1] and bovine B LINEs [Bov-B LINEs]) [1, 24].
More recently, we amplified micro-dissected W-chromosome fragments to yield many sex chromosome sequence tags that were reciprocally mapped to their Z homologs (Matsubara et al., unpublished data). All putative sex chromosome scaffolds were confirmed to co-localize with the known ZW-BAC Pv3-L07 when physically mapped (Deakin et al., unpublished data). In this way we identified 12.8 Mbp of the Z chromosome (on three scaffolds) and increased the number of confirmed sex chromosome genes to 240 (Deakin et al., unpublished data).
GC content and isochore structure
Comparison of mean GC content for available tetrapod genomes
Rows listed: P. vitticeps microchromosomes; P. vitticeps macrochromosomes; P. vitticeps Z chromosome
The macrochromosomes of P. vitticeps are largely devoid of variation in GC content at small (5 kbp) spatial scales. In fact, P. vitticeps macrochromosomes are more uniform in terms of GC distribution than is the A. carolinensis genome (standard deviation 0.029 versus 0.032 respectively; Table 7). With the exception of the Z microchromosome, P. vitticeps microchromosomes possess a heterogeneous distribution of GC-rich sequences over 5 kbp windows (Fig. 4a). In this regard, P. vitticeps microchromosomes resemble those of birds but differ markedly from those of A. carolinensis, whose GC content more closely resembles that of the macrochromosomes. Intriguingly, the Z microchromosome of P. vitticeps has an average GC content comparable to that of coding sequences and short interspersed nuclear elements (SINEs) (Fig. 4b), which suggests that this chromosome may be enriched in these GC-rich components of the genome. However, with the exception of LTR transposable elements, all components (CDS, introns, tandem repeats and transposable elements) showed greater GC content if they resided on the Z chromosome than elsewhere (Z chromosome > microchromosomes > macrochromosomes; Fig. 4c), suggesting that there are other, as yet unidentified, reasons for the observed variation in GC content across different chromosome classes.
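The window-based measure of GC heterogeneity described above can be illustrated with a minimal sketch. It assumes non-overlapping windows, ignores ambiguous bases such as N, and uses the population standard deviation; the article does not specify these details, so they are assumptions, as are the function names.

```python
from statistics import pstdev


def gc_fraction(seq: str):
    """GC fraction over unambiguous bases, or None if the window has none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else None


def windowed_gc(seq: str, window: int = 5000) -> list:
    """GC fraction in consecutive non-overlapping windows of the given size."""
    values = []
    for start in range(0, len(seq) - window + 1, window):
        fraction = gc_fraction(seq[start:start + window])
        if fraction is not None:  # skip windows made up entirely of gaps
            values.append(fraction)
    return values


def gc_heterogeneity(seq: str, window: int = 5000) -> float:
    """Standard deviation of windowed GC content (needs at least one window)."""
    return pstdev(windowed_gc(seq, window))


# Toy sequence with a window of 4 bp instead of 5 kbp.
toy = "GGCC" + "AATT" + "GCAT"
print(windowed_gc(toy, window=4), round(gc_heterogeneity(toy, window=4), 3))
```

A lower standard deviation indicates a more uniform GC distribution, which is the sense in which the macrochromosomes are described as more uniform than the A. carolinensis genome.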
When variation in GC distribution is considered over larger spatial scales (tens to hundreds of kbp, Fig. 3), the P. vitticeps macrochromosomes are similar to the A. carolinensis genome, which lacks substantial variation in GC composition, a striking departure from isochore patterns seen in mammals and birds. The Z chromosome, too, lacks substantial heterogeneity over larger spatial scales, which perhaps reflects the uniform distribution of repetitive elements along its length. Only the autosomal microchromosomes of P. vitticeps bear any similarity in GC distribution to the other sauropsid genomes examined. The P. vitticeps genome, therefore, has compositional patterns distinct from those of A. carolinensis, which indicates that different processes have shaped the genomes of the two lizards since they shared a common ancestor 144 million years ago.
Comparison with other assemblies
Comparison of sequencing platform, assembler, and assembly statistics for the reptiles for which a genome sequence is available
Species listed: Chinese softshell turtle; Green sea turtle; Western painted turtle
Assemblers listed: CLC NGS Cell (version 2011); AllPaths (version R41313); Celera Assembler (version 5.4); Arachne (version 3.0.0)
Sequencing platforms listed: Illumina HiSeq 2000; Illumina GAIIx & HiSeq 2000, Roche 454; Illumina GAII & HiSeq 2000; Sanger, Roche 454; Roche 454, Illumina, Sanger
Statistics reported: Average read depth; Genome size (Gbp); Total bases in contigs (excluding unknown bases, Ns); Total bases in scaffolds; No. of scaffolds (>100 bp); N50 scaffold (kbp); No. of contigs (>100 bp); N50 contig (kbp); No. protein-coding genes
The quality of the P. vitticeps assembly is comparable to that of other published squamate genomes. This genome assembly, coupled with the availability of a BAC library and the development of a high-density physical map for each chromosome, provides an unparalleled resource for accelerating research on sex determination, major histocompatibility complex evolution, and the evolution of adaptive traits in squamates to complement the advances brought about by the sequencing of the A. carolinensis genome.
Availability of supporting data
SoapDeNovo2 pvi1.1.Jan2013 genome assembly (ENA accession number ERZ094017)
Trinity de novo transcriptome assembly (ENA accession number ERZ097159)
Peptide and coding sequences for the pvi1.1.Jan2013 assembly
Gene annotations and repeat annotations for the scaffolds
Sequence Read Archive accession numbers for all sequencing runs.
The annotated P. vitticeps genome sequence can be accessed through a publicly available genome browser.
BAC: Bacterial artificial chromosome
ENA: European Nucleotide Archive
indel: Sequence insertion or deletion
k-mer: Short sequence of length k
LINE: Long interspersed nuclear element
LTR: Long terminal repeat
N50: 50 % of the genome sequence is contained in contigs (or scaffolds) equal to or greater than this length
ORF: Open reading frame
SINE: Short interspersed nuclear element
SNP: Single nucleotide polymorphism
We are indebted to the BGI-Shenzhen, China, for its contribution to the sequencing of the P. vitticeps genome and subsequent bioinformatics work. The Institute for Applied Ecology at the University of Canberra, Australia provided access to resources and facilities. The Institute, the Faculty of Applied Science and the Office of the PVC Research, University of Canberra, and China National GeneBank-Shenzhen contributed funding in support of this project. Tariq Ezaz provided advice and supervision of the cytogenetic work undertaken by Kazumi Matsubara.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Young MJ, O’Meally D, Sarre SD, Georges A, Ezaz T. Molecular cytogenetic map of the central bearded dragon Pogona vitticeps (Squamata: Agamidae). Chromosom Res. 2013;21:361–74.
- Ezaz T, Quinn AE, Miura I, Sarre SD, Georges A, Graves JAM. The dragon lizard Pogona vitticeps has ZZ/ZW micro-sex chromosomes. Chromosom Res. 2005;13:763–76.
- Holleley CE, O’Meally D, Sarre SD, Graves JAM, Ezaz T, Matsubara K, et al. Sex reversal triggers the rapid transition from genetic to temperature dependent sex. Nature. 2015;523:79–82.
- Quinn AE, Georges A, Sarre SD, Guarino F, Ezaz T, Graves JAM. Temperature sex reversal implies sex gene dosage in a reptile. Science. 2007;316:411.
- MacCulloch RD, Upton DE, Murphy RW. Trends in nuclear DNA content among amphibians and reptiles. Comp Biochem Physiol. 1996;113B:601–5.
- Doležel J, Bartoš J, Voglmayr H, Greilhuber J. Nuclear DNA content and genome size of trout and human. Cytometry. 2003;51A:127–8.
- Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–70.
- Li R, Fan W, Tian G, Zhu H, He L, Cai J, et al. The sequence and de novo assembly of the giant panda genome. Nature. 2010;463:311–7.
- Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010;20:265–72.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
- Parra G, Bradnam K, Korf I. CEGMA: A pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007;23:1061–7.
- Grabherr M, Haas B, Yassour M, Levin J, Thompson D, Amit I, et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110:462–7.
- Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0 1996–2010. Institute for Systems Biology, Seattle, WA, USA. 1996. http://www.repeatmasker.org. Accessed 20 Dec 2014.
- Smit AFA, Hubley R. RepeatModeler Open-1.0. 2008–2015. Institute for Systems Biology, Seattle, WA, USA. 2008. http://www.repeatmasker.org. Accessed 20 Dec 2014.
- Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–8.
- Benson G. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.
- Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. 2004;14:988–95.
- Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19 Suppl 2:215–25.
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–11.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28:511–5.
- Ezaz T, Moritz B, Waters PD, Graves JAM, Georges A, Sarre SD. The ZW sex microchromosomes of an Australian dragon lizard share no homology with those of other reptiles or birds. Chromosom Res. 2009;17:965–73.
- Quinn AE, Ezaz T, Sarre SD, Graves JAM, Georges A. Extension, single-locus conversion and physical mapping of sex chromosome sequences identify the Z microchromosome and pseudo-autosomal region in a dragon lizard, Pogona vitticeps. Heredity. 2010;104:410–7.
- Ezaz T, Azad B, O’Meally D, Young MJ, Matsubara K, Edwards MJ, et al. Sequence and gene content of a large fragment of a lizard sex chromosome and evaluation of candidate sex differentiating gene R-spondin1. BMC Genomics. 2013;14:899.
- Alfoldi J, di Palma F, Grabherr M, Williams C, Kong L, Mauceli E, et al. The genome of the green anole lizard and a comparative analysis with birds and mammals. Nature. 2011;477:587–91.
- Castoe T, de Koning A, Hall K, Card D, Schield D, Fujita M, et al. The Burmese python genome reveals the molecular basis for extreme adaptation in snakes. Proc Natl Acad Sci U S A. 2013;110:20645–50.
- Vonk F, Casewell N, Henkel C, Heimberg A, Jansen H, McCleary R, et al. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system. Proc Natl Acad Sci U S A. 2013;110:20651–6.
- Shaffer H, Minx P, Warren D, Shedlock A, Thomson R, Valenzuela N, et al. The western painted turtle genome, a model for the evolution of extreme physiological adaptations in a slowly evolving lineage. Genome Biol. 2013;14:R28.
- Wang Z, Pascual-Anaya J, Zadissa A, Li W, Niimura Y, Huang Z, et al. The draft genomes of soft-shell turtle and green sea turtle yield insights into the development and evolution of the turtle-specific body plan. Nat Genet. 2013;45:701–6.
- St John J, Braun E, Isberg S, Miles L, Chong A, Gongora J, et al. Sequencing three crocodilian genomes to illuminate the evolution of archosaurs and amniotes. Genome Biol. 2012;13:415.
- Kirkness EF, Bafna V, Halpern AL, Levy S, Remington K, Rusch DB, et al. The dog genome: survey sequencing and comparative analysis. Science. 2003;301:1898–903.
- Hellsten U, Harland RM, Gilchrist MJ, Hendrix D, Jurka J, Kapitonov V, et al. The genome of the Western clawed frog Xenopus tropicalis. Science. 2010;328:633–6.
- Fujita MK, Edwards SV, Ponting CP. The Anolis lizard genome: An amniote genome without isochores. Genome Biol Evol. 2010;3:974–84.
- Georges A, Li Q, Lian J, O’Meally D, Deakin J, Wang Z, et al. Genome of the Australian dragon lizard Pogona vitticeps. 2015. GigaScience Database. http://gigadb.org/dataset/100166.
- Georges A, O’Meally D, Genomics@UC. The Pogona vitticeps genome browser (pvi1.1 Jan 2013). Institute for Applied Ecology. Canberra: University of Canberra; 2015. https://genomics.canberra.edu.au/. Accessed 1 Sep 2015.