A locally funded Puerto Rican parrot (Amazona vittata) genome sequencing project increases avian data and advances young researcher education
© Oleksyk et al.; licensee BioMed Central Ltd. 2012
Received: 14 November 2011
Accepted: 14 September 2012
Published: 28 September 2012
Amazona vittata is a critically endangered bird endemic to Puerto Rico, the only surviving native parrot species in United States territory, and the first parrot in the large Neotropical genus Amazona to be studied on a genomic scale.
In a unique community-funded project, DNA from an A. vittata female was sequenced on an Illumina HiSeq platform, yielding a total of ~42.5 billion nucleotide bases. This provided approximately 26.89x average coverage depth at the completion of this funding phase. Filtering followed by assembly resulted in 259,423 contigs (N50 = 6,983 bp, longest = 75,003 bp), which were further scaffolded into 148,255 fragments (N50 = 19,470 bp, longest = 206,462 bp). This provided ~76% coverage of the genome based on an estimated size of 1.58 Gb. The assembled scaffolds allowed basic genomic annotation and comparative analyses with other available avian whole-genome sequences.
The current data represent the first genomic information from A. vittata, and the work was carried out with a unique source of funding. This analysis further provides a means for directed training of young researchers in genetic and bioinformatic analyses and will facilitate progress towards a full assembly and annotation of the Puerto Rican parrot genome. It also adds extensive genomic data to a new branch of the avian tree, making it useful for comparative analyses with other avian species. Ultimately, the knowledge acquired from these data will contribute to an improved understanding of the overall population health of this species and aid ongoing and future conservation efforts.
Keywords: Amazona vittata; Puerto Rican parrot; Genome sequence; Annotation; Assembly; Local funding; Education
[Table: Average coverage of the Puerto Rican parrot genome in the current study, based on the predicted genome size of 1.58 Gb [1]. Libraries: ~300 bp inserts; ~2.5 kbp inserts.]
[Table: Results of the genome assembly by Ray [2]. Size thresholds: ≥ 100 nt; ≥ 500 nt.]
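The coverage depth and N50 values quoted for this assembly follow from simple arithmetic, which the following minimal Python sketch illustrates. The function names are hypothetical and are not taken from the project's pipeline or from Ray. The inputs are the rounded figures given in the text (~42.5 billion bases, 1.58 Gb), so the computed depth is about 26.9x, close to but not identical to the reported 26.89x. The N50 definition used here (the length at which the running sum of lengths, sorted from longest to shortest, first reaches half the total) is the common convention and is assumed rather than confirmed by the source.

```python
def depth_of_coverage(total_bases, genome_size):
    """Average depth: sequenced bases divided by the estimated genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Rounded values from the text: ~42.5 billion bases, 1.58 Gb genome estimate.
print(round(depth_of_coverage(42.5e9, 1.58e9), 1))

# Toy contig lengths (invented for illustration, not project data).
print(n50([2, 3, 4, 5, 6]))
```

Applied to the real assembly, `n50` would take the list of contig or scaffold lengths read from the FASTA files.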
[Table: Scaffolds mapped to the G. gallus genome only, the T. guttata genome only, or both G. gallus and T. guttata, each reported as % of the scaffold.]
RepeatMasker software (http://www.repeatmasker.org) was used to search the scaffolds for known repeat classes; known repeats were found on 59% of the scaffolds (see Annotation in Additional file 1). In addition, we used manual annotation, both by annotating scaffolds for genes and repeat elements and by annotating known genes, to validate the high-throughput annotation. Using this approach, we designed and carried out a student development program (see Genome Annotation and Education in Additional file 1).
Comparative analyses of the A. vittata scaffolds against the chicken (Gallus gallus) [5] and zebra finch (Taeniopygia guttata) [6] genomes using local BLAST [4] resulted in alignments totalling 93.4 Mbp to the chicken genome, with 82.7% identity on average (average bit score 577.3), and alignments totalling 41.7 Mbp to the zebra finch genome, with 84.5% identity on average (average bit score 431.1).
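Summary statistics of this kind (total alignment length, average identity, average bit score) can be derived from BLAST hits as in the sketch below. It assumes the hits were exported in the standard 12-column tabular format (`-outfmt 6` in BLAST+), which the text does not state, and it takes unweighted means over hits, which may differ from how the reported averages were computed. The function name and the two example hits are invented for illustration.

```python
import io


def summarize_blast_tabular(handle):
    """Summarize 12-column tabular BLAST hits (assumed format, see lead-in)."""
    total_length = 0
    identities = []
    bit_scores = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        identities.append(float(fields[2]))   # column 3: percent identity
        total_length += int(fields[3])        # column 4: alignment length
        bit_scores.append(float(fields[11]))  # column 12: bit score
    if not identities:
        raise ValueError("no hits found")
    return {
        "total_alignment_length": total_length,
        "mean_identity": sum(identities) / len(identities),
        "mean_bit_score": sum(bit_scores) / len(bit_scores),
    }


# Two made-up hits, not project data.
EXAMPLE_HITS = (
    "scaffold_1\tchr1\t80.0\t1000\t200\t0\t1\t1000\t5001\t6000\t0.0\t500\n"
    "scaffold_2\tchr2\t90.0\t500\t50\t0\t1\t500\t100\t599\t1e-150\t700\n"
)

print(summarize_blast_tabular(io.StringIO(EXAMPLE_HITS)))
```

Note that simply summing alignment lengths counts overlapping hits more than once; a stricter total would first merge overlapping intervals per scaffold.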
In summary, these data represent the first assembly of a genome sequence for a parrot endemic to the United States, and also the first genome of a species from the diverse and ecologically important genus Amazona, native to South America and the Caribbean. The assembled sequence provides a starting point towards completing and annotating a draft genome sequence. The data available at this coverage will help in designing future sequencing efforts, and can also be used for annotation and comparative genomic studies across the growing body of avian genome data [5, 6, 8], which is essential given the growing rate of extinction among avian species worldwide.
Availability of supporting data
The raw reads are available at the ENA (accession #PRJEB225). Scaffolds and the assembly parameters have been submitted to GenBank (accession #PRJNA171587), and all data, including FASTA files of contigs, scaffolds, corresponding assembly parameters, and annotation data, are available in GigaDB [9]. The links to all the supplementary tables and databases are listed in Additional files 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16 and can also be accessed at http://genomes.uprm.edu/gigascience/SupplementaryTables/.
Note from the editors
A related commentary by Stephen O’Brien on the issues surrounding this work is published alongside this article [10].
First, we want to thank the people of Puerto Rico for their generous support of our initiative in the form of hundreds of individual donations to the Puerto Rican Parrot Genome Project. Additional support came from U.S. Fish and Wildlife Service (US FWS) grant #F11AP00196, and from a donation by Fundación Toyota de Puerto Rico. We thank the US FWS and the Compañía de Parques Nacionales de Puerto Rico for their assistance in obtaining samples. We thank the College of Arts and Sciences of the University of Puerto Rico at Mayaguez for supporting the project, and dozens of undergraduate students from the Biology Department for contributing their time. We thank Stephen J O’Brien, Juan A Rivero, Juan Lopez-Garriga, Steven E Massey, Fernando Bird, Nanette Diffoot, Susan Soltero, Jennifer Bae, Mathew Landers, April Matisz, and Audrey J Majeske for helpful ideas, discussions, and help at different stages of the project. Finally, we thank the business community of Rincon, Puerto Rico, especially Mr. Jim Behr and Ms. Rhea Maxwell, for help with promoting the collection of funds.
1. Tiersch TR, Wachtel SS: On the evolution of genome size of birds. J Hered. 1991, 82 (5): 363-368.
2. Boisvert S, Laviolette F, Corbeil J: Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol. 2010, 17 (11): 1519-1533. 10.1089/cmb.2009.0238.
3. Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20 (2): 265-272. 10.1101/gr.097261.109.
4. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215 (3): 403-410.
5. International Chicken Genome Sequencing Consortium: Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature. 2004, 432 (7018): 695-716. 10.1038/nature03154.
6. Warren WC, Clayton DF, Ellegren H, Arnold AP, Hillier LW, Kunstner A, Searle S, White S, Vilella AJ, Fairley S: The genome of a songbird. Nature. 2010, 464 (7289): 757-762. 10.1038/nature08819.
7. Krzywinski MI, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA: Circos: An information aesthetic for comparative genomics. Genome Res. 2009, 19 (9): 1639-1645. 10.1101/gr.092759.109.
8. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED: Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol. 2012, 30 (7): 693-700. 10.1038/nbt.2280.
9. Oleksyk TK, Guiblet W, Pombert JF, Valentin R, Martinez-Cruzado JC: Genomic data of the Puerto Rican Parrot (Amazona vittata) from a locally funded project. GigaScience. 2012, http://dx.doi.org/10.5524/100039.
10. O’Brien SJ: Genome empowerment for the Puerto Rican parrot – Amazona vittata. GigaScience. 2012, 1: 13.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.