Genome empowerment for the Puerto Rican parrot – Amazona vittata
© O’Brien; licensee BioMed Central Ltd. 2012
Received: 10 September 2012
Accepted: 14 September 2012
Published: 28 September 2012
A unique community-funded project in Puerto Rico has launched whole-genome sequencing of the critically endangered Puerto Rican Parrot (Amazona vittata), with interpretation by genome bioinformaticians and students, and deposition into public online databases. This is the first article that focuses on the whole genome of a parrot species, one endemic to the USA and recently threatened with extinction. It provides invaluable conservation tools and a vivid example of hopeful prospects for future genome assessment of so many new species. It also demonstrates inventive ways for smaller institutions to contribute to a field largely considered the domain of large sequencing centers.
Keywords: Puerto Rican parrot; Whole-genome sequencing; Genomics; Conservation; Education; Funding
Perhaps one of the more gratifying aspects of the post-genomics era is marveling at the creativity of individual projects that push the envelope further and further. Witness the emergence of human “copy number variation” and the discovery that such segmental aneuploidy might affect gene dosage and explain a few hereditary diseases (it does). Or 23andMe, the upstart SNP genotyping-for-the-people venture that began by predicting Oprah Winfrey’s curious ancestry and is now immersed in personal medical genomics disclosure for an affordable price. Or this month’s ENCODE bombshell that features some 4 million new gene regulatory sequence stretches amidst the sea of noncoding genomic DNA (98% of human DNA, formerly dubbed “junk DNA”; well, hardly!).
An article published alongside this paper in GigaScience this month unfolds yet another novel genomics-stimulated innovation: a unique grassroots endeavor to sequence the genome of a critically endangered species in the remote locale where the species survives, empowered by local citizens who wanted to help. The Puerto Rican parrot’s genome has been sequenced and assembled; annotation has commenced, and the fresh new data (29-fold Illumina coverage) sit in an open-access GigaScience database, GigaDB, and in a genome browser for any party to query and improve. The work was led by Taras Oleksyk and Juan Carlos Martinez-Cruzado, along with a coterie of conservation-minded scientists and their students at the University of Puerto Rico-Mayaguez, not really a hotbed of genome sequencing centers.
The monitoring of these recovery releases in the wild populations as well as managing captive breeding programs will benefit considerably from the tools derived from this new genome sequence and annotation, particularly SNP, indel, and microsatellite variants that resolve kinship, historic migration, inbreeding, parentage, and dynamic population structure.
The genome of the Puerto Rican parrot was estimated at ~1.58 Gbp, about half the size of the human genome. The light 29x coverage reflected some 76% of the genome, and contigs assembled with Ray and SOAPdenovo were joined into scaffolds of modest size (N50 ~19.5 kb) using two insert libraries (300 bp and 2.5 kb inserts). The authors aligned their scaffolds to the more advanced genome assemblies of chicken and zebra finch (which are a long way away in evolutionary time, circa 90 million years, about as far as humankind has diverged from a common ancestor with mice). They annotated repeat families, but do not yet present a framework map to discriminate gene organization (relative to zebra finch or chicken), a gene assessment, a description of SNP variation, a listing of microsatellite loci, NUMTs, microRNAs, endogenous retroviruses, or the other features that users of genome sequence thirst for. This work is a raw and preliminary effort, but a welcome starting gun for the genomics and conservation communities to rapidly supply the finer genome feature details.
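For readers less familiar with the assembly statistics quoted above, here is a minimal Python sketch of how an N50 value is commonly computed, and of the rough amount of raw sequence implied by 29-fold coverage of a ~1.58 Gbp genome. The function names and the toy scaffold lengths are illustrative assumptions; they are not taken from the parrot project or from Ray or SOAPdenovo.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


def bases_for_coverage(genome_size_bp, fold_coverage):
    """Approximate raw bases needed for a given average fold coverage."""
    return genome_size_bp * fold_coverage


# Hypothetical toy scaffold lengths in bp (NOT from the parrot assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: the 80 and 70 bp scaffolds hold half of 300 bp

# Genome size and coverage as quoted in the text (~1.58 Gbp, 29-fold).
print(bases_for_coverage(1_580_000_000, 29))  # 45820000000, i.e. ~45.8 Gbp
```

An N50 of ~19.5 kb therefore means that half of the assembled bases lie in scaffolds at least that long, which is why the scaffolds are described as modest in size.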
The achievement of the assembled Puerto Rican parrot genome is an important milestone for unusual reasons. First, the genome project was funded by student-organized art and fashion shows dedicated to the effort, plus scores of small personal donations by Puerto Rican people who wanted to be part of it. That could only happen because the cost of reagents has dropped so precipitously that the sequencing could be afforded within a US$10,000 budget. Second, the analysis and annotation took place in a modest university setting where students of genome bioinformatics were trained to drive assemblers, to stitch together contigs and scaffolds, and to begin the genome annotation process. Third, the Puerto Rican parrot is a harbinger for the many parrot genomes we shall be seeing in the near future: the opportunity to explore speciation and adaptive radiation among island species of these parrots is too tempting to pass up (Figure 1). The Genome 10K-sponsored Assemblathon-II (led by Ian Korf) (http://assemblathon.org) is evaluating assembly strategies for three vertebrate species: a cichlid fish, a python, and a parakeet (Melopsittacus undulatus). And great things are expected from the BGI-Genome 10K Avian phylogenomics consortium (led by Erich Jarvis), which is annotating the genomes of 48-plus avian species including more parrots, such as the Kea (Nestor notabilis), coming later in 2012.
It may be a telling coincidence that only a few weeks have passed since the genome of Geospiza fortis, the iconic species better known as Darwin’s finch, was sequenced and assembled (by the BGI), announced, and released on the UC Santa Cruz Genome Browser (http://genome.ucsc.edu) ahead of publication of the primary analysis paper to promote rapid data use. Evolutionary genomics has gotten a boost this month, to be sure.
The genomics community has expanded of late from an anthropocentric emphasis to enormous enthusiasm for comparative genome sequencing of thousands of species. The Genome 10K Project is poised to assist and facilitate the genome sequencing, assembly, and annotation of 10,000 vertebrate species in the near future. Ditto for the Insect 5K project for 5,000 insect species, and the fledgling Global Invertebrate Genomics Alliance (GIGA) for invertebrate species. If these projects are successful, we will need genome-science bioinformaticians for 25,000 species rather soon. Will they be supplied by the traditional genome sequencing centers, by mega-sequencing centers such as the BGI, or by young scientists across the globe like those trained on the Puerto Rican parrot’s genome, poised to make sense of each new species’ sequence by building a comprehensive genome browser for it? Time will tell, but I have my suspicions.
The author is Co-Director of the Genome 10K Project and Chief Scientific Officer of the Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, Russia. He knows more about cat genomics than about bird genomics.
Abbreviation: IUCN, International Union for Conservation of Nature.
I am grateful to Jose Almodovar and Taras Oleksyk for supplying Figure 1 and to Klaus Peter Koepfli for comments on an early draft.
- Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M, ENCODE Project Consortium: An integrated encyclopedia of DNA elements in the human genome. Nature. 2012, 489 (7414): 57-74. 10.1038/nature11247.
- Oleksyk TK, Pombert JF, Guiblet W, Ramos B, Mazo A, Ruiz-Rodriguez CT, Nickerson ML, Afanador Y, Siu D, Valentin R, Figueroa L, Dean M, Logue DM, Martinez-Cruzado JC: A locally funded Puerto Rican parrot (Amazona vittata) genome sequencing project increases avian data and advances young researcher education. GigaScience. 2012, 1: 14.
- Oleksyk TK, Guiblet W, Pombert JF, Valentin R, Martinez-Cruzado JC: Genomic data of the Puerto Rican Parrot (Amazona vittata) from a locally funded project. GigaScience. http://dx.doi.org/10.5524/100039
- Lacy RC, Flesness NR, Seal US, Ballou JD, Foose TJ, Bruning D, Dierenfeld E, Kollias GV, Snyder NFR, Wildt D: Puerto Rican parrot Amazona vittata population viability analysis and recommendations. http://www.cbsg.org/cbsg/workshopreports/23/puerto_rican_parrot_pva_final_report_1989.pdf
- Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, Phillippy AM: Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol. 2012, http://dx.doi.org/10.1038/nbt.2280
- Song bird science comparative genomics. http://songbirdscience.com/resources/genomics/Comparative
- Zhang G, Parker P, Li B, Li H, Wang J: The genome of Darwin’s finch (Geospiza fortis). GigaScience. http://dx.doi.org/10.5524/100040
- Genome 10K Community of Scientists: A proposal to obtain whole-genome sequence for 10,000 vertebrate species. J Hered. 2009, 100 (6): 659-674.
- Robinson GE, Hackett KJ, Purcell-Miramontes M, Brown SJ, Evans JD, Goldsmith MR, Lawson D, Okamuro J, Robertson HM, Schneider DJ: Creating a buzz about insect genomes. Science. 2011, 331: 1386. 10.1126/science.331.6023.1386.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.