A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data for the black truffle Tuber melanosporum
- Pao-Yang Chen†1, 2,
- Barbara Montanini†3,
- Wen-Wei Liao1,
- Marco Morselli2, 3,
- Artur Jaroszewicz2,
- David Lopez2,
- Simone Ottonello3 (corresponding author) and
- Matteo Pellegrini2 (corresponding author)
© Chen et al.; licensee BioMed Central Ltd. 2014
Received: 30 July 2014
Accepted: 16 October 2014
Published: 30 October 2014
Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity.
We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes.
The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles.
Keywords: DNA methylation; Tuber melanosporum; Ascomycete truffle; Pezizomycetes; Transposable elements; Whole-genome bisulfite sequencing; Methylome; Copy number variation; Transposon expression; 5-azacytidine; Genome plasticity
Purpose of data acquisition
Genome-wide profiles of DNA methylation have recently emerged from methylome studies carried out on more than 20 eukaryotic organisms belonging to four different lineages [1–5]. Promoter and gene-body methylation marks are common in higher eukaryotes, where they provide an additional level of gene regulation, while inactivation of transposons and other repeated elements appears to be the main purpose of DNA methylation in fungi (reviewed by ).
The black truffle (T. melanosporum) is a macrofungus and a highly appreciated gastronomic delicacy produced by an ectomycorrhizal ascomycetous symbiont found throughout southern Europe. It features one of the largest genomes (125 Mb), with the highest transposable element (TE) and repetitive DNA content (>58%), amongst fungi that have been sequenced so far . As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which, together with TEs, has been proposed to be a major force driving the evolution of DNA methylation [6, 8]. T. melanosporum belongs to the Pezizales, a largely unexplored group of ascomycetes that includes Ascobolus immersus, a fungus relying on premeiotic DNA methylation as a means to control repetitive element proliferation [9, 10].
Table 1. Summary of sequencing data. Columns: number of raw reads; number of uniquely mapped reads; coverage per strand (X).
T. melanosporum (Vittad.) mycelium from the Mel28 strain, the same strain utilized for reference genome sequencing , was grown on 1% malt agar (Cristomalt-D, Difal, Villefranche-sur-Saône, France) for 5 weeks before harvesting. A field-collected mature fruitbody was used for the FB library. ECM tips were from common hazel (Corylus avellana L.) plantlets inoculated with a T. melanosporum mycelium slurry (Raggi Vivai, Cesena, Italy). For 5-aza treatment, T. melanosporum mycelia were grown in the dark at 23 °C in synthetic liquid medium as described . Every 5 days 5-aza was added to mycelia at 10, 40, and 100 μM final concentrations (from a 10 mM stock solution in water) for 45 days (‘5-aza treated’), with the last addition made 24 hours before harvesting of mycelia and DNA/RNA extraction. The same volume of water, instead of 5-aza, was added to parallel control samples (‘5-aza untreated’).
Whole-genome bisulfite sequencing
Genomic DNA (gDNA) was extracted by grinding fruitbodies in liquid nitrogen followed by purification with the DNeasy Plant Mini kit (QIAGEN, Hilden, Germany). This was followed by extraction in 50% phenol-50% extraction buffer (100 mM Tris–HCl pH 8.0, 100 mM NaCl, 20 mM EDTA, 1% SDS) at 65 °C for 10 min and centrifugation at 14,000 rpm for 10 min. The aqueous phase was transferred to new tubes and extracted twice with phenol:chloroform (1:1) and once with chloroform. Following ethanol precipitation, samples were resuspended in H2O. LiCl (2 M) precipitation was used to separate RNA (pellet) from gDNA (supernatant). gDNA was precipitated with ethanol and resuspended in H2O. Extracted gDNA was further purified with an additional phenol:chloroform extraction and sheared by sonication to generate DNA fragments in the 150–300 bp size range. Bisulfite treatment and library preparation were carried out as described , except that the EpiTect kit (QIAGEN) was utilized for bisulfite treatment. Two consecutive rounds of conversion were performed for a total of 10 hours. The resulting libraries were sequenced with Illumina sequencing technology (HiSeq 2000 sequencer; Illumina, San Diego, CA, USA).
Processing bisulfite converted reads
Bisulfite-converted reads were aligned to the reference genome (Tuber_melanosporum_v1.0) using BS Seeker 2 . We achieved 32X, 35X and 1.2X coverage per strand for FB, FLM and ECM, respectively (see Table 1). The ECM sample contained substantial amounts of root cells from the hazelnut host tree (C. avellana), whose genome has not yet been determined. Even at this low coverage, we were able to delineate global, low-resolution methylation profiles for the symbiotic (ECM) stage. Genome-wide DNA methylation profiles were generated by determining methylation levels for each cytosine in the genome. Since bisulfite treatment coupled to PCR amplification converts unmethylated cytosines (Cs) to thymines (Ts), the methylation level at each cytosine was estimated as #C/(#C + #T), where #C is the number of methylated reads and #T is the number of unmethylated reads. The methylation level per cytosine serves as an estimate of the percentage of cells bearing a methylated cytosine at a specific locus.
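The per-cytosine estimate described above can be illustrated with a minimal Python sketch. The function name and the example read counts are hypothetical and are not taken from the BS Seeker 2 pipeline; returning None for uncovered sites is an assumption of this sketch.

```python
def methylation_level(n_c, n_t):
    """Estimate the methylation level at one cytosine as #C / (#C + #T).

    n_c: reads showing C at the site (methylated, protected from conversion)
    n_t: reads showing T at the site (unmethylated, converted)
    Returns None when no read covers the site (assumption of this sketch).
    """
    total = n_c + n_t
    if total == 0:
        return None
    return n_c / total


# Hypothetical read counts for three cytosines, given as (#C, #T)
sites = {"site_a": (9, 1), "site_b": (0, 12), "site_c": (0, 0)}
levels = {name: methylation_level(c, t) for name, (c, t) in sites.items()}
print(levels)
```

At low coverage, such as the 1.2X obtained for ECM, single-site estimates of this kind are very noisy, which is consistent with the text restricting that sample to global, low-resolution profiles.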
Table: Number of cytosines included in the methylation analysis. Column: number of cytosines* (% genome coverage).
Table: Average methylation levels of genome, genes, and transposable elements.
BS-seq data coverage (i.e., read depth) across the genome revealed CNVs between FLM and FB. Specifically, we identified 107 genomic regions with significant CNV (defined as a 100 kb window with a |Log coverage ratio (FLM/FB)| ≥0.3), corresponding to 7.3% of the genome. One hundred and two (95%) of these regions were independently confirmed by standard Illumina sequencing performed on non-bisulfite-treated FLM genomic DNA. This high coincidence rate supports the notion that BS-seq data are unbiased for CNV calling.
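The window-based criterion above can be sketched as follows. This is an illustrative sketch, not the authors' code: the logarithm base is not stated in the text, so base 2 is used here as an assumption, the coverage values are hypothetical, and the two libraries are assumed to have been normalized to comparable sequencing depth beforehand.

```python
import math


def call_cnv_windows(flm_cov, fb_cov, threshold=0.3, log_base=2.0):
    """Return (window index, log ratio) for windows with |log(FLM/FB)| >= threshold.

    flm_cov, fb_cov: mean read depth per 100 kb window, in the same order.
    Windows with zero coverage in either sample are skipped because the
    log ratio is undefined there (assumption of this sketch).
    """
    hits = []
    for i, (flm, fb) in enumerate(zip(flm_cov, fb_cov)):
        if flm <= 0 or fb <= 0:
            continue
        ratio = math.log(flm / fb, log_base)
        if abs(ratio) >= threshold:
            hits.append((i, ratio))
    return hits


# Hypothetical depth-normalized coverage for five 100 kb windows
flm = [35.0, 70.0, 34.0, 17.0, 0.0]
fb = [32.0, 32.0, 33.0, 32.0, 30.0]
for index, ratio in call_cnv_windows(flm, fb):
    print(index, round(ratio, 2))
```

Using the absolute value of the log ratio treats gains and losses in FLM relative to FB symmetrically, which matches the |Log coverage ratio| form of the criterion.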
RNA-seq library preparation and data analysis
Total RNA was extracted as described above, dissolved in H2O after LiCl precipitation and purified with the RNeasy Plant Mini kit (QIAGEN) followed by on-column DNase I digestion as per manufacturer’s instructions. RNA integrity was verified with a Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), which yielded RNA integrity number (RIN) scores of 7.0 and 6.5 for FB and FLM, respectively. RNA was quantified with the Qubit RNA BR Assay kit (Life Technologies, Carlsbad, CA, USA) and 1 μg of purified RNA was utilized as starting material for library construction, which was carried out with the Illumina TruSeq RNA Sample Preparation kit as per manufacturer's instructions. Libraries were sequenced with an Illumina HiSeq 2000 system using 50-bp single-end reads.
Prediction of novel genes
We used the RNA-seq datasets listed in Table 1 (FB, FLM, FLM treated/untreated with 5-aza) to predict novel genes not included in the original annotation . The prediction pipeline was based on a previously published approach . First, reads were aligned to the reference genome (truffle v1.0 assembly) using TopHat , which resulted in the mapping of non-annotated regions. A guided transcript assembly using Cufflinks  with the truffle v1.0 assembly was then carried out for each sample. The resulting individual transcript assemblies were compared with the reference assembly using Cuffcompare  to identify novel transcripts.
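The final comparison step can be illustrated with a short Python sketch that scans a Cuffcompare combined GTF for candidate novel transcripts. It assumes that each record carries `transcript_id` and `class_code` attributes and that class code "u" (unknown, intergenic) marks candidates; the text does not state which class codes were retained, so this choice, the function name and the example records are all hypothetical.

```python
import re

CLASS_CODE = re.compile(r'class_code "([^"]+)"')
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def novel_transcript_ids(gtf_lines, novel_codes=("u",)):
    """Collect transcript IDs whose class code is in novel_codes."""
    novel = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = fields[8]
        code = CLASS_CODE.search(attributes)
        tid = TRANSCRIPT_ID.search(attributes)
        if code and tid and code.group(1) in novel_codes:
            novel.add(tid.group(1))
    return sorted(novel)


# Hypothetical records: one matching the reference ("=") and one intergenic ("u")
example = [
    'scaffold_1\tCufflinks\texon\t100\t500\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; class_code "=";',
    'scaffold_2\tCufflinks\texon\t900\t1500\t.\t-\t.\t'
    'gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; class_code "u";',
]
print(novel_transcript_ids(example))
```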
Table: Prediction of genes in RNA-seq datasets. Column: number of novel genes. Reference annotation: Truffle v1.0 (7,496 genes).
DNA methylation and transcription analysis
Transcriptome analysis of truffle mycelia treated with 5-aza
Table: Differentially expressed genes between untreated and 5-aza treated mycelia. Recoverable columns: RPKM No aza; adj. p value. The recoverable gene annotations are listed below.
Flavin-binding monooxygenase-like protein
Alpha-ketoglutarate-dependent taurine dioxygenase
Alcohol dehydrogenase 1
Alpha-ketoglutarate-dependent sulfonate dioxygenase protein
Fatty acid oxygenase
Short-chain alcohol dehydrogenases
FAD binding domain-containing protein
FAD binding domain-containing protein
Ferritin ribonucleotide reductase-like protein
D-isomer specific 2-hydroxyacid dehydrogenase
Amino acid permease
Carboxylic acid transport protein
Mate efflux family protein
MFS efflux transporter
Mg2+ transporter family-like protein
MFS general substrate transporter
MFS multidrug transporter
MFS drug efflux
Fungal transcriptional regulatory protein
STE12-like transcription factor
Binuclear zinc transcription factor
GAL4 domain-containing protein
C2H2 finger domain
Glycoside hydrolase family 61 protein
Endo- -beta-glucanase eng1
Glycoside hydrolase family 28 protein
Dynamin family protein
Glycerophosphoryl diester phosphodiesterase
Glycoside hydrolase family 16 protein
Fatty acid activator
Tat pathway signal sequence
Gnat family N-acetyltransferase
Branched-chain amino acid aminotransferase
Response to stimuli/mycelium development
Regulator of G protein signaling domain protein
Osmolarity two-component sensor histidine kinase SLN1
RNase P RPR2 RPP21 subunit domain-containing protein
Allergenic cerato-platanin ASP F13
Arrestin (or S-antigen) N-terminal domain protein
P-loop containing nucleoside triphosphate hydrolase
The composite RIP index (CRI) computation
The composite RIP index (CRI, ; see below ‘Availability and requirements of Software Used’) was implemented to measure CA → TA enrichment within repeated sequences from Tuber and other organisms’ genomes. It was calculated as (RIP product) - (RIP substrate) . The RIP product score measures the frequency of RIP products, taking into account potential false positives due to local density, while the RIP substrate score measures the depletion of RIP targets and their reverse complements (e.g., CpA and TpG).
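A minimal Python sketch of a composite RIP index computed from dinucleotide counts is given here for orientation. The index definitions used (RIP product = TpA/ApT; RIP substrate = (CpA + TpG)/(ApC + GpT)) are the ones commonly used in the RIP literature and are an assumption of this sketch; the text only states that CRI = (RIP product) - (RIP substrate), and the critool repository should be consulted for the exact procedure. The function names and test sequences are hypothetical.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count all overlapping dinucleotides in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def composite_rip_index(seq):
    """Return (RIP product) - (RIP substrate), or None if a denominator is zero.

    RIP product   = TpA / ApT                  (assumed definition)
    RIP substrate = (CpA + TpG) / (ApC + GpT)  (assumed definition)
    """
    counts = dinucleotide_counts(seq)
    at = counts["AT"]
    ac_gt = counts["AC"] + counts["GT"]
    if at == 0 or ac_gt == 0:
        return None
    product = counts["TA"] / at
    substrate = (counts["CA"] + counts["TG"]) / ac_gt
    return product - substrate


# Hypothetical sequences: the first is TpA-rich and lacks CpA/TpG, the second is not
for name, seq in [("rip_like", "TATATAATACGT"), ("non_rip", "CATGACGTAT")]:
    print(name, composite_rip_index(seq))
```

Under these definitions a positive value indicates an excess of RIP products together with a depletion of RIP substrates, whereas values at or below zero give no indication of RIP.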
To evaluate and compare CRI values in different organisms, we predicted repeats in T. melanosporum, Neurospora crassa, Uncinocarpus reesii, Aspergillus nidulans and Saccharomyces cerevisiae. Repeat annotations were predicted from the corresponding genome sequences [24–27] by RepeatMasker . Further details on CRI implementation, including raw data, processed data, scripts (Python and R) and output files, can be accessed at .
We provide a comprehensive genomic, epigenomic and transcriptomic sequencing data resource for the black truffle T. melanosporum. The use of a reversible, rather than an irreversible, mechanism (such as RIP) to cope with the multitude of repeated transposable elements that populate the T. melanosporum genome resembles the situation in more complex organisms, including humans, and is in line with the view of a “generative” (i.e., genome-shaping) rather than a purely “destructive” role of transposable elements . More targeted follow-up studies built upon the results of the present work may uncover variations of the DNA methylation and transposon activity profiles associated with functionally interpretable (presumably adaptive) modifications of gene expression. If extended to T. melanosporum specimens from different geographic areas, epigenomic analyses such as the one described in this work may shed light on the relationships between DNA methylation, transposon-mediated genome shaping and commercially relevant organoleptic properties, such as aroma. In addition to their general evolutionary biological implications, our findings may thus provide a new mechanistic layer to explain intraspecific variability.
Availability and requirements of Software Used
Implementation of the Composite RIP Index
Project name: The composite RIP index (CRI) computation
Project home page: https://github.com/wwliao/critool/
Operating system(s): Platform-independent
Programming language: Python and R
Other requirements: Python 2.7, R
Any restrictions to use by non-academics: None
Availability of supporting data
For all sequencing data, in addition to those already described in  (GSE49700), we provide several formats to facilitate downstream data analysis. BS-seq data are presented as a BAM file of alignments, ATCGmap files for SNP calling and coverage analysis, CGmap files for methylation analysis and a wiggle file for data visualization on genome browsers, such as IGV  (see Additional file 1: Table S1). For WG-seq data, BAM and ATCGmap files are provided. For RNA-seq data, we provide BAM files and novel genes detected from the merged transcriptome assembly (see Additional file 2: Table S1). All the datasets presented here are publicly available in the GigaScience repository, GigaDB .
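As an example of downstream use, the following Python sketch pools methylation levels by sequence context from CGmap records. It assumes the eight-column, whitespace-separated CGmap layout produced by BS Seeker 2 (chromosome, nucleotide, position, context, dinucleotide context, methylation level, methylated read count, total read count); this layout should be checked against the files actually downloaded. The function name, the `min_depth` filter and the example records are hypothetical.

```python
from collections import defaultdict


def pooled_methylation_by_context(cgmap_lines, min_depth=4):
    """Return methylated reads / total reads per context (CG, CHG, CHH).

    Sites covered by fewer than min_depth reads are ignored; the result is
    a read-weighted (pooled) level, not a mean of per-site levels.
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for line in cgmap_lines:
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed records
        context = fields[3]
        n_meth, n_total = int(fields[6]), int(fields[7])
        if n_total < min_depth:
            continue
        methylated[context] += n_meth
        covered[context] += n_total
    return {ctx: methylated[ctx] / covered[ctx] for ctx in covered}


# Hypothetical CGmap records
records = [
    "scaffold_1\tC\t101\tCG\tCG\t0.80\t8\t10",
    "scaffold_1\tG\t102\tCG\tCG\t0.50\t5\t10",
    "scaffold_1\tC\t110\tCHH\tCA\t0.00\t0\t6",
    "scaffold_1\tC\t120\tCHG\tCT\t0.50\t1\t2",  # below min_depth, ignored
]
print(pooled_methylation_by_context(records))
```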
CNV: Copy Number Variation
CRI: Composite RIP Index
DNMT: DNA Methyl Transferase
RIP: Repeat-Induced Point mutation
RPKM: Reads Per Kilobase per Million mapped reads
This work was supported by a grant from Academia Sinica to PYC and by grants from the Fondazione Cariparma and the Ministry of Education, University and Research (MIUR) to SO and by a grant from the Institute for Genomics and Proteomics at UCLA to MP.
- Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE: Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008, 452: 215-219. 10.1038/nature06745.
- Feng S, Cokus SJ, Zhang X, Chen PY, Bostick M, Goll MG, Hetzel J, Jain J, Strauss SH, Halpern ME, Ukomadu C, Sadler KC, Pradhan S, Pellegrini M, Jacobsen SE: Conservation and divergence of methylation patterning in plants and animals. Proc Natl Acad Sci U S A. 2010, 107: 8689-8694. 10.1073/pnas.1002720107.
- Lister R, O'Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, Ecker JR: Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell. 2008, 133: 523-536. 10.1016/j.cell.2008.03.029.
- Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462: 315-322. 10.1038/nature08514.
- Zemach A, McDaniel IE, Silva P, Zilberman D: Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science. 2010, 328: 916-919. 10.1126/science.1186366.
- Zemach A, Zilberman D: Evolution of eukaryotic DNA methylation and the pursuit of safer sex. Curr Biol. 2010, 20: R780-785. 10.1016/j.cub.2010.07.007.
- Martin F, Kohler A, Murat C, Balestrini R, Coutinho PM, Jaillon O, Montanini B, Morin E, Noel B, Percudani R, Porcel B, Rubini A, Amicucci A, Amselem J, Anthouard V, Arcioni S, Artiguenave F, Aury JM, Ballario P, Bolchi A, Brenna A, Brun A, Buee M, Cantarel B, Chevalier G, Couloux A, Da Silva C, Denoeud F, Duplessis S, Ghignone S: Perigord black truffle genome uncovers evolutionary origins and mechanisms of symbiosis. Nature. 2010, 464: 1033-1038. 10.1038/nature08867.
- Bestor TH: Sex brings transposons and genomes into conflict. Genetica. 1999, 107: 289-295. 10.1023/A:1003990818251.
- Faugeron G: Diversity of homology-dependent gene silencing strategies in fungi. Curr Opin Microbiol. 2000, 3: 144-148. 10.1016/S1369-5274(00)00066-7.
- Percudani R, Trevisi A, Zambonelli A, Ottonello S: Molecular phylogeny of truffles (Pezizales: Terfeziaceae, Tuberaceae) derived from nuclear rDNA sequence analysis. Mol Phylogenet Evol. 1999, 13: 169-180. 10.1006/mpev.1999.0638.
- Montanini B, Chen PY, Morselli M, Jaroszewicz A, Lopez D, Martin F, Ottonello S, Pellegrini M: Non-exhaustive DNA methylation-mediated transposon silencing in the black truffle genome, a complex fungal genome with massive repeat element content. Genome Biol. 2014, 15: 411. 10.1186/s13059-014-0411-5.
- Lewis ZA, Honda S, Khlafallah TK, Jeffress JK, Freitag M, Mohn F, Schubeler D, Selker EU: Relics of repeat-induced point mutation direct heterochromatin formation in Neurospora crassa. Genome Res. 2009, 19: 427-437.
- Montanini B, Moretto N, Soragni E, Percudani R, Ottonello S: A high-affinity ammonium transporter from the mycorrhizal ascomycete Tuber borchii. Fungal Genet Biol. 2002, 36: 22-34. 10.1016/S1087-1845(02)00001-4.
- Feng S, Rubbi L, Jacobsen SE, Pellegrini M: Determining DNA methylation profiles using sequencing. Methods Mol Biol. 2011, 733: 223-238. 10.1007/978-1-61779-089-8_16.
- Guo W, Fiziev P, Yan W, Cokus S, Sun X, Zhang MQ, Chen PY, Pellegrini M: BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data. BMC Genomics. 2013, 14: 774. 10.1186/1471-2164-14-774.
- Trapnell C, Pachter L, Salzberg SL: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25: 1105-1111. 10.1093/bioinformatics/btp120.
- HTSeq: Analysing high-throughput sequencing data with Python. http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
- Chen PY, Montanini B, Liao WW, Morselli M, Jaroszewicz A, Lopez D, Ottonello S, Pellegrini M: A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data of black truffle, Tuber melanosporum. GigaScience Database. 2014, http://dx.doi.org/10.5524/100101
- MycorWeb: Tuber melanosporum DB. http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno
- Weikard R, Hadlich F, Kuehn C: Identification of novel transcripts and noncoding RNAs in bovine skin by deep next generation sequencing. BMC Genomics. 2013, 14: 789. 10.1186/1471-2164-14-789.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L: Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010, 28: 511-515. 10.1038/nbt.1621.
- Gotz S, Garcia-Gomez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talon M, Dopazo J, Conesa A: High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 2008, 36: 3420-3435. 10.1093/nar/gkn176.
- Broad Institute: Download Sequence - Neurospora crassa. http://www.broadinstitute.org/annotation/genome/neurospora/MultiDownloads.html
- Broad Institute: Download Sequence - Uncinocarpus reesii. http://www.broadinstitute.org/annotation/genome/uncinocarpus_reesii.3/MultiDownloads.html
- Broad Institute: Saccharomyces cerevisiae RM11-1a Database. http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae
- Broad Institute: Download Sequence - Aspergilli. http://www.broadinstitute.org/annotation/genome/aspergillus_group/MultiDownloads.html
- Computing Composite RIP index (CRI) in repetitive sequences. https://github.com/wwliao/critool
- Fedoroff NV: Presidential address. Transposable elements, epigenetics, and genome evolution. Science. 2012, 338: 758-767. 10.1126/science.338.6108.758.
- Thorvaldsdottir H, Robinson JT, Mesirov JP: Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013, 14: 178-192. 10.1093/bib/bbs017.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.