Paediatric leukaemia DNA methylation profiling using MBD enrichment and SOLiD sequencing on archival bone marrow smears



Acute Lymphoblastic Leukaemia (ALL) is the most common cancer in children. Over the past four decades, research has advanced the treatment of this cancer from a less than 60% chance of survival to over 85% today. The causal molecular mechanisms remain unclear. Here, we performed sequencing-based genomic DNA methylation profiling of eight paediatric ALL patients using archived bone marrow smear microscope slides.


SOLiD™ sequencing data were collected from Methyl-Binding Domain (MBD) enriched fractions of genomic DNA. The primary tumour and remission bone marrow samples were analysed from eight patients. Four patients relapsed, and their relapsed tumours were also analysed. Input and MBD-enriched DNA from each sample were sequenced, aligned to the hg19 reference genome and analysed for enrichment peaks using MACS (Model-based Analysis for ChIP-Seq) and HOMER (Hypergeometric Optimization of Motif EnRichment). In total, 3.67 gigabases (Gb) were sequenced, of which 2.74 Gb were aligned to the reference genome (average 74.66% alignment efficiency). This dataset enables the interrogation of differential DNA methylation associated with paediatric ALL. Preliminary results reveal concordant regions of enrichment indicative of a DNA methylation signature.


Our dataset represents one of the first SOLiD™ MBD-Seq studies performed on paediatric ALL and is the first to utilise archival bone marrow smears. Differential DNA methylation between cancer and equivalent disease-free tissue can be identified and correlated with existing and published genomic studies. Given the rarity of paediatric haematopoietic malignancies relative to their adult counterparts, our demonstration that archived bone marrow smear samples are suitable for high-throughput methylation sequencing approaches offers tremendous potential to explore the role of DNA methylation in the aetiology of cancer.

Data description

This project was approved by the Royal Children’s Hospital Human Research Ethics Committee (RCH HREC# 29140C). We have performed Methyl-Binding Domain protein 2 (MBD2) enrichment and isolated fractions of DNA from 40 samples for sequencing on the Sequencing by Oligonucleotide Ligation and Detection (SOLiD™) sequencing platform (SOLiD™ MBD-Seq, Life Technologies, Carlsbad, USA). MBD2 has been shown to bind to double-stranded methylated DNA molecules and has been used to interrogate the human methylome [1]. By sequencing both the enriched fraction and the "input" total genomic DNA fraction and comparing the two, genomic regions of DNA methylation can be inferred. The samples analysed are as follows. Three model cell lines were included: JWL (an in-house non-leukaemic cell line [2]), CEM-CCRF (childhood T-cell acute lymphoblastic leukaemia [ALL] cell line) and K562 (adult chronic myelogenous leukaemia cell line). From two non-leukaemic individuals (pbsc1 and pbsc2), peripheral blood mononuclear cells were sampled and four haematopoietic cell populations (CD34-positive, CD19-positive, CD33-positive and CD45-positive) were isolated for SOLiD™ MBD-Seq analysis. From another two non-leukaemic individuals (bm9 and bm10), the same haematopoietic cell populations were isolated from bone marrow. Eight cases of childhood ALL, with the identifiers 135, 197, 292, 316, 362, 367, 378 and 386, were analysed at diagnosis (leuk) and 28 days post induction chemotherapy (rem). A third set of samples was taken at relapse (lap) for cases 197, 316, 362 and 367 (Table 1).
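The comparison of enriched and input fractions described above can be illustrated with a minimal sketch. It assumes read counts have already been tallied over a shared set of genomic bins; the function name, the pseudocount, the threshold and the toy counts are hypothetical and are not taken from this study, which used MACS and HOMER for peak detection.

```python
import math


def log2_enrichment(enriched_counts, input_counts, pseudocount=1.0):
    """Per-bin log2 ratio of MBD-enriched to input read depth.

    Both lists hold read counts for the same genomic bins. Counts are
    scaled to reads per million so that libraries of different depth
    are comparable; the pseudocount avoids division by zero.
    """
    if len(enriched_counts) != len(input_counts):
        raise ValueError("bin lists must have the same length")
    enriched_total = sum(enriched_counts)
    input_total = sum(input_counts)
    if enriched_total == 0 or input_total == 0:
        raise ValueError("each library needs at least one read")
    ratios = []
    for e, i in zip(enriched_counts, input_counts):
        e_rpm = e * 1e6 / enriched_total
        i_rpm = i * 1e6 / input_total
        ratios.append(math.log2((e_rpm + pseudocount) / (i_rpm + pseudocount)))
    return ratios


# Toy counts for five bins (illustrative only, not from the dataset).
enriched = [5, 40, 3, 2, 50]
control = [20, 20, 20, 20, 20]
scores = log2_enrichment(enriched, control)

# Hypothetical threshold: bins with more than ~1.4-fold enrichment.
candidate_bins = [idx for idx, s in enumerate(scores) if s > 0.5]
print(candidate_bins)
```

A simple ratio like this ignores local background variation, which is one reason dedicated peak callers are preferred for real data.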

Table 1 Samples analysed in this study and sequencing metrics

Genomic DNA from archived bone marrow smear microscope slides from ALL patients, cells and cell lines was extracted as previously described [3] and used for the enrichment of CpG-methylated DNA with the MethylMiner™ Methylated DNA enrichment kit (Life Technologies) according to the manufacturer’s protocols. The fragmented input genomic DNA (I) and enriched E5 fraction (E) were isolated from each sample for library preparation and sequencing using SOLiD™ v3 and v4 chemistry according to the manufacturer’s protocols (Life Technologies).

Single and paired-end SOLiD™ sequencing reads were aligned using LifeScope™ Genomic Analysis Suite (Life Technologies) with default parameters against the hg19 reference genome. Alignment efficiency (the ratio of uniquely aligned reads to total sequenced reads for each sample) ranged from 26.57% to 93.15% across all samples in this study (Table 1).
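As a small worked example of the alignment efficiency definition above, the sketch below applies the ratio to the overall totals reported for this dataset (2.74 Gb aligned of 3.67 Gb sequenced). The function name is hypothetical, and per-sample values would need the read counts listed in Table 1.

```python
def alignment_efficiency(aligned, total):
    """Return aligned output as a percentage of total sequenced output.

    Both arguments must use the same unit (reads or bases).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= aligned <= total:
        raise ValueError("aligned must lie between 0 and total")
    return 100.0 * aligned / total


# Overall totals reported for this dataset, in gigabases.
overall = alignment_efficiency(2.74, 3.67)
print(f"{overall:.2f}%")
```

The pooled ratio of the reported totals agrees with the stated average of 74.66%.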

Alignments were then processed using MACS (Model-based Analysis for ChIP-Seq) [4] and HOMER (Hypergeometric Optimization of Motif EnRichment) [5,6] to identify enrichment peaks.

This study is unique in a number of ways. This is the first sequencing-based DNA methylation profiling study in childhood ALL using archived bone marrow samples of similar quality to formalin-fixed paraffin-embedded (FFPE) tissue samples [7]. We have selected samples that have been interrogated using an orthogonal platform, the Illumina Infinium Human Methylation 450K BeadArray [3,8], and included replicate samples to assess the reproducibility of SOLiD™ MBD-Seq and to identify regions of differential DNA methylation of interest to childhood ALL.

We performed replicate DNA methylation enrichment analysis using the JWL cell line with 1 μg and 5 μg of starting genomic DNA to determine if 1 μg of starting material was sufficient for DNA methylation enrichment. This was less than the recommended quantity but a typical amount obtainable from our primary patient samples.

We isolated four haematopoietic cell populations (CD34, CD19, CD33 and CD45) at major stages of development corresponding to the arrested stages of development in paediatric leukaemia. This was achieved by positive selection using fluorescent-labelled antibodies and Fluorescence-Activated Cell Sorting (FACS) from four individuals. This enables us to track changes in DNA methylation between cell lineages and contrast them with leukaemic cells. After MACS enrichment peak analysis, a large proportion of peaks were common between the CD19 cells from three individuals, supporting the premise of tissue-specific DNA methylation profiles in haematopoietic cells (Figure 1A).

Figure 1
Venn diagrams summarising peak region overlaps between samples analysed in this study. Overlapping peak regions are shown after MACS peak analysis. (A) Peaks on chromosome 21 from three non-leukaemic individuals where CD19 cells were positively selected using FACS. A high degree of overlap between peaks was observed. (B) Peaks from matching leukaemic and remission samples from individual 135. Although there are some overlapping peaks (183), there are a substantial number of distinct peaks in each sample. (C) The extent of overlapping peaks between 3 leukaemic samples. (D) The extent of overlapping peaks between 3 remission samples.

When comparing DNA methylation enrichment peaks between leukaemic and remission samples (tumour versus normal) from the same individual, distinct enrichment peaks are seen; these are likely to correlate with disease state (Figure 1B). The number of overlapping peaks among the leukaemic and remission samples was lower than in the haematopoietic cell analyses (Figure 1C and 1D) and could be indicative of differences in sample quality.
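The overlap counts summarised in the Venn diagrams of Figure 1 can be approximated with a simple interval comparison. The sketch below is illustrative only: it assumes peaks are held as (chromosome, start, end) tuples with half-open coordinates, and the function name and toy peaks are hypothetical rather than taken from the MACS output of this study.

```python
def count_overlapping_peaks(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b.

    Each peak is a (chrom, start, end) tuple with half-open coordinates,
    so two peaks overlap when each starts before the other ends.
    """
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    overlapping = 0
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom.get(chrom, []):
            if start < b_end and b_start < end:
                overlapping += 1
                break  # count each peak in peaks_a at most once
    return overlapping


# Toy peak sets (illustrative only, not from the dataset).
leuk = [("chr21", 100, 200), ("chr21", 500, 600), ("chr21", 900, 1000)]
rem = [("chr21", 150, 250), ("chr21", 2000, 2100)]

shared = count_overlapping_peaks(leuk, rem)
leuk_only = len(leuk) - shared
rem_only = len(rem) - count_overlapping_peaks(rem, leuk)
print(shared, leuk_only, rem_only)
```

Note that the count is not symmetric in general, because one wide peak in one sample may overlap several narrow peaks in the other; published Venn diagrams usually resolve this by merging overlapping regions first.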

For each of the samples analysed in this study, we have generated track hubs that can be uploaded and visualised on the UCSC Genome Browser. This permits the immediate visualisation of regions of differential DNA methylation with potential biological significance. Moreover, we have performed Infinium analysis on these samples, and visualisation using the Genome Browser permits direct comparison to other publicly available data such as The Cancer Genome Atlas (TCGA) [9] and TARGET (Therapeutically Applicable Research to Generate Effective Treatments) [10]. This also permits further analysis and comparison to publicly available data using the Galaxy [11,12] and Cistrome [13] web servers.
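For readers unfamiliar with track hubs, the sketch below generates the three plain-text files that a minimal UCSC track hub consists of: hub.txt, genomes.txt and a trackDb.txt for the assembly. It is not a description of the hubs deposited with this dataset; the hub name, labels, email address and bigWig file name are hypothetical placeholders.

```python
def build_track_hub(hub_name, short_label, long_label, email, tracks,
                    genome="hg19"):
    """Return a dict mapping hub file paths to their text content.

    `tracks` is a list of dicts with the keys name, url, short_label,
    long_label and type (for example "bigWig" or "bigBed").
    """
    hub_txt = (
        f"hub {hub_name}\n"
        f"shortLabel {short_label}\n"
        f"longLabel {long_label}\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )
    genomes_txt = f"genome {genome}\ntrackDb {genome}/trackDb.txt\n"

    stanzas = []
    for track in tracks:
        stanzas.append(
            f"track {track['name']}\n"
            f"bigDataUrl {track['url']}\n"
            f"shortLabel {track['short_label']}\n"
            f"longLabel {track['long_label']}\n"
            f"type {track['type']}\n"
        )
    # Stanzas in trackDb.txt are separated by a blank line.
    trackdb_txt = "\n".join(stanzas)

    return {
        "hub.txt": hub_txt,
        "genomes.txt": genomes_txt,
        f"{genome}/trackDb.txt": trackdb_txt,
    }


# Hypothetical example: one enriched-fraction coverage track.
files = build_track_hub(
    hub_name="all_mbdseq_example",
    short_label="ALL MBD-Seq",
    long_label="Example hub for MBD-Seq coverage tracks",
    email="someone@example.org",
    tracks=[{
        "name": "sample135_leuk_E",
        "url": "135_leuk_E.bw",
        "short_label": "135 leuk E",
        "long_label": "Sample 135 at diagnosis, enriched fraction",
        "type": "bigWig",
    }],
)
print(files["hg19/trackDb.txt"])
```

Once such files are hosted at a public URL, the address of hub.txt can be entered in the Genome Browser's track hub interface to load the tracks.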

In summary, our data represent one of the first DNA methylation enrichment analyses using SOLiD™ MBD-Seq on archival bone marrow smears from children diagnosed with ALL. Such specimens are readily available in most pathology laboratories across the world and are amenable to genomic-scale analysis, as we have demonstrated here. These data should prove valuable for other DNA methylation studies in childhood ALL and in haematopoietic cell development.

Availability of supporting data

Supporting data are available from the GigaScience Database, GigaDB [14], and at NCBI under BioProject PRJNA272864.

Data file details

  • SRA files included in BioProject PRJNA272864

  • MACS and HOMER output files of peaks and peak locations

  • Track Hubs for UCSC Genome Browser
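As a starting point for working with the peak location files, the sketch below parses BED-style lines and tallies peaks per chromosome. It assumes a tab-separated layout of chromosome, start, end and an optional name column; the exact columns of the deposited MACS and HOMER outputs may differ, and the function names and example lines are hypothetical.

```python
def parse_bed_peaks(lines):
    """Parse BED-style lines into (chrom, start, end, name) tuples.

    Blank lines, comments and UCSC "track"/"browser" header lines
    are skipped. Only the first four columns are used.
    """
    peaks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else ""
        peaks.append((chrom, start, end, name))
    return peaks


def peaks_per_chromosome(peaks):
    """Count peaks on each chromosome."""
    counts = {}
    for chrom, _start, _end, _name in peaks:
        counts[chrom] = counts.get(chrom, 0) + 1
    return counts


# Toy input (illustrative only, not from the dataset).
example_lines = [
    "track name=example_peaks",
    "chr21\t9825000\t9826200\tpeak_1\t85.2",
    "chr21\t10990000\t10991100\tpeak_2\t60.0",
    "chr22\t17000000\t17000900\tpeak_3\t72.4",
]
parsed = parse_bed_peaks(example_lines)
print(peaks_per_chromosome(parsed))
```

For a real file, the same function can be called on an open file handle, since iterating over a handle yields one line at a time.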



Abbreviations

ALL: Acute lymphoblastic leukaemia

HOMER: Hypergeometric Optimization of Motif EnRichment

MACS: Model-based Analysis for ChIP-Seq

MBD: Methyl-binding domain


  1. Serre D, Lee B, Ting A. MBD-isolated Genome Sequencing provides a high-throughput and comprehensive survey of DNA methylation in the human genome. Nucleic Acids Res. 2010; 38(2):391–9. doi: 10.1093/nar/gkp992.

  2. Voullaire L, Saffery R, Davies J, Earle E, Kalitsis P, Slater H, et al. Trisomy 20p resulting from inverted duplication and neocentromere formation. Am J Med Genet. 1999; 85(4):403–8.

  3. Wong NC, Ashley D, Chatterton Z, Parkinson-Bates M, Ng H-K, Halemba MS, et al. A distinct DNA methylation signature defines pediatric pre-B cell acute lymphoblastic leukemia. Epigenetics : Official J DNA Methylation Soc. 2012; 7(6):535–41.

  4. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008; 9(9):137.

  5. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Molecular Cell. 2010; 38(4):576–89.

  6. HOMER. Homer.

  7. Aplenc R, Orudjev E, Swoyer J, Manke B, Rebbeck T. Leukemia : Official J Leuk Soc Am Leuk Res Fund, UK. 2002; 16(9):1865–6.

  8. Chatterton Z, Morenos L, Mechinaud F, Ashley DM, Craig JM, Sexton-Oates A, et al. Epigenetic deregulation in pediatric acute lymphoblastic leukemia. Epigenetics : Official J DNA Methylation Soc. 2014; 9(3):459–67.

  9. Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013; 45(10):1113–20.

  10. TARGET. Target.

  11. Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, et al. Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 2005; 15(10):1451–5.

  12. Galaxy. Galaxy.

  13. Liu T, Ortiz JA, Taing L, Meyer CA, Lee B, Zhang Y, et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol. 2011; 12(8):83.

  14. Wong NCL, Meredith GD, Marnellos G, Dudas M, Parkinson-Bates M, Halemba MS, et al. Supporting material for: Paediatric Leukaemia DNA methylation profiling using MBD enrichment and SOLiD Sequencing on archival bone marrow smears. GigaScience Database.


The authors would like to thank Mike Payne and Ivonne Petermann from Life Technologies for technical support. We also thank Dr Fernando Rosello (Monash University) for his assistance in running the LifeScope Alignments. NCLW was supported by the Leukaemia Foundation, My Room, Victorian Cancer Agency and the National Health and Medical Research Council. RS is supported by a Senior Research Fellowship, National Health and Medical Research Council. The Murdoch Childrens Research Institute is supported by the Victorian Government Operational and Infrastructure Support Grant. Computational analysis was performed on the Peak Computing Facility at the Victorian Life Sciences Computational Initiative (VLSCI) under project VR0002.

Author information

Corresponding author

Correspondence to Nicholas CL Wong.

Additional information

Competing interests

Wong NCL, Parkinson-Bates M, Halemba MS, Chatterton Z, Ashley DM, Mechinaud F, Craig JM and Saffery R declare that they have no competing interests. Meredith GD and Dudas M are paid employees of Life Technologies. Marnellos G was a paid employee of Life Technologies and now is a paid employee of Harvard University.

Authors’ contributions

NCLW, DMA, JMC and RS formulated and designed the experiments in consultation with GDM. NCLW, MPB, MSH, ZC and MD performed the experiments. DMA and FM provided clinical input in this study. NCLW, JM and GM analysed the sequencing data. NCLW wrote the first draft of this paper. All authors assisted in the final draft, have read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

About this article

Cite this article

Wong, N.C., Meredith, G.D., Marnellos, G. et al. Paediatric leukaemia DNA methylation profiling using MBD enrichment and SOLiD sequencing on archival bone marrow smears. GigaSci 4, 11 (2015).
