Poster Presentation 39th Annual Lorne Genome Conference 2018

Developing a bioinformatics pipeline to measure changes in ribosomal RNA gene copy number in cancer (#245)

Diksha Sharma 1 , Sylvie Hermann Le Denmat 1 2 3 , Justin O'Sullivan 4 , Katherine M. Hannan 5 , Ross D. Hannan 5 , Austen R.D. Ganley 1 6
  1. School of Biological Sciences, University of Auckland, Auckland, New Zealand
  2. Ecole Normale Supérieure-Paris, Paris, France
  3. Institute for Integrative Biology of the Cell, Paris, France
  4. Liggins Institute, University of Auckland, Auckland, New Zealand
  5. Department of Cancer Biology and Therapeutics, Australian National University, Canberra, Australia
  6. Maurice Wilkins Centre for Molecular Biodiscovery, University of Auckland, Auckland, New Zealand

 

Cancer is a leading cause of death worldwide. A hallmark of cancer is abnormal nucleolar morphology. Nucleoli house the ribosomal RNA genes (rDNA) that encode ribosomal RNA, the major structural and catalytic component of ribosomes. In eukaryotes, the rDNA is organized as tandem arrays of repeats that exhibit a high degree of variability in copy number within and between species.

Recent evidence suggests that rDNA copy number changes in malignancy. However, determining rDNA copy number in mammals is technically challenging, and most methods have not been properly validated. The currently dominant method uses whole genome sequencing read coverage (depth) as a proxy for rDNA copy number. This method assumes that the average coverage represents the true coverage value for both the rDNA and the whole genome. However, some regions of the rDNA repeat unit show unusually high or low coverage, presumably because of interspersed repeats (such as Alu elements), tandem repeats (such as microsatellites), and sequencing bias. This coverage variability may skew the average coverage and therefore give an inaccurate estimate of rDNA copy number.
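The sensitivity of an average-based estimate to a few aberrant positions can be illustrated with a minimal Python sketch. It assumes that copy number is taken as the ratio of mean rDNA depth to mean depth over a single-copy genomic background; the function name and the toy depth values are hypothetical and are not taken from any published pipeline or data set.

```python
from statistics import mean


def mean_coverage_copy_number(rdna_depths, genome_depths):
    """Estimate rDNA copy number as mean rDNA depth / mean genome depth.

    Assumption: both inputs are per-base read depths, and the genome
    depths come from single-copy regions.
    """
    return mean(rdna_depths) / mean(genome_depths)


# Toy per-base depths, invented for illustration only.
genome_depths = [30, 31, 29, 30, 30, 32, 28, 30]  # background at about 30x

# rDNA unit at about 3000x, i.e. roughly 100 copies.
rdna_clean = [3000, 3010, 2990, 3000, 3000, 3000, 3005, 2995]

# Same unit, but two positions are inflated, as might happen where
# reads from repeats elsewhere in the genome pile up.
rdna_with_repeats = [3000, 3010, 2990, 3000, 9000, 12000, 3005, 2995]

clean_estimate = mean_coverage_copy_number(rdna_clean, genome_depths)
skewed_estimate = mean_coverage_copy_number(rdna_with_repeats, genome_depths)

print(clean_estimate)   # 100.0
print(skewed_estimate)  # 162.5
```

In this toy case, two inflated positions out of eight are enough to shift the estimate from 100 to 162.5 copies.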

To overcome these limitations, we have developed an approach that uses the most frequent coverage value to calculate copy number. The method is based on the assumption that the most frequent coverage value represents the true coverage value. We have validated our method using yeast strains with varying, known rDNA copy numbers. We will apply this validated method to paired cancer-normal whole genome sequence data to measure the variation in rDNA copy number between different individuals and between different tissue types. This will enable us to establish the normal human variation in rDNA copy number, and to assess whether the rDNA copy number changes previously observed in malignant cells fall outside this normal range of variation.
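The idea of using the most frequent coverage value can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pipeline itself: it assumes copy number is the ratio of the modal rDNA depth to the modal depth of a single-copy genomic background, and the function names, the `bin_width` parameter, and the toy depth values are all hypothetical.

```python
from collections import Counter


def modal_coverage(depths, bin_width=1):
    """Return the most frequent depth, after grouping depths into bins.

    bin_width is a hypothetical parameter: at high coverage, exact depth
    values rarely repeat, so depths are grouped into bins and the lower
    edge of the most populated bin is returned.
    """
    binned = [(d // bin_width) * bin_width for d in depths]
    return Counter(binned).most_common(1)[0][0]


def modal_copy_number(rdna_depths, genome_depths, bin_width=1):
    """Estimate copy number as modal rDNA depth / modal genome depth."""
    return modal_coverage(rdna_depths, bin_width) / modal_coverage(genome_depths)


# Toy per-base depths, invented for illustration only.
genome_depths = [30, 31, 29, 30, 30, 32, 28, 30]  # background at about 30x

# rDNA unit at about 3000x with two repeat-inflated positions.
rdna_depths = [3000, 3010, 2990, 3000, 9000, 12000, 3000, 2995]

estimate = modal_copy_number(rdna_depths, genome_depths)
print(estimate)  # 100.0
```

Because the mode ignores how extreme the outlying positions are, the two inflated values in the toy data do not move the estimate. On real data the choice of bin width would need care, since it determines which value counts as the most frequent.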