Poster Presentation 39th Annual Lorne Genome Conference 2018

A general framework for evaluating cross-platform concordance in genomic studies (#229)

Timothy J Peters 1, Hugh J French 2, Stephen T Bradford 1, Ruth Pidsley 1, Susan J Clark 1, Terry Speed 3
  1. Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  2. South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales, Liverpool, NSW, Australia
  3. Bioinformatics Division, Walter & Eliza Hall Institute, Parkville, VIC, Australia

The reproducibility of scientific results from multiple sources is critical to the establishment of scientific doctrine. However, when characterising genomic features (transcript/gene abundances, methylation levels, allele frequencies and the like), every measurement from any given technology is an estimate and so retains some degree of error. Hence designating any single process as a “gold standard” is dangerous, since all subsequent measurement comparisons will be biased towards that standard.

In the absence of a “gold standard” we instead empirically assess the precision and sensitivity of a large suite of genomic technologies via a consensus modelling method called the row-linear model. This method is an application of the American Society for Testing and Materials (ASTM) Standard E691 for assessing interlaboratory precision and sources of variability across multiple testing sites. We analyse three datasets (two RNA expression, one DNA methylation), each containing both sequencing and array technologies, allowing a direct per-technology, per-locus comparison of sensitivity and precision across all common loci. We assess the performance of a number of technologies including the Infinium MethylationEPIC BeadChip, whole genome bisulfite sequencing (WGBS), two different RNA-Seq protocols (PolyA+ and Ribo-Zero) and five different gene expression array platforms.
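
As a rough illustration of the idea, and not the authors' implementation: in one common formulation of the row-linear model, each platform (one row of a table) is regressed on the column-wise consensus, that is, the mean across platforms. The slope can then be read as the platform's sensitivity relative to the consensus, and the residual scatter as its imprecision. The function name `row_linear_fit`, the platforms-by-samples layout and the toy values are all assumptions made for this sketch.

```python
import numpy as np


def row_linear_fit(values):
    """Fit a row-linear model to a (n_platforms, n_samples) table.

    Hypothetical helper: each row is regressed on the centred column means.
    Returns per-platform slopes, heights and residual standard deviations.
    Needs at least three columns, since two parameters are fitted per row.
    """
    values = np.asarray(values, dtype=float)
    consensus = values.mean(axis=0)              # column means across platforms
    centred = consensus - consensus.mean()       # centre the consensus
    sxx = np.sum(centred ** 2)

    slopes = values @ centred / sxx              # least-squares slope per platform
    heights = values.mean(axis=1)                # fitted value at the mean consensus
    fitted = heights[:, None] + slopes[:, None] * centred[None, :]

    residuals = values - fitted
    dof = values.shape[1] - 2                    # slope and height estimated per row
    resid_sd = np.sqrt(np.sum(residuals ** 2, axis=1) / dof)
    return slopes, heights, resid_sd


# Toy data (invented): three platforms measuring the same five samples.
toy = np.array([
    [1.0, 2.1, 2.9, 4.2, 5.0],
    [0.6, 1.0, 1.7, 2.0, 2.4],
    [1.4, 3.2, 4.3, 6.1, 7.6],
])
slopes, heights, resid_sd = row_linear_fit(toy)
for i, (b, a, s) in enumerate(zip(slopes, heights, resid_sd)):
    print(f"platform {i}: slope={b:.3f} height={a:.3f} residual SD={s:.3f}")
```

Because the consensus is the mean of the rows, the slopes average to one by construction, so sensitivity in this sketch is always relative to the other platforms in the table rather than to any external standard.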

We implement and showcase a number of applications of the row-linear model, including direct comparisons of the sensitivity and precision of these platforms, correlation with known interfering traits related to probe and target biochemistry such as GC content, CDS length and cross-hybridisation, and the effect of normalisation on DNA methylation arrays. Our findings demonstrate the utility of the row-linear model in revealing varying levels of concordance between measurements on these platforms, and it can serve as a process for identifying reproducibility caveats in studies where cross-platform validation is performed.