The Genome Aggregation Database (gnomAD; http://gnomad.broadinstitute.org) contains 15,496 whole-genome sequences from unrelated individuals, which are used to aid variant interpretation in human disease. However, variants in these genomes have so far been called only for the 22 autosomes and chromosome X, omitting the mitochondrial genome (mtDNA). Mitochondria are maternally inherited organelles, each containing many copies of the 16,569 bp mtDNA. The mtDNA encodes 13 proteins, all involved in oxidative phosphorylation, which generates ~90% of cellular ATP. Over 400 mutations in mtDNA have been associated with human disease, and identifying mtDNA variants across thousands of gnomAD genomes would give researchers a resource for assessing the frequency of mitochondrial variants and interpreting them.
With the goal of including mtDNA variants in future releases of gnomAD, we have started a pilot study to identify and assess mtDNA variants in over 15,000 gnomAD genomes. The mtDNA is present in thousands of copies per cell, and the gnomAD genomes typically have >5,000x mtDNA coverage. However, a specialized variant-calling methodology is required because mtDNA variants can exist at any level of heteroplasmy (the percentage of mtDNA molecules carrying the variant). To call mitochondrial variants, we ran mtDNA-Server (Weissensteiner et al. 2016), a tool optimized for identifying mitochondrial variants, detecting contamination, and determining haplogroups (maternal-line ancestry). We identified low-heteroplasmy, high-heteroplasmy, and homoplasmic variants among gnomAD individuals. For quality control, we assessed maternal transmission rates in mother-child pairs and variant concordance in sample duplicates. All individuals were also classified into haplogroups, which aids in detecting contamination and provides ancestry information that complements population assignment based on the nuclear genomes. Including mtDNA variants in future releases of gnomAD will provide critical population frequency information to help researchers understand the role of mitochondrial variation in health and disease.
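
To make the heteroplasmy categories concrete, the following minimal Python sketch computes a heteroplasmy level from read counts at a single mtDNA position and assigns it to one of the three classes named above. It is an illustration only, not the method used by mtDNA-Server or by this study: the `SiteCounts` type, the `classify` function, and all three cut-offs (`MIN_CALL`, `LOW_MAX`, `HOMOPLASMIC_MIN`) are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration only; they are NOT taken from
# the study or from mtDNA-Server.
MIN_CALL = 0.01          # below this fraction, treat the site as not called
LOW_MAX = 0.10           # below this fraction, call "low heteroplasmy"
HOMOPLASMIC_MIN = 0.95   # at or above this fraction, call "homoplasmic"


@dataclass
class SiteCounts:
    """Read counts supporting the reference and variant allele at one position."""
    position: int
    ref_reads: int
    alt_reads: int


def heteroplasmy_fraction(site: SiteCounts) -> float:
    """Fraction of reads (a proxy for mtDNA molecules) carrying the variant."""
    total = site.ref_reads + site.alt_reads
    if total == 0:
        raise ValueError(f"no coverage at position {site.position}")
    return site.alt_reads / total


def classify(site: SiteCounts) -> str:
    """Assign a site to one of the illustrative heteroplasmy classes."""
    level = heteroplasmy_fraction(site)
    if level < MIN_CALL:
        return "not called"
    if level >= HOMOPLASMIC_MIN:
        return "homoplasmic"
    if level < LOW_MAX:
        return "low heteroplasmy"
    return "high heteroplasmy"


if __name__ == "__main__":
    # Made-up read counts at roughly 5,000x coverage.
    for site in (SiteCounts(3243, 4750, 250), SiteCounts(73, 10, 4990)):
        print(site.position, round(heteroplasmy_fraction(site), 3), classify(site))
```

At >5,000x coverage, even a variant present in 1% of molecules would be supported by about 50 reads, which is why low-level heteroplasmy can be detected in these genomes. In practice, where to place the lower calling threshold depends on sequencing error and contamination rates.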