TY - JOUR
T1 - A gene-to-patient approach uplifts novel disease gene discovery and identifies 18 putative novel disease genes
AU - Seaby, Eleanor G
AU - Smedley, Damian
AU - Taylor Tavares, Ana Lisa
AU - Brittain, Helen
AU - van Jaarsveld, Richard H
AU - Baralle, Diana
AU - Rehm, Heidi L
AU - O'Donnell-Luria, Anne
AU - Ennis, Sarah
N1 - Funding Information:
This research was made possible through access to the data and findings generated by the 100,000 Genomes Project. The 100,000 Genomes Project is managed by Genomics England Limited (a wholly owned company of the Department of Health and Social Care). The 100,000 Genomes Project is funded by the National Institute for Health Research and NHS England. The Wellcome Trust, Cancer Research UK, and Medical Research Council have also funded research infrastructure. The 100,000 Genomes Project uses data provided by patients and collected by the National Health Service as part of their care and support. We would further like to extend our thanks to all the patients and their families for participation in the 100,000 Genomes Project. We are grateful to the Genome Aggregation Database teams for their helpful discussions in the development and application of constraint metrics in novel gene discovery. E.G.S. was supported by the Kerkut Charitable Trust and University of Southampton's Presidential Scholarship Award. H.L.R. was supported by the National Human Genome Research Institute (NHGRI) (U24 HG011450 and U41 HG006834). A.O.-L. was supported by the National Institute of Mental Health (U01 MH119689) and the Manton Center for Orphan Disease Research Scholar Award. E.G.S., H.L.R., and A.O.-L. were supported by NHGRI, the National Eye Institute, and the National Heart, Lung and Blood Institute (UM1 HG008900). D.B. was generously supported by a National Institute of Health Research Research Professorship (RP-2016-07-011).
Publisher Copyright:
© 2022 The Authors
PY - 2022/8
Y1 - 2022/8
N2 - PURPOSE: Exome and genome sequencing have drastically accelerated novel disease gene discoveries. However, discovery is still hindered by myriad variants of uncertain significance found in genes of undetermined biological function. This necessitates intensive functional experiments on genes of equal predicted causality, leading to a major bottleneck. METHODS: We apply the loss-of-function observed/expected upper-bound fraction metric of intolerance to gene inactivation to curate a list of predicted haploinsufficient disease genes. Using data from the 100,000 Genomes Project, we adopt a gene-to-patient approach that matches de novo loss-of-function variants in constrained genes to patients with rare disease. Through large-scale aggregation of data, we reduce excess analytical noise currently hindering novel discoveries. RESULTS: Results from 13,949 trios revealed 643 rare, de novo predicted loss-of-function events filtered from 1044 loss-of-function observed/expected upper-bound fraction-constrained genes. A total of 168 variants occurred within 126 genes without a known disease-gene relationship. Of these, 27 genes had >1 kindred affected, and for 18 of these genes, multiple kindreds had overlapping phenotypes. Two years after initial analysis, 11 of 18 (61%) of these genes have been independently published as novel disease gene discoveries. CONCLUSION: Using large cohorts and adopting gene-based approaches can rapidly and objectively accelerate dominantly inherited novel gene discovery by targeting the most appropriate genes for functional validation.
AB - PURPOSE: Exome and genome sequencing have drastically accelerated novel disease gene discoveries. However, discovery is still hindered by myriad variants of uncertain significance found in genes of undetermined biological function. This necessitates intensive functional experiments on genes of equal predicted causality, leading to a major bottleneck. METHODS: We apply the loss-of-function observed/expected upper-bound fraction metric of intolerance to gene inactivation to curate a list of predicted haploinsufficient disease genes. Using data from the 100,000 Genomes Project, we adopt a gene-to-patient approach that matches de novo loss-of-function variants in constrained genes to patients with rare disease. Through large-scale aggregation of data, we reduce excess analytical noise currently hindering novel discoveries. RESULTS: Results from 13,949 trios revealed 643 rare, de novo predicted loss-of-function events filtered from 1044 loss-of-function observed/expected upper-bound fraction-constrained genes. A total of 168 variants occurred within 126 genes without a known disease-gene relationship. Of these, 27 genes had >1 kindred affected, and for 18 of these genes, multiple kindreds had overlapping phenotypes. Two years after initial analysis, 11 of 18 (61%) of these genes have been independently published as novel disease gene discoveries. CONCLUSION: Using large cohorts and adopting gene-based approaches can rapidly and objectively accelerate dominantly inherited novel gene discovery by targeting the most appropriate genes for functional validation.
KW - Exome/genetics
KW - Genetic Association Studies
KW - Humans
KW - Phenotype
KW - Whole Exome Sequencing
KW - Novel gene discovery
KW - Genome sequencing
KW - Diagnostic uplift
KW - Disease genes
KW - Mendelian disease
UR - http://www.scopus.com/inward/record.url?scp=85132664078&partnerID=8YFLogxK
U2 - 10.1016/j.gim.2022.04.019
DO - 10.1016/j.gim.2022.04.019
M3 - Article
C2 - 35532742
SN - 1098-3600
VL - 24
SP - 1697
EP - 1707
JO - Genetics in medicine : official journal of the American College of Medical Genetics
JF - Genetics in medicine : official journal of the American College of Medical Genetics
IS - 8
ER -