GeneSet Information

Tier IV GS357320 • Epilepsy dominant genes from patients from three disease panels: myopathy, epilepsy and RASopathies

from Publication Assignment: 206

DESCRIPTION:

This study used the Python scikit-learn machine learning library to train a logistic regression model to predict the pathogenicity of missense variants from clinical panels. Variants were classified by two clinical labs using standard variant interpretation protocols based on ACMG/AMP guidelines. For this gene set, a subset of epilepsy dominant genes was also considered. These genes account for a large number of pathogenic epilepsy variants and, because they follow a dominant inheritance pattern, may have distinct characteristics that affect variant prediction relative to all other epilepsy genes. All genes in this gene set were cross-checked with HGNC.
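As a rough illustration of the approach described above, the following minimal sketch trains a scikit-learn logistic regression model to score variants as pathogenic or benign. It is not the study's pipeline: the data are synthetic, and the two feature columns (a hypothetical regional constraint score and a hypothetical variant-level score), the labelling rule, and all variable names are assumptions made only for the example. Average precision is computed because it is the metric reported in the publication's abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): NOT taken from the study.
rng = np.random.default_rng(0)
n_variants = 400

# Two hypothetical features per missense variant, e.g. a regional
# constraint score and a variant-level score.
X = rng.normal(size=(n_variants, 2))

# Labels: 1 = pathogenic, 0 = benign, drawn from a made-up linear rule plus noise.
signal = 1.5 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_variants)
y = (signal > 0).astype(int)

# Hold out a quarter of the variants for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Predicted probability that each held-out variant is pathogenic.
pathogenic_prob = model.predict_proba(X_test)[:, 1]

# Average precision summarizes the precision-recall curve on the held-out set.
ap = average_precision_score(y_test, pathogenic_prob)
print(f"Average precision on held-out synthetic variants: {ap:.3f}")
```

With real data, `X` and `y` would be replaced by features and clinical classifications for the variants in a given disease panel, and a separate model could be fitted per panel to mirror the disease-specific design.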

LABEL:

Epilepsy dominant genes

SCORE TYPE:

P-Value

DATE ADDED:

2019-08-28

DATE UPDATED:

2024-04-25

SPECIES:

AUTHORS:

Perry Evans, Chao Wu, Amanda Lindy, Dianalee A McKnight, Matthew Lebo, Mahdi Sarmady, Ahmad N Abou Tayoun

TITLE:

Genetic variant pathogenicity prediction trained using disease-specific clinical sequencing data sets.

JOURNAL:

Genome Research 07 2019, Vol 29, pp. 1144-1151

ABSTRACT:

Recent advances in DNA sequencing have expanded our understanding of the molecular basis of genetic disorders and increased the utilization of clinical genomic tests. Given the paucity of evidence to accurately classify each variant and the difficulty of experimentally evaluating its clinical significance, a large number of variants generated by clinical tests are reported as variants of unknown clinical significance. Population-scale variant databases can improve clinical interpretation. Specifically, pathogenicity prediction for novel missense variants can use features describing regional variant constraint. Constrained genomic regions are those that have an unusually low variant count in the general population. Computational methods have been introduced to capture these regions and incorporate them into pathogenicity classifiers, but these methods have yet to be compared on an independent clinical variant data set. Here, we introduce one variant data set derived from clinical sequencing panels and use it to compare the ability of different genomic constraint metrics to determine missense variant pathogenicity. This data set is compiled from 17,071 patients surveyed with clinical genomic sequencing for cardiomyopathy, epilepsy, or RASopathies. We further use this data set to demonstrate the necessity of disease-specific classifiers and to train PathoPredictor, a disease-specific ensemble classifier of pathogenicity based on regional constraint and variant-level features. PathoPredictor achieves an average precision >90% for variants from all 99 tested disease genes while approaching 100% accuracy for some genes. The accumulation of larger clinical variant training data sets can significantly enhance their performance in a disease- and gene-specific manner.

PUBMED:

31235655

Annotation Information

No sequence read archive data associated with this GeneSet.


No annotations are associated with this GeneSet.

Gene List • 12 Genes

Uploaded As Gene Symbol Homology Score Priority LinkOuts Emphasis