Catalogue Search | MBRL

Genome sequencing and implications for rare disorders

by Posey, Jennifer E. in Chromosomes; Clinical genetics and genomics; Diagnostic utility

2019

The practice of genomic medicine stands to revolutionize our approach to medical care, and to realize this goal will require discovery of the relationship between rare variation at each of the ~20,000 protein-coding genes and their consequent impact on individual health and expression of Mendelian disease. The step-wise evolution of broad-based, genome-wide cytogenetic and molecular genomic testing approaches (karyotyping, chromosomal microarray [CMA], exome sequencing [ES]) has driven much of the rare disease discovery to this point, with genome sequencing representing the newest member of this team. Each step has brought increased sensitivity to interrogate individual genomic variation in an unbiased method that does not require clinical prediction of the locus or loci involved. Notably, each step has also brought unique limitations in variant detection, for example, the low sensitivity of ES for detection of triploidy, and of CMA for detection of copy neutral structural variants. The utility of genome sequencing (GS) as a clinical molecular diagnostic test, and the increased sensitivity afforded by addition of long-read sequencing or other -omics technologies such as RNAseq or metabolomics, are not yet fully explored, though recent work supports improved sensitivity of variant detection, at least in a subset of cases. The utility of GS will also rely upon further elucidation of the complexities of genetic and allelic heterogeneity, multilocus rare variation, and the impact of rare and common variation at a locus, as well as advances in functional annotation of identified variants. Much discovery remains to be done before the potential utility of GS is fully appreciated.

Journal Article

Resolution of Disease Phenotypes Resulting from Multilocus Genomic Variation

by Bi, Weimin; Ding, Yan; Plon, Sharon E in Data processing; Diagnosis; Exome

2017

Of over 7000 patients referred to a diagnostic laboratory, 28% had diagnoses based on DNA sequencing, 5% of whom had two or more diagnoses. Their phenotypes could be better understood by considering whether the implicated genes affect independent biologic processes or organ systems. Medical genetics focuses on the relationship between observed phenotypes and their underlying genotypes, modes of transmission, and risks of recurrence. Expected patterns of mendelian inheritance are often used to confirm the identification of disease genes, and deviations from mendelian expectations have led to the discovery of more complicated genetic underpinnings of disease (Fig. S1 in the Supplementary Appendix, available with the full text of this article at NEJM.org) [1–8]. Multiple (or dual) molecular diagnoses involve more than one clinical diagnosis and more than one genetic locus (Figure 1), each segregating independently. Diagnostic whole-exome sequencing affords opportunities for providing insights into relationships . . .

Journal Article

Phenotypic expansion illuminates multilocus pathogenic variation

by Yesil, Gozde; Gibbs, Richard A.; Karaca, Ender in Biomedical and Life Sciences; Biomedicine; Child, Preschool

2018

Multilocus variation—pathogenic variants in two or more disease genes—can potentially explain the underlying genetic basis for apparent phenotypic expansion in cases for which the observed clinical features extend beyond those reported in association with a “known” disease gene. Analyses focused on 106 patients, 19 for whom apparent phenotypic expansion was previously attributed to variation at known disease genes. We performed a retrospective computational reanalysis of whole-exome sequencing data using stringent Variant Call File filtering criteria to determine whether molecular diagnoses involving additional disease loci might explain the observed expanded phenotypes. Multilocus variation was identified in 31.6% (6/19) of families with phenotypic expansion and 2.3% (2/87) without phenotypic expansion. Intrafamilial clinical variability within two families was explained by multilocus variation identified in the more severely affected sibling. Our findings underscore the role of multiple rare variants at different loci in the etiology of genetically and clinically heterogeneous cohorts. Intrafamilial phenotypic and genotypic variability allowed a dissection of genotype–phenotype relationships in two families. Our data emphasize the critical role of the clinician in diagnostic genomic analyses and demonstrate that apparent phenotypic expansion may represent blended phenotypes resulting from pathogenic variation at more than one locus.

Journal Article

Improving automated deep phenotyping through large language models using retrieval-augmented generation

by Westerfield, Lauren; Gogate, Nikhita; Dawood, Moez in Application programming interface; Artificial intelligence; Automation

2025

Background: Diagnosing rare genetic disorders relies on precise phenotypic and genotypic analysis, with the Human Phenotype Ontology (HPO) providing a standardized language for capturing clinical phenotypes. Rule-based HPO extraction tools use concept recognition to automatically identify phenotypes, but they often struggle with incomplete phenotype assignment, requiring significant manual review. While large language models (LLMs) hold promise for more context-driven phenotype extraction, they are prone to errors and “hallucinations,” making them less reliable without further refinement. We present RAG-HPO, a Python-based tool that leverages retrieval-augmented generation (RAG) to elevate accuracy of HPO term assignment by LLM. This approach bypasses the limitations of baseline models and eliminates the need for time- and resource-intensive fine-tuning. RAG-HPO integrates a dynamic vector database, containing >54,000 phenotypic phrases mapped to HPO IDs, which allows real-time retrieval and contextual matching. The RAG-HPO workflow begins by extracting phenotypic phrases from clinical text via an LLM and then matching them via semantic similarity to entries within the database. The best term matches are returned to the LLM as context for final HPO term assignment of each phrase. Results: Performance was benchmarked on 112 published case reports with 1792 manually assigned HPO terms and compared to Doc2HPO, ClinPhen, and FastHPOCR. In evaluations, RAG-HPO + LLaMa-3.1 70B achieved a mean precision of 0.81, recall of 0.76, and an F1 score of 0.78, significantly surpassing conventional tools (p < 0.00001). RAG-HPO returned 1648 terms, of which 19.1% (315) were false positives that did not exactly match our manually annotated standard. Among these, < 1% (1/315) represented hallucinations, and 1.3% (4/315) represented terms with no ontological relationship to the desired target; the remaining false positives (95.2%, 300/315) were broader ancestor terms of the target term, which may still be relevant to users in many contexts. Conclusions: RAG-HPO is a user-friendly, adaptable tool designed for secure evaluation of clinical text and outperforms standard HPO-matching tools in precision, recall, and F1. Its enhanced precision and recall represent a substantial advancement in phenotypic analysis, accelerating the identification of genetic mechanisms underlying rare diseases and driving progress in genetic research and clinical genomics. RAG-HPO is available at https://github.com/PoseyPod/RAG-HPO.

Journal Article
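
The retrieval step described in the RAG-HPO abstract above (matching an extracted phrase to database entries by similarity and returning the best matches as context) can be illustrated with a minimal sketch. This is not the RAG-HPO implementation: the six-entry `PHRASE_DB`, the function names, and the use of bag-of-words cosine similarity in place of semantic (embedding) similarity are all assumptions made for illustration, and the LLM extraction and final assignment steps are omitted.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for the phrase database; the abstract describes
# a vector database of >54,000 phrases mapped to HPO IDs.
PHRASE_DB = {
    "seizure": "HP:0001250",
    "recurrent seizures": "HP:0001250",
    "intellectual disability": "HP:0001249",
    "small head circumference": "HP:0000252",
    "microcephaly": "HP:0000252",
    "global developmental delay": "HP:0001263",
}


def bag_of_words(text):
    """Lowercased token counts; a crude substitute for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_candidates(phrase, k=3):
    """Return up to k (db_phrase, hpo_id, score) tuples, best match first.

    In a RAG workflow these candidates would be handed back to the LLM
    as context; here they are simply returned.
    """
    query = bag_of_words(phrase)
    scored = [
        (cosine(query, bag_of_words(db_phrase)), db_phrase, hpo_id)
        for db_phrase, hpo_id in PHRASE_DB.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(p, h, round(s, 3)) for s, p, h in scored[:k] if s > 0]


if __name__ == "__main__":
    for extracted in ["frequent seizures", "small head"]:
        print(extracted, "->", retrieve_candidates(extracted))
```

A purely lexical measure like this one misses paraphrases that share no tokens with any database phrase, which is one reason an embedding-based similarity, as the abstract describes, would be preferred in practice.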

Using multiplexed functional data to reduce variant classification inequities in underrepresented populations

by Dawood, Moez; Coyote-Maestas, Willow; Fayer, Shawn in Benign; Bioinformatics; Biomedical and Life Sciences

2024

Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style functional data may help resolve variant classification disparities between populations, especially for Variants of Uncertain Significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource’s Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign and variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were increased in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate in comparison to reclassified VUS from European-like genetic ancestry (p = 9.1e-03) effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.

Journal Article

PhenoDB, GeneMatcher and VariantMatcher, tools for analysis and sharing of sequence data

by Antonescu, Corina; Rodrigues, Eliete da S.; Wohler, Elizabeth in Automation; Clinical genetics and genomics; Collaboration

2021

Background: With the advent of whole exome (ES) and genome sequencing (GS) as tools for disease gene discovery, rare variant filtering, prioritization and data sharing have become essential components of the search for disease genes and variants potentially contributing to disease phenotypes. The computational storage, data manipulation, and bioinformatic interpretation of thousands to millions of variants identified in ES and GS, respectively, is a challenging task. To aid in that endeavor, we constructed PhenoDB, GeneMatcher and VariantMatcher. Results: PhenoDB is an accessible, freely available, web-based platform that allows users to store, share, analyze and interpret their patients’ phenotypes and variants from ES/GS data. GeneMatcher is accessible to all stakeholders as a web-based tool developed to connect individuals (researchers, clinicians, health care providers and patients) around the globe with interest in the same gene(s), variant(s) or phenotype(s). Finally, VariantMatcher was developed to enable public sharing of variant-level data and phenotypic information from individuals sequenced as part of multiple disease gene discovery projects. Here we provide updates on PhenoDB and GeneMatcher applications and implementation and introduce VariantMatcher. Conclusion: Each of these tools has facilitated worldwide data sharing and data analysis and improved our ability to connect genes to phenotypic traits. Further development of these platforms will expand variant analysis, interpretation, novel disease-gene discovery and facilitate functional annotation of the human genome for clinical genomics implementation and the precision medicine initiative.

Journal Article

Multilocus pathogenic variants contribute to intrafamilial clinical heterogeneity: a retrospective study of sibling pairs with neurodevelopmental disorders

by Coban-Akdemir, Zeynep; Gibbs, Richard A.; Bozkurt-Yozgatli, Tugce in Analysis; Biological Variation, Population; Biomedical and Life Sciences

2024

Background: Multilocus pathogenic variants (MPVs) are genetic changes that affect multiple gene loci or regions of the genome, collectively leading to multiple molecular diagnoses. MPVs may also contribute to intrafamilial phenotypic variability between affected individuals within a nuclear family. In this study, we aim to gain further insights into the influence of MPVs on a disease manifestation in individual research subjects and explore the complexities of the human genome within a familial context. Methods: We conducted a systematic reanalysis of exome sequencing data and runs of homozygosity (ROH) regions of 47 sibling pairs previously diagnosed with various neurodevelopmental disorders (NDD). Results: We found siblings with MPVs driven by long ROH regions in 8.5% of families (4/47). The patients with MPVs exhibited significantly higher F_ROH values (p-value = 1.4e-2) and larger total ROH length (p-value = 1.8e-2). Long ROH regions mainly contribute to this pattern; the siblings with MPVs have a larger total size of long ROH regions than their siblings in all families (p-value = 6.9e-3). Whereas the short ROH regions in the siblings with MPVs are lower in total size compared to their sibling pairs with single locus pathogenic variants (p-value = 0.029), and there are no statistically significant differences in medium ROH regions between sibling pairs (p-value = 0.52). Conclusion: This study sheds light on the significance of considering MPVs in families with affected sibling pairs and the role of ROH as an adjuvant tool in explaining clinical variability within families. Identifying individuals carrying MPVs may have implications for disease management, identification of possible disease risks to different family members, genetic counseling and exploring personalized treatment approaches.

Journal Article

Correction: Multilocus pathogenic variants contribute to intrafamilial clinical heterogeneity: a retrospective study of sibling pairs with neurodevelopmental disorders

by Gibbs, Richard A.; Coban-Akdemir, Zeynep; Posey, Jennifer E. in Biomedical and Life Sciences; Biomedicine; Correction

2024

Journal Article

NODAL variants are associated with a continuum of laterality defects from simple D-transposition of the great arteries to heterotaxy

by Dawood, Moez; Bi, Weimin; Fatih, Jawid M. in Alleles; Animals; Arteries

2024

Background: NODAL signaling plays a critical role in embryonic patterning and heart development in vertebrates. Genetic variants resulting in perturbations of the TGF-β/NODAL signaling pathway have reproducibly been shown to cause laterality defects in humans. To further explore this association and improve genetic diagnosis, the study aims to identify and characterize a broader range of NODAL variants in a large number of individuals with laterality defects. Methods: We re-analyzed a cohort of 321 proband-only exomes of individuals with clinically diagnosed laterality congenital heart disease (CHD) using family-based, rare variant genomic analyses. To this cohort we added 12 affected subjects with known NODAL variants and CHD from institutional research and clinical cohorts to investigate an allelic series. For those with candidate contributory variants, variant allele confirmation and segregation analysis were studied by Sanger sequencing in available family members. Array comparative genomic hybridization and droplet digital PCR were utilized for copy number variants (CNV) validation and characterization. We performed Human Phenotype Ontology (HPO)-based quantitative phenotypic analyses to dissect allele-specific phenotypic differences. Results: Missense, nonsense, splice site, indels, and/or structural variants of NODAL were identified as potential causes of heterotaxy and other laterality defects in 33 CHD cases. We describe a recurrent complex indel variant for which the nucleic acid secondary structure predictions implicate secondary structure mutagenesis as a possible mechanism for formation. We identified two CNV deletion alleles spanning NODAL in two unrelated CHD cases. Furthermore, 17 CHD individuals were found (16/17 with known Hispanic ancestry) to have the c.778G>A:p.G260R NODAL missense variant which we propose reclassification from variant of uncertain significance (VUS) to likely pathogenic. Quantitative HPO-based analyses of the observed clinical phenotype for all cases with p.G260R variation, including heterozygous, homozygous, and compound heterozygous cases, reveal clustering of individuals with biallelic variation. This finding provides evidence for a genotypic-phenotypic correlation and an allele-specific gene dosage model. Conclusion: Our data further support a role for rare deleterious variants in NODAL as a cause for sporadic human laterality defects, expand the repertoire of observed anatomical complexity of potential cardiovascular anomalies, and implicate an allele specific gene dosage model.

Journal Article

The multiple de novo copy number variant (MdnCNV) phenomenon presents with peri-zygotic DNA mutational signatures and multilocus pathogenic variation

by Yuan, Bo; Dawood, Moez; Bi, Weimin in Analysis; Bioinformatics; Biomedical and Life Sciences

2022

Background: The multiple de novo copy number variant (MdnCNV) phenotype is described by having four or more constitutional de novo CNVs (dnCNVs) arising independently throughout the human genome within one generation. It is a rare peri-zygotic mutational event, previously reported to be seen once in every 12,000 individuals referred for genome-wide chromosomal microarray analysis due to congenital abnormalities. These rare families provide a unique opportunity to understand the genetic factors of peri-zygotic genome instability and the impact of dnCNV on human diseases. Methods: Chromosomal microarray analysis (CMA), array-based comparative genomic hybridization, short- and long-read genome sequencing (GS) were performed on the newly identified MdnCNV family to identify de novo mutations including dnCNVs, de novo single-nucleotide variants (dnSNVs), and indels. Short-read GS was performed on four previously published MdnCNV families for dnSNV analysis. Trio-based rare variant analysis was performed on the newly identified individual and four previously published MdnCNV families to identify potential genetic etiologies contributing to the peri-zygotic genomic instability. Lin semantic similarity scores informed quantitative human phenotype ontology analysis on three MdnCNV families to identify gene(s) driving or contributing to the clinical phenotype. Results: In the newly identified MdnCNV case, we revealed eight de novo tandem duplications, each ~1 Mb, with microhomology at 6/8 breakpoint junctions. Enrichment of de novo single-nucleotide variants (SNV; 6/79) and de novo indels (1/12) was found within 4 Mb of the dnCNV genomic regions. An elevated post-zygotic SNV mutation rate was observed in MdnCNV families. Maternal rare variant analyses identified three genes in distinct families that may contribute to the MdnCNV phenomenon. Phenotype analysis suggests that gene(s) within dnCNV regions contribute to the observed proband phenotype in 3/3 cases. CNVs in two cases, a contiguous gene duplication encompassing PMP22 and RAI1 and another duplication affecting NSD1 and SMARCC2, contribute to the clinically observed phenotypic manifestations. Conclusions: Characteristic features of dnCNVs reported here are consistent with a microhomology-mediated break-induced replication (MMBIR)-driven mechanism during the peri-zygotic period. Maternal genetic variants in DNA repair genes potentially contribute to peri-zygotic genomic instability. Variable phenotypic features were observed across a cohort of three MdnCNV probands, and computational quantitative phenotyping revealed that two out of three had evidence for the contribution of more than one genetic locus to the proband’s phenotype supporting the hypothesis of de novo multilocus pathogenic variation (MPV) in those families.

Journal Article
