Catalogue Search | MBRL

FinnGen provides genetic insights from a well-phenotyped isolated population

by Aalto-Setälä, Katriina, Saarentaus, Elmo, Jacob, Howard in 45/43, 631/208/205/2138, 631/208/457/649/2219

2023

Population isolates such as those in Finland benefit genetic research because deleterious alleles are often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%). These variants survived the founding bottleneck rather than being distributed over a large number of ultrarare variants. Although this effect is well established in Mendelian genetics, its value in common disease genetics is less explored 1,2. FinnGen aims to study the genome and national health register data of 500,000 Finnish individuals. Given the relatively high median age of participants (63 years) and the substantial fraction of hospital-based recruitment, FinnGen is enriched for disease end points. Here we analyse data from 224,737 participants from FinnGen and study 15 diseases that have previously been investigated in large genome-wide association studies (GWASs). We also include meta-analyses of biobank data from Estonia and the United Kingdom. We identified 30 new associations, primarily low-frequency variants, enriched in the Finnish population. A GWAS of 1,932 diseases also identified 2,733 genome-wide significant associations (893 phenome-wide significant (PWS), P < 2.6 × 10⁻¹¹) at 2,496 (771 PWS) independent loci with 807 (247 PWS) end points. Among these, fine-mapping implicated 148 (73 PWS) coding variants associated with 83 (42 PWS) end points. Moreover, 91 (47 PWS) had an allele frequency of <5% in non-Finnish European individuals, of which 62 (32 PWS) were enriched by more than twofold in Finland. These findings demonstrate the power of bottlenecked populations to find entry points into the biology of common diseases through low-frequency, high-impact variants. Genome-wide association studies of individuals from an isolated population (data from the Finnish biobank study FinnGen) and consequent meta-analyses facilitate the identification of previously unknown coding variant associations for both rare and common diseases.

Journal Article

Large-scale integration of the plasma proteome with genetics and disease

by Oddsson, Asmundur, Masson, Gisli, Fridriksdottir, Run in 45/23, 45/43, 45/91

2021

The plasma proteome can help bridge the gap between the genome and diseases. Here we describe genome-wide association studies (GWASs) of plasma protein levels measured with 4,907 aptamers in 35,559 Icelanders. We found 18,084 associations between sequence variants and levels of proteins in plasma (protein quantitative trait loci; pQTL), of which 19% were with rare variants (minor allele frequency (MAF) < 1%). We tested plasma protein levels for association with 373 diseases and other traits and identified 257,490 associations. We integrated pQTL and genetic associations with diseases and other traits and found that 12% of 45,334 lead associations in the GWAS Catalog are with variants in high linkage disequilibrium with pQTL. We identified 938 genes encoding potential drug targets with variants that influence levels of possible biomarkers. Combining proteomics, genomics and transcriptomics, we provide a valuable resource that can be used to improve understanding of disease pathogenesis and to assist with drug discovery and development. A genome-wide association study of plasma protein levels measured with 4,907 aptamers in 35,559 Icelanders highlights links with over 370 disease endpoints and other traits.

Journal Article

The Human Pangenome Project: a global resource to map genomic diversity

by Jarvis, Erich D., Haussler, David, Schneider, Valerie A. in 45/23, 631/114/2785, 631/1647/2217

2022

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene–disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine. The Human Pangenome Reference Consortium aims to offer the highest quality and most complete human pangenome reference that provides diverse genomic representation across human populations.

Journal Article

Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype

by Park, Chanhee, Bennett, Christopher, Salzberg, Steven L. in 631/114/2785, 631/208/457, 692/308/2056

2019

The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays. A graph-based genome indexing scheme enables variant-aware alignment of sequences with very low memory requirements.

Journal Article

Disease variant prediction with deep generative models of evolutionary data

by Min, Joseph K., Frazer, Jonathan, Gal, Yarin in 631/114/1305, 631/114/2397, 631/208/2489/144

2021

Quantifying the pathogenicity of protein variants in human disease-related genes would have a marked effect on clinical decisions, yet the overwhelming majority (over 98%) of these variants still have unknown consequences 1–3. In principle, computational methods could support the large-scale interpretation of genetic variants. However, state-of-the-art methods 4–10 have relied on training machine learning models on known disease labels. As these labels are sparse, biased and of variable quality, the resulting models have been considered insufficiently reliable 11. Here we propose an approach that leverages deep generative models to predict variant pathogenicity without relying on labels. By modelling the distribution of sequence variation across organisms, we implicitly capture constraints on the protein sequences that maintain fitness. Our model EVE (evolutionary model of variant effect) not only outperforms computational approaches that rely on labelled data but also performs on par with, if not better than, predictions from high-throughput experiments, which are increasingly used as evidence for variant classification 12–16. We predict the pathogenicity of more than 36 million variants across 3,219 disease genes and provide evidence for the classification of more than 256,000 variants of unknown significance. Our work suggests that models of evolutionary information can provide valuable independent evidence for variant interpretation that will be widely useful in research and clinical settings. A new computational method, EVE, classifies human genetic variants in disease genes using deep generative models trained solely on evolutionary sequences.

Journal Article

UK Biobank: a globally important resource for cancer research

by Omiyale, Wemimo, Sellers, Jonathan, Bešević, Jelena in Biobanks, Cancer, Cancer research

2023

UK Biobank is a large-scale prospective study with deep phenotyping and genomic data. Its open-access policy allows researchers worldwide, from academia or industry, to perform health research in the public interest. Between 2006 and 2010, the study recruited 502,000 adults aged 40–69 years from the general population of the United Kingdom. At enrolment, participants provided information on a wide range of factors, physical measurements were taken, and biological samples (blood, urine and saliva) were collected for long-term storage. Participants have now been followed up for over a decade with more than 52,000 incident cancer cases recorded. The study continues to be enhanced with repeat assessments, web-based questionnaires, multi-modal imaging, and conversion of the stored biological samples to genomic and other ‘–omic’ data. The study has already demonstrated its value in enabling research into the determinants of cancer, and future planned enhancements will make the resource even more valuable to cancer researchers. Over 26,000 researchers worldwide are currently using the data, performing a wide range of cancer research. UK Biobank is uniquely placed to transform our understanding of the causes of cancer development and progression, and drive improvements in cancer treatment and prevention over the coming decades.

Journal Article

Rare variant contribution to human disease in 281,104 UK Biobank exomes

by Deevi, Sri V. V., Muthas, Daniel, Vitsios, Dimitrios in 45/43, 631/208/1516, 631/208/205/2138

2021

Genome-wide association studies have uncovered thousands of common variants associated with human disease, but the contribution of rare variants to common disease remains relatively unexplored. The UK Biobank contains detailed phenotypic data linked to medical records for approximately 500,000 participants, offering an unprecedented opportunity to evaluate the effect of rare variation on a broad collection of traits 1,2. Here we study the relationships between rare protein-coding variants and 17,361 binary and 1,419 quantitative phenotypes using exome sequencing data from 269,171 UK Biobank participants of European ancestry. Gene-based collapsing analyses revealed 1,703 statistically significant gene–phenotype associations for binary traits, with a median odds ratio of 12.4. Furthermore, 83% of these associations were undetectable via single-variant association tests, emphasizing the power of gene-based collapsing analysis in the setting of high allelic heterogeneity. Gene–phenotype associations were also significantly enriched for loss-of-function-mediated traits and approved drug targets. Finally, we performed ancestry-specific and pan-ancestry collapsing analyses using exome sequencing data from 11,933 UK Biobank participants of African, East Asian or South Asian ancestry. Our results highlight a significant contribution of rare variants to common disease. Summary statistics are publicly available through an interactive portal (http://azphewas.com/). The authors analyse rare protein-coding genetic variants for association with 18,780 traits in the UK Biobank cohort.

Journal Article

A cross-population atlas of genetic associations for 220 human phenotypes

by Ishigaki, Kazuyoshi, Akiyama, Masato, Terao, Chikashi in 45/23, 45/43, 45/61

2021

Current genome-wide association studies do not yet capture sufficient diversity in populations and scope of phenotypes. To expand an atlas of genetic associations in non-European populations, we conducted 220 deep-phenotype genome-wide association studies (diseases, biomarkers and medication usage) in BioBank Japan (n = 179,000), by incorporating past medical history and text-mining of electronic medical records. Meta-analyses with the UK Biobank and FinnGen (n_total = 628,000) identified ~5,000 new loci, which improved the resolution of the genomic map of human traits. This atlas elucidated the landscape of pleiotropy as represented by the major histocompatibility complex locus, where we conducted HLA fine-mapping. Finally, we performed statistical decomposition of matrices of phenome-wide summary statistics, and identified latent genetic components, which pinpointed responsible variants and biological mechanisms underlying current disease classifications across populations. The decomposed components enabled genetically informed subtyping of similar diseases (for example, allergic diseases). Our study suggests a potential avenue for hypothesis-free re-investigation of human diseases through genetics. Genome-wide analyses in BioBank Japan, UK Biobank and FinnGen identify ~5,000 new loci associated with 220 human traits. Statistical decomposition of matrices of phenome-wide summary statistics further highlights variants underpinning diseases across populations.

Journal Article

The sequences of 150,119 genomes in the UK Biobank

by Masson, Gisli, Magnusdottir, Droplaug N., Thorleifsson, Gudmar in 45/23, 45/43, 45/70

2022

Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data 1,2. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank 3. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation. To measure selection on variants, whole-genome sequencing of approximately 150,000 individuals from the UK Biobank is used to rank sequence variants by their level of depletion.

Journal Article

GestaltMatcher facilitates rare disease matching using facial phenotype descriptors

by Javanmardi, Behnam, Pantel, Jean Tori, Lyon, Gholson J. in 692/308/2056, 692/308/575, Agriculture

2022

Many monogenic disorders cause a characteristic facial morphology. Artificial intelligence can support physicians in recognizing these patterns by associating facial phenotypes with the underlying syndrome through training on thousands of patient photographs. However, this ‘supervised’ approach means that diagnoses are only possible if the disorder was part of the training set. To improve recognition of ultra-rare disorders, we developed GestaltMatcher, an encoder for portraits that is based on a deep convolutional neural network. Photographs of 17,560 patients with 1,115 rare disorders were used to define a Clinical Face Phenotype Space, in which distances between cases define syndromic similarity. Here we show that patients can be matched to others with the same molecular diagnosis even when the disorder was not included in the training set. Together with mutation data, GestaltMatcher could not only accelerate the clinical diagnosis of patients with ultra-rare disorders and facial dysmorphism but also enable the delineation of new phenotypes. GestaltMatcher uses a deep convolutional neural network to improve recognition of rare disorders based on facial morphology. The framework detects similarities among patients with previously unseen syndromes, aiding discovery of new disease genes.

Journal Article
