Catalogue Search | MBRL
585 result(s) for "Myers, Simon"
A method for genome-wide genealogy estimation for thousands of samples
2019
Knowledge of genome-wide genealogies for thousands of individuals would simplify most evolutionary analyses for humans and other species, but has remained computationally infeasible. We have developed a method, Relate, scaling to >10,000 sequences while simultaneously estimating branch lengths, mutational ages and variable historical population sizes, as well as allowing for data errors. Application to 1,000 Genomes Project haplotypes produces joint genealogical histories for 26 human populations. Highly diverged lineages are present in all groups, but most frequent in Africa. Outside Africa, these mainly reflect ancient introgression from groups related to Neanderthals and Denisovans, while African signals instead reflect unknown events unique to that continent. Our approach allows more powerful inferences of natural selection than has previously been possible. We identify multiple regions under strong positive selection, and multi-allelic traits including hair color, body mass index and blood pressure, showing strong evidence of directional selection, varying among human groups.
Relate is a new method for evolutionary analysis of large genetic datasets that can estimate branch lengths, mutational ages and variable historical population sizes.
Journal Article
Fine-Scale Inference of Ancestry Segments Without Prior Knowledge of Admixing Groups
2019
Salter-Townshend and Myers present an open source tool for modelling multi-way admixture events using dense haplotype data. Their Hidden Markov Model approach is scalable to thousands of samples and, unlike existing methods...
We present an algorithm for inferring ancestry segments and characterizing admixture events, which involve an arbitrary number of genetically differentiated groups coming together. This allows inference of the demographic history of the species, properties of admixing groups, identification of signatures of natural selection, and may aid disease gene mapping. The algorithm employs nested hidden Markov models to obtain local ancestry estimation along the genome for each admixed individual. In a range of simulations, the accuracy of these estimates equals or exceeds leading existing methods. Moreover, and unlike these approaches, we do not require any prior knowledge of the relationship between subgroups of donor reference haplotypes and the unseen mixing ancestral populations. Our approach infers these in terms of conditional “copying probabilities.” In application to the Human Genome Diversity Project, we corroborate many previously inferred admixture events (e.g., an ancient admixture event in the Kalash). We further identify novel events such as complex four-way admixture in San-Khomani individuals, and show that Eastern European populations possess 1−3% ancestry from a group resembling modern-day central Asians. We also identify evidence of recent natural selection favoring sub-Saharan ancestry at the human leukocyte antigen (HLA) region, across North African individuals. We make available an R and C++ software library, which we term MOSAIC (which stands for MOSAIC Organizes Segments of Ancestry In Chromosomes).
Journal Article
Inference of Population Structure using Dense Haplotype Data
by Falush, Daniel; Myers, Simon; Lawson, Daniel John
in Algorithms, Biology, Computer Simulation
2012
The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this "chromosome painting" can be summarized as a "coancestry matrix," which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach.
The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/.
Journal Article
Recombination in the Human Pseudoautosomal Region PAR1
by Noor, Nudrat; Hinch, Anjali G.; Altemose, Nicolas
in African Americans, Biology and Life Sciences, Cells
2014
The pseudoautosomal region (PAR) is a short region of homology between the mammalian X and Y chromosomes, which has undergone rapid evolution. A crossover in the PAR is essential for the proper disjunction of X and Y chromosomes in male meiosis, and PAR deletion results in male sterility. This leads the human PAR with the obligatory crossover, PAR1, to having an exceptionally high male crossover rate, which is 17-fold higher than the genome-wide average. However, the mechanism by which this obligatory crossover occurs remains unknown, as does the fine-scale positioning of crossovers across this region. Recent research in mice has suggested that crossovers in PAR may be mediated independently of the protein PRDM9, which localises virtually all crossovers in the autosomes. To investigate recombination in this region, we construct the most fine-scale genetic map containing directly observed crossovers to date using African-American pedigrees. We leverage recombination rates inferred from the breakdown of linkage disequilibrium in human populations and investigate the signatures of DNA evolution due to recombination. Further, we identify direct PRDM9 binding sites using ChIP-seq in human cells. Using these independent lines of evidence, we show that, in contrast with mouse, PRDM9 does localise peaks of recombination in the human PAR1. We find that recombination is a far more rapid and intense driver of sequence evolution in PAR1 than it is on the autosomes. We also show that PAR1 hotspot activities differ significantly among human populations. Finally, we find evidence that PAR1 hotspot positions have changed between human and chimpanzee, with no evidence of sharing among the hottest hotspots. We anticipate that the genetic maps built and validated in this work will aid research on this vital and fascinating region of the genome.
Journal Article
The Length of Haplotype Blocks and Signals of Structural Variation in Reconstructed Genealogies
by Myers, Simon R; Ignatieva, Anastasia; Favero, Martina
in Algorithms, Approximation, Chromosome 10
2025
Recent breakthroughs have enabled the accurate inference of large-scale genealogies. Through modelling the impact of recombination on the correlation structure between genealogical local trees, we evaluate how this structure is reconstructed by leading approaches. Despite identifying pervasive biases, we show that applying a simple correction recovers the desired distributions for one algorithm, Relate. We develop a statistical test to identify clades spanning unexpectedly long genomic regions, likely reflecting regional suppression of recombination in some individuals. Our approach allows a systematic scan for inter-individual recombination rate variation at an intermediate scale, between genome-wide differences and individual hotspots. Using genealogies reconstructed with Relate for 2,504 human genomes, we identify 50 regions possessing clades with unexpectedly long genomic spans (P < 1 × 10^−12). The strongest signal corresponds to a known inversion on chromosome 17. The second strongest uncovers a novel 760-kb inversion on chromosome 10, common (21%) in S. Asians and correlated with GWAS hits for a range of phenotypes. Other regions indicate additional genomic rearrangements: inversions (8), copy number changes (2), or other variants (12). The remaining regions appear to reflect recombination suppression by previously unevidenced mechanisms. They are enriched for precisely spanning single genes (P = 5 × 10^−10), specifically those expressed in male gametogenesis, and for eQTLs (P = 2 × 10^−3). This suggests an extension of previously hypothesized crossover suppression within meiotic genes, towards a model of suppression varying across individuals with different expression levels. Our methods can be readily applied to other species, showing that genealogies offer previously untapped potential to study structural variation and other phenomena impacting evolution.
Journal Article
Relating pathogenic loss-of-function mutations in humans to their evolutionary fitness costs
by Agarwal, Ipsita; Myers, Simon R; Przeworski, Molly
in Analysis, Autistic Disorder - genetics, Case-Control Studies
2023
Causal loss-of-function (LOF) variants for Mendelian and severe complex diseases are enriched in 'mutation intolerant' genes. We show how such observations can be interpreted in light of a model of mutation-selection balance and use the model to relate the pathogenic consequences of LOF mutations at present to their evolutionary fitness effects. To this end, we first infer posterior distributions for the fitness costs of LOF mutations in 17,318 autosomal and 679 X-linked genes from exome sequences in 56,855 individuals. Estimated fitness costs for the loss of a gene copy are typically above 1%; they tend to be largest for X-linked genes, whether or not they have a Y homolog, followed by autosomal genes and genes in the pseudoautosomal region. We compare inferred fitness effects for all possible de novo LOF mutations to those of de novo mutations identified in individuals diagnosed with one of six severe, complex diseases or developmental disorders. Probands carry an excess of mutations with estimated fitness effects above 10%; as we show by simulation, when sampled in the population, such highly deleterious mutations are typically only a couple of generations old. Moreover, the proportion of highly deleterious mutations carried by probands reflects the typical age of onset of the disease. The study design also has a discernible influence: a greater proportion of highly deleterious mutations is detected in pedigree than case-control studies, and for autism, in simplex than multiplex families and in female versus male probands. Thus, anchoring observations in human genetics to a population genetic model allows us to learn about the fitness effects of mutations identified by different mapping strategies and for different traits.
Journal Article
Inferring Population Histories for Ancient Genomes Using Genome-Wide Genealogies
2021
Ancient genomes anchor genealogies in directly observed historical genetic variation and contextualize ancestral lineages with archaeological insights into their geography and cultural associations. However, the majority of ancient genomes are of lower coverage and cannot be directly built into genealogies. Here, we present a fast and scalable method, Colate, the first approach for inferring ancestral relationships through time between low-coverage genomes without requiring phasing or imputation. Our approach leverages sharing patterns of mutations dated using a genealogy to infer coalescence rates. For deeply sequenced ancient genomes, we additionally introduce an extension of the Relate algorithm for joint inference of genealogies incorporating such genomes. Application to 278 present-day and 430 ancient DNA samples of >0.5x mean coverage allows us to identify dynamic population structure and directional gene flow between early farmer and European hunter-gatherer groups. We further show that the previously reported, but still unexplained, increase in the TCC/TTC mutation rate, which is strongest in West Eurasia today, was already present at similar strength and widespread in the Late Glacial Period ~10k−15k years ago, but is not observed in samples >30k years old. It is strongest in Neolithic farmers, and highly correlated with recent coalescence rates between other genomes and a 10,000-year-old Anatolian hunter-gatherer. This suggests gene-flow among ancient peoples postdating the last glacial maximum as widespread and localizes the driver of this mutational signal in both time and geography in that region. Our approach should be widely applicable in future for addressing other evolutionary questions, and in other species.
Journal Article
A high-resolution map of non-crossover events reveals impacts of genetic diversity on mammalian meiotic recombination
2019
During meiotic recombination, homologue-templated repair of programmed DNA double-strand breaks (DSBs) produces relatively few crossovers and many difficult-to-detect non-crossovers. By intercrossing two diverged mouse subspecies over five generations and deep-sequencing 119 offspring, we detect thousands of crossover and non-crossover events genome-wide with unprecedented power and spatial resolution. We find that both crossovers and non-crossovers are strongly depleted at DSB hotspots where the DSB-positioning protein PRDM9 fails to bind to the unbroken homologous chromosome, revealing that PRDM9 also functions to promote homologue-templated repair. Our results show that complex non-crossovers are much rarer in mice than humans, consistent with complex events arising from accumulated non-programmed DNA damage. Unexpectedly, we also find that GC-biased gene conversion is restricted to non-crossover tracts containing only one mismatch. These results demonstrate that local genetic diversity profoundly alters meiotic repair pathway decisions via at least two distinct mechanisms, impacting genome evolution and Prdm9-related hybrid infertility.
During meiotic recombination, genetic information is transferred or exchanged between parental chromosome copies. Using a large hybrid mouse pedigree, the authors generated high-resolution maps of these transfer/exchange events and discovered new properties governing their processing and resolution.
Journal Article
A new multipoint method for genome-wide association studies by imputation of genotypes
by Myers, Simon; McVean, Gil; Donnelly, Peter
in Agriculture, Animal Genetics and Genomics, Biological and medical sciences
2007
Genome-wide association studies are set to become the method of choice for uncovering the genetic basis of human diseases. A central challenge in this area is the development of powerful multipoint methods that can detect causal variants that have not been directly genotyped. We propose a coherent analysis framework that treats the problem as one involving missing or uncertain genotypes. Central to our approach is a model-based imputation method for inferring genotypes at observed or unobserved SNPs, leading to improved power over existing methods for multipoint association mapping. Using real genome-wide association study data, we show that our approach (i) is accurate and well calibrated, (ii) provides detailed views of associated regions that facilitate follow-up studies and (iii) can be used to validate and correct data at genotyped markers. A notable future use of our method will be to boost power by combining data from genome-wide scans that use different SNP sets.
Journal Article
A Genetic Atlas of Human Admixture History
2014
Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years. We identified events whose dates and participants suggest they describe genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.
Journal Article