Catalogue Search | MBRL

Horizontal Transfer, Not Duplication, Drives the Expansion of Protein Families in Prokaryotes

by Treangen, Todd J. , Rocha, Eduardo P. C. in Bacillus , Bacillus - genetics , Bacteria - genetics

2011

Gene duplication followed by neo- or sub-functionalization deeply impacts the evolution of protein families and is regarded as the main source of adaptive functional novelty in eukaryotes. While there is ample evidence of adaptive gene duplication in prokaryotes, it is not clear whether duplication outweighs the contribution of horizontal gene transfer in the expansion of protein families. We analyzed closely related prokaryote strains or species with small genomes (Helicobacter, Neisseria, Streptococcus, Sulfolobus), average-sized genomes (Bacillus, Enterobacteriaceae), and large genomes (Pseudomonas, Bradyrhizobiaceae) to untangle the effects of duplication and horizontal transfer. After removing the effects of transposable elements and phages, we show that the vast majority of expansions of protein families are due to transfer, even among large genomes. Transferred genes--xenologs--persist longer in prokaryotic lineages possibly due to a higher/longer adaptive role. On the other hand, duplicated genes--paralogs--are expressed more, and, when persistent, they evolve slower. This suggests that gene transfer and gene duplication have very different roles in shaping the evolution of biological systems: transfer allows the acquisition of new functions and duplication leads to higher gene dosage. Accordingly, we show that paralogs share most protein-protein interactions and genetic regulators, whereas xenologs share very few of them. Prokaryotes invented most of life's biochemical diversity. Therefore, the study of the evolution of biology systems should explicitly account for the predominant role of horizontal gene transfer in the diversification of protein families.

Journal Article

Share this book

Add to My Shelf

Development and Characterization of a High Density SNP Genotyping Assay for Cattle

by Smith, Timothy P.L , Schnabel, Robert D , Van Tassell, Curtis P in Agriculture , Algorithms , Analysis

2009

The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in human has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with median interval of 37 kb and maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle.

Journal Article

Share this book

Add to My Shelf

Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology

by Lage, Kasper , Xavier, Ramnik J. , Raychaudhuri, Soumya in Arthritis, Rheumatoid - genetics , Arthritis, Rheumatoid - immunology , Biology

2011

Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease.

Journal Article

Share this book

Add to My Shelf

Meta-Analysis of 28,141 Individuals Identifies Common Variants within Five New Loci That Influence Uric Acid Concentrations

by Watkins, Hugh , Whitfield, John B. , Sanna, Serena in Aging , Care and treatment , Carrier proteins

2009

Elevated serum uric acid levels cause gout and are a risk factor for cardiovascular disease and diabetes. To investigate the polygenetic basis of serum uric acid levels, we conducted a meta-analysis of genome-wide association scans from 14 studies totalling 28,141 participants of European descent, resulting in identification of 954 SNPs distributed across nine loci that exceeded the threshold of genome-wide significance, five of which are novel. Overall, the common variants associated with serum uric acid levels fall in the following nine regions: SLC2A9 (p = 5.2x10(-201)), ABCG2 (p = 3.1x10(-26)), SLC17A1 (p = 3.0x10(-14)), SLC22A11 (p = 6.7x10(-14)), SLC22A12 (p = 2.0x10(-9)), SLC16A9 (p = 1.1x10(-8)), GCKR (p = 1.4x10(-9)), LRRC16A (p = 8.5x10(-9)), and near PDZK1 (p = 2.7x10(-9)). Identified variants were analyzed for gender differences. We found that the minor allele for rs734553 in SLC2A9 has greater influence in lowering uric acid levels in women and the minor allele of rs2231142 in ABCG2 elevates uric acid levels more strongly in men compared to women. To further characterize the identified variants, we analyzed their association with a panel of metabolites. rs12356193 within SLC16A9 was associated with DL-carnitine (p = 4.0x10(-26)) and propionyl-L-carnitine (p = 5.0x10(-8)) concentrations, which in turn were associated with serum UA levels (p = 1.4x10(-57) and p = 8.1x10(-54), respectively), forming a triangle between SNP, metabolites, and UA levels. Taken together, these associations highlight additional pathways that are important in the regulation of serum uric acid levels and point toward novel potential targets for pharmacological intervention to prevent or treat hyperuricemia. In addition, these findings strongly support the hypothesis that transport proteins are key in regulating serum uric acid levels.

Journal Article

Share this book

Add to My Shelf

Genomic Islands in the Pathogenic Filamentous Fungus Aspergillus fumigatus

by Fedorova, Natalie D. , Joardar, Vinita S. , Venter, J. Craig in Allergens - genetics , Aspergillus - classification , Aspergillus - genetics

2008

We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome. While the core genes are 99.8% identical at the nucleotide level, identity for variable genes can be as low 40%. The most divergent loci appear to contain heterokaryon incompatibility (het) genes associated with fungal programmed cell death such as developmental regulator rosA. Cross-species comparison has revealed that 8.5%, 13.5% and 12.6%, respectively, of A. fumigatus, N. fischeri and A. clavatus genes are species-specific. These genes are significantly smaller in size than core genes, contain fewer exons and exhibit a subtelomeric bias. Most of them cluster together in 13 chromosomal islands, which are enriched for pseudogenes, transposons and other repetitive elements. At least 20% of A. fumigatus-specific genes appear to be functional and involved in carbohydrate and chitin catabolism, transport, detoxification, secondary metabolism and other functions that may facilitate the adaptation to heterogeneous environments such as soil or a mammalian host. Contrary to what was suggested previously, their origin cannot be attributed to horizontal gene transfer (HGT), but instead is likely to involve duplication, diversification and differential gene loss (DDL). The role of duplication in the origin of lineage-specific genes is further underlined by the discovery of genomic islands that seem to function as designated \"gene dumps\" and, perhaps, simultaneously, as \"gene factories\".

Journal Article

Share this book

Add to My Shelf

Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome

by Degenhardt, Jeremiah D. , Boyko, Adam R. , Hernandez, Ryan D. in Alleles , Amino Acid Substitution , Animals

2008

Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein coding-genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27-29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30-42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10-20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.

Journal Article

Share this book

Add to My Shelf

Recombinational Landscape and Population Genomics of Caenorhabditis elegans

by Kruglyak, Leonid , Rockman, Matthew V. in Animals , Caenorhabditis elegans , Caenorhabditis elegans - genetics

2009

Recombination rate and linkage disequilibrium, the latter a function of population genomic processes, are the critical parameters for mapping by linkage and association, and their patterns in Caenorhabditis elegans are poorly understood. We performed high-density SNP genotyping on a large panel of recombinant inbred advanced intercross lines (RIAILs) of C. elegans to characterize the landscape of recombination and, on a panel of wild strains, to characterize population genomic patterns. We confirmed that C. elegans autosomes exhibit discrete domains of nearly constant recombination rate, and we show, for the first time, that the pattern holds for the X chromosome as well. The terminal domains of each chromosome, spanning about 7% of the genome, exhibit effectively no recombination. The RIAILs exhibit a 5.3-fold expansion of the genetic map. With median marker spacing of 61 kb, they are a powerful resource for mapping quantitative trait loci in C. elegans. Among 125 wild isolates, we identified only 41 distinct haplotypes. The patterns of genotypic similarity suggest that some presumed wild strains are laboratory contaminants. The Hawaiian strain, CB4856, exhibits genetic isolation from the remainder of the global population, whose members exhibit ample evidence of intercrossing and recombining. The population effective recombination rate, estimated from the pattern of linkage disequilibrium, is correlated with the estimated meiotic recombination rate, but its magnitude implies that the effective rate of outcrossing is extremely low, corroborating reports of selection against recombinant genotypes. Despite the low population, effective recombination rate and extensive linkage disequilibrium among chromosomes, which are techniques that account for background levels of genomic similarity, permit association mapping in wild C. elegans strains.

Journal Article

Share this book

Add to My Shelf

Efficient and Accurate Construction of Genetic Linkage Maps from the Minimum Spanning Tree of a Graph

by Lonardi, Stefano , Bhat, Prasanna R. , Wu, Yonghui in Algorithms , Chromosome Mapping - statistics & numerical data , Cluster Analysis

2008

Genetic linkage maps are cornerstones of a wide spectrum of biotechnology applications, including map-assisted breeding, association genetics, and map-assisted gene cloning. During the past several years, the adoption of high-throughput genotyping technologies has been paralleled by a substantial increase in the density and diversity of genetic markers. New genetic mapping algorithms are needed in order to efficiently process these large datasets and accurately construct high-density genetic maps. In this paper, we introduce a novel algorithm to order markers on a genetic linkage map. Our method is based on a simple yet fundamental mathematical property that we prove under rather general assumptions. The validity of this property allows one to determine efficiently the correct order of markers by computing the minimum spanning tree of an associated graph. Our empirical studies obtained on genotyping data for three mapping populations of barley (Hordeum vulgare), as well as extensive simulations on synthetic data, show that our algorithm consistently outperforms the best available methods in the literature, particularly when the input data are noisy or incomplete. The software implementing our algorithm is available in the public domain as a web tool under the name MSTmap.

Journal Article

Share this book

Add to My Shelf

Bridgehead effect in the worldwide invasion of the biocontrol Harlequin Ladybird

by Guillemaud, Thomas , Lombaert, Eric , Interactions Biotiques et Santé Végétale (IBSV) ; Institut National de la Recherche Agronomique (INRA) in African Americans , Analysis , Animal Migration

2010

Recent studies of the routes of worldwide introductions of alien organisms suggest that many widespread invasions could have stemmed not from the native range, but from a particularly successful invasive population, which serves as the source of colonists for remote new territories. We call here this phenomenon the invasive bridgehead effect. Evaluating the likelihood of such a scenario is heuristically challenging. We solved this problem by using approximate Bayesian computation methods to quantitatively compare complex invasion scenarios based on the analysis of population genetics (microsatellite variation) and historical (first observation dates) data. We applied this approach to the Harlequin ladybird Harmonia axyridis (HA), a coccinellid native to Asia that was repeatedly introduced as a biocontrol agent without becoming established for decades. We show that the recent burst of worldwide invasions of HA followed a bridgehead scenario, in which an invasive population in eastern North America acted as the source of the colonists that invaded the European, South American and African continents, with some admixture with a biocontrol strain in Europe. This demonstration of a mechanism of invasion via a bridgehead has important implications both for invasion theory (i. e., a single evolutionary shift in the bridgehead population versus multiple changes in case of introduced populations becoming invasive independently) and for ongoing efforts to manage invasions by alien organisms (i. e., heightened vigilance against invasive bridgeheads)

Journal Article

Share this book

Add to My Shelf

The History of African Gene Flow into Southern Europeans, Levantines, and Jews

by Hao, Li , Moorjani, Priya , Ostrer, Harry in African Continental Ancestry Group - genetics , Asian Continental Ancestry Group , Bias

2011

Previous genetic studies have suggested a history of sub-Saharan African gene flow into some West Eurasian populations after the initial dispersal out of Africa that occurred at least 45,000 years ago. However, there has been no accurate characterization of the proportion of mixture, or of its date. We analyze genome-wide polymorphism data from about 40 West Eurasian groups to show that almost all Southern Europeans have inherited 1%-3% African ancestry with an average mixture date of around 55 generations ago, consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations. Levantine groups harbor 4%-15% African ancestry with an average mixture date of about 32 generations ago, consistent with close political, economic, and cultural links with Egypt in the late middle ages. We also detect 3%-5% sub-Saharan African ancestry in all eight of the diverse Jewish populations that we analyzed. For the Jewish admixture, we obtain an average estimated date of about 72 generations. This may reflect descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diasporas.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter