Catalogue Search | MBRL
37 result(s) for "Facciotti, Marc T."
An Integrated Pipeline for de Novo Assembly of Microbial Genomes
2012
Remarkable advances in DNA sequencing technology have created a need for de novo genome assembly methods tailored to work with the new sequencing data types. Many such methods have been published in recent years, but assembling raw sequence data to obtain a draft genome has remained a complex, multi-step process, involving several stages of sequence data cleaning, error correction, assembly, and quality control. Successful application of these steps usually requires intimate knowledge of a diverse set of algorithms and software. We present an assembly pipeline called A5 (Andrew And Aaron's Awesome Assembly pipeline) that simplifies the entire genome assembly process by automating these stages, by integrating several previously published algorithms with new algorithms for quality control and automated assembly parameter selection. We demonstrate that A5 can produce assemblies of quality comparable to a leading assembly algorithm, SOAPdenovo, without any prior knowledge of the particular genome being assembled and without the extensive parameter tuning required by the other assembly algorithm. In particular, the assemblies produced by A5 exhibit 50% or more reduction in broken protein coding sequences relative to SOAPdenovo assemblies. The A5 pipeline can also assemble Illumina sequence data from libraries constructed by the Nextera (transposon-catalyzed) protocol, which have markedly different characteristics to mechanically sheared libraries. Finally, A5 has modest compute requirements, and can assemble a typical bacterial genome on current desktop or laptop computer hardware in under two hours, depending on depth of coverage.
Journal Article
Evaluation of Algorithm Performance in ChIP-Seq Peak Detection
2010
Next-generation DNA sequencing coupled with chromatin immunoprecipitation (ChIP-seq) is revolutionizing our ability to interrogate whole genome protein-DNA interactions. Identification of protein binding sites from ChIP-seq data has required novel computational tools, distinct from those used for the analysis of ChIP-Chip experiments. The growing popularity of ChIP-seq spurred the development of many different analytical programs (at last count, we noted 31 open source methods), each with some purported advantage. Given that the literature is dense and empirical benchmarking challenging, selecting an appropriate method for ChIP-seq analysis has become a daunting task. Herein we compare the performance of eleven different peak calling programs on common empirical, transcription factor datasets and measure their sensitivity, accuracy and usability. Our analysis provides an unbiased critical assessment of available technologies, and should assist researchers in choosing a suitable tool for handling ChIP-seq data.
Journal Article
Phylogenetically Driven Sequencing of Extremely Halophilic Archaea Reveals Strategies for Static and Dynamic Osmo-response
by Wu, Dongying; Larsen, David; Facciotti, Marc T.
in Acidification; Adaptation, Physiological - genetics; Agricultural production
2014
Organisms across the tree of life use a variety of mechanisms to respond to stress-inducing fluctuations in osmotic conditions. Cellular response mechanisms and phenotypes associated with osmoadaptation also play important roles in bacterial virulence, human health, agricultural production and many other biological systems. To improve understanding of osmoadaptive strategies, we have generated 59 high-quality draft genomes for the haloarchaea (a euryarchaeal clade whose members thrive in hypersaline environments and routinely experience drastic changes in environmental salinity) and analyzed these new genomes in combination with those from 21 previously sequenced haloarchaeal isolates. We propose a generalized model for haloarchaeal management of cytoplasmic osmolarity in response to osmotic shifts, where potassium accumulation and sodium expulsion during osmotic upshock are accomplished via secondary transport using the proton gradient as an energy source, and potassium loss during downshock is via a combination of secondary transport and non-specific ion loss through mechanosensitive channels. We also propose new mechanisms for magnesium and chloride accumulation. We describe the expansion and differentiation of haloarchaeal general transcription factor families, including two novel expansions of the TATA-binding protein family, and discuss their potential for enabling rapid adaptation to environmental fluxes. We challenge a recent high-profile proposal regarding the evolutionary origins of the haloarchaea by showing that inclusion of additional genomes significantly reduces support for a proposed large-scale horizontal gene transfer into the ancestral haloarchaeon from the bacterial domain. The combination of broad (17 genera) and deep (≥5 species in four genera) sampling of a phenotypically unified clade has enabled us to uncover both highly conserved and specialized features of osmoadaptation. Finally, we demonstrate the broad utility of such datasets for metagenomics, improvements to automated gene annotation and investigations of evolutionary processes.
Journal Article
Candidatus Frankia Datiscae Dg1, the Actinobacterial Microsymbiont of Datisca glomerata, Expresses the Canonical nod Genes nodABC in Symbiosis with Its Host Plant
by Vanden Heuvel, Brian; Pujic, Petar; Demina, Irina V.
in Amino acids; Analysis; Arbuscular mycorrhizas
2015
Frankia strains are nitrogen-fixing soil actinobacteria that can form root symbioses with actinorhizal plants. Phylogenetically, symbiotic frankiae can be divided into three clusters, and this division also corresponds to host specificity groups. The strains of cluster II, which form symbioses with actinorhizal Rosales and Cucurbitales, thus displaying a broad host range, show surprisingly low genetic diversity and to date cannot be cultured. The genome of the first representative of this cluster, Candidatus Frankia datiscae Dg1 (Dg1), a microsymbiont of Datisca glomerata, was recently sequenced. A phylogenetic analysis of 50 different housekeeping genes of Dg1 and three published Frankia genomes showed that cluster II is basal among the symbiotic Frankia clusters. Detailed analysis showed that nodules of D. glomerata, independent of the origin of the inoculum, contain several closely related cluster II Frankia operational taxonomic units. Actinorhizal plants and legumes both belong to the nitrogen-fixing plant clade, and bacterial signaling in both groups involves the common symbiotic pathway also used by arbuscular mycorrhizal fungi. However, so far, no molecules resembling rhizobial Nod factors could be isolated from Frankia cultures. Alone among Frankia genomes available to date, the genome of Dg1 contains the canonical nod genes nodA, nodB and nodC known from rhizobia, and these genes are arranged in two operons which are expressed in D. glomerata nodules. Furthermore, Frankia Dg1 nodC was able to partially complement a Rhizobium leguminosarum A34 nodC::Tn5 mutant. Phylogenetic analysis showed that Dg1 Nod proteins are positioned at the root of both α- and β-rhizobial NodABC proteins. NodA-like acyl transferases were found across the phylum Actinobacteria, but among Proteobacteria only in nodulators. Taken together, our evidence indicates an Actinobacterial origin of rhizobial Nod factors.
Journal Article
A Large and Phylogenetically Diverse Class of Type 1 Opsins Lacking a Canonical Retinal Binding Site
by Kind, Tobias; Wang, Ting; Facciotti, Marc T.
in Absorption, Radiation; Analysis; Archaea - metabolism
2016
Opsins are photosensitive proteins catalyzing light-dependent processes across the tree of life. For both microbial (type 1) and metazoan (type 2) opsins, photosensing depends upon covalent interaction between a retinal chromophore and a conserved lysine residue. Despite recent discoveries of potential opsin homologs lacking this residue, phylogenetic dispersal and functional significance of these abnormal sequences have not yet been investigated. We report discovery of a large group of putatively non-retinal binding opsins, present in a number of fungal and microbial genomes and comprising nearly 30% of opsins in the Halobacteriaceae, a model clade for opsin photobiology. We report phylogenetic analyses, structural modeling, genomic context analysis and biochemistry, to describe the evolutionary relationship of these recently described proteins with other opsins, show that they are expressed and do not bind retinal in a canonical manner. Given these data, we propose a hypothesis that these abnormal opsin homologs may represent a novel family of sensory opsins which may be involved in taxis response to one or more non-light stimuli. If true, this finding would challenge our current understanding of microbial opsins as a light-specific sensory family, and provides a potential analogy with the highly diverse signaling capabilities of the eukaryotic G-protein coupled receptors (GPCRs), of which metazoan type 2 opsins are a light-specific sub-clade.
Journal Article
Schiff base switch II precedes the retinal thermal isomerization in the photocycle of bacteriorhodopsin
by Facciotti, Marc T; Wang, Ting; Duan, Yong
in Accessibility; Bacteriorhodopsin; Bacteriorhodopsins - chemistry
2013
In bacteriorhodopsin, the order of molecular events that control the cytoplasmic or extracellular accessibility of the Schiff bases (SB) is not well understood. We use molecular dynamics simulations to study a process involved in the second accessibility switch of SB that occurs after its reprotonation in the N intermediate of the photocycle. We find that once protonated, the SB C15 = NZ bond switches from a cytoplasmic facing (13-cis, 15-anti) configuration to an extracellular facing (13-cis, 15-syn) configuration on the pico to nanosecond timescale. Significantly, rotation about the retinal's C13 = C14 double bond is not observed. The dynamics of the isomeric state transitions of the protonated SB are strongly influenced by the surrounding charges and dielectric effects of other buried ions, particularly D96 and D212. Our simulations indicate that the thermal isomerization of retinal from 13-cis back to all-trans likely occurs independently from and after the SB C15 = NZ rotation in the N-to-O transition.
Journal Article
Gene Gangs of the Chloroviruses: Conserved Clusters of Collinear Monocistronic Genes
by Facciotti, Marc; Jeanniard, Adrien; Dunigan, David
in Base Sequence; BASIC BIOLOGICAL SCIENCES; Chlorovirus
2018
Chloroviruses (family Phycodnaviridae) are dsDNA viruses found throughout the world’s inland waters. The open reading frames in the genomes of 41 sequenced chloroviruses (330 ± 40 kbp each) representing three virus types were analyzed for evidence of evolutionarily conserved local genomic “contexts”, the organization of biological information into units of a scale larger than a gene. Despite a general loss of synteny between virus types, we informatically detected a highly conserved genomic context defined by groups of three or more genes that we have termed “gene gangs”. Unlike previously described local genomic contexts, the definition of gene gangs requires only that member genes be consistently co-localized and are not constrained by strand, regulatory sites, or intervening sequences (and therefore represent a new type of conserved structural genomic element). An analysis of functional annotations and transcriptomic data suggests that some of the gene gangs may organize genes involved in specific biochemical processes, but that this organization does not involve their coordinated expression.
Journal Article
Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux
by Facciotti, Marc T.; Eisen, Jonathan A.; Langille, Morgan G. I.
in Acids; Analysis; Animal behavior
2012
We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ∼20× coverage and assembled to an average of 50 contigs (range 5 scaffolds-168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology.
Journal Article
Prevalence of transcription promoters within archaeal operons and coding sequences
by Deutsch, Eric W; Peterson, Amelia; Baliga, Nitin S
in Archaea; BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology
2009
Despite the knowledge of complex prokaryotic‐transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well‐defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome‐wide characterization of transcript structures of ∼64% of all genes, including putative non‐coding RNAs in Halobacterium salinarum NRC‐1. Our integrative analysis of transcriptome dynamics and protein–DNA interaction data sets showed widespread environment‐dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3′ ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes—events usually considered spurious or non‐functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.
Synopsis
Evidence is mounting that the standard model of transcription factor (TF) binding to intergenic regions is not always the rule. Although there is isolated prior evidence for functional consequences of TF binding inside coding sequences, this issue had not been systematically evaluated genome wide. We have conducted a study to investigate the genome‐wide consequence of internal TF binding for nearly 10% of all TFs in an archaeal extremophile, Halobacterium salinarum NRC‐1. We show that a significant number of TF‐binding sites (TFBS) inside the coding sequences are functional and have marked consequences, such as by conditionally modulating the architecture of at least 43% of all operons in this organism. We present the integrated analysis of complementary systems‐wide data on TFBS locations and dynamic modulation of transcriptome structure that led to this striking discovery.
Using ChIP–chip and the MeDiChI algorithm (Reiss et al, 2008), we precisely located TFBSs and determined their corresponding local false discovery rates (LFDRs) from new and previously reported genome‐wide ChIP–chip measurements for 11 TFs: all TFBs (TFBa, TFBb, TFBc, TFBd, TFBe, TFBf and TFBg), one TBP (TBPb) and three transcriptional regulators (TRs) (Trh3, Trh4, VNG1451C) in H. salinarum NRC‐1. Our conclusion from this analysis was that as many as 10% of all multi‐TFBS loci were within coding regions.
To show that these TFBS have significant functional consequences on transcriptional regulation and cellular physiology, we used high‐density genome tiling arrays to analyze the transcriptome structure (TS) of H. salinarum NRC‐1 at different phases of growth in a batch culture, which is associated with differential regulation of over 65% of all genes. Through this analysis we assigned transcription start sites (TSSs) to 64% of all annotated genes, termination sites (TTSs) to 46% of the genes, verified the expression of 203 operons and discovered 5′ and 3′ UTRs for ∼65% of all genes and operons. Further, by correlating the transcribed units with chromosomal coordinates of predicted genes (Ng et al, 2000) and experimentally mapped peptides from large‐scale proteomics studies (Van et al, 2008), we revised the translation start site for 61 genes, detected 10 new protein‐coding genes, and discovered 61 new putative ncRNAs. Although the physiological roles and mechanisms of action of specific ncRNAs remain to be uncovered, the bimodal distribution of correlations between the expression of ncRNAs and that of their antisense strands is consistent with the characterized roles of ncRNAs in the regulation of their cognate antisense transcripts. Finally, this analysis also showed a large mRNA population that has variable 3′‐end locations and transcripts with extensive overlaps in their 3′ termini.
By integrating TFBS locations with the TS, we identified internal binding sites that are functional in the conditional modulation of operon organization. We assessed the global prevalence of such operons by devising a quantitative measure for classifying operons as conditional. Specifically, we found that 43% of all operons are conditionally modulated by integrating probe intensities of transcripts hybridized to the genome tiling array with gene‐expression correlations derived from expression analysis of H. salinarum NRC‐1 in 719 microarray experiments. Remarkably, there was a strong functional link between transcription‐factor binding inside operons and their classification as ‘conditional’ (P < 10⁻⁹). We transcriptionally fused two of these conditionally activated promoters inside coding sequences to a reporter gene encoding a fast‐degrading GFP variant optimized for the high‐salt cytoplasm of halophilic archaea. FACS analysis of cells harboring these internal promoter–reporter transcriptional fusions provided in vivo validation of growth‐phase regulated transcription initiation inside coding sequences.
Although earlier studies have discovered internal promoters within a single gene or operon (Tsui et al, 1994; Guillot and Moran, 2007), we have significantly extended these findings to a genome‐wide scale to show that biologically meaningful promoters do exist inside coding sequences at a frequency that is much higher than was previously appreciated. Further, this discovery also shows how a simple prokaryote can use the same set of genes in different combinations to elicit complex responses according to an environmental challenge.
Irrespective of the specific underlying mechanisms, our observations of widespread modulation of operon architecture, as well as transcription initiation and termination inside genes, all constitute evidence that archaea can intersperse regulatory logic within their coding sequence and thus blur the boundaries between coding and non‐coding elements. We have shown that it is possible to use new high‐throughput technologies to find these biologically important instances where transcriptional regulation does occur within coding sequences and, furthermore, that it is possible to globally characterize specific regulatory mechanisms responsible for these phenomena. Combined with new high‐throughput sequencing technologies, our results will expand the view of genetic‐information processing that can be investigated at high resolution (Nagalakshmi et al, 2008; Wilhelm et al, 2008). These data will enable construction of mechanistically accurate models for reliable systems re‐engineering of biological circuits. Moreover, these findings suggest that the incorporation of mechanistic accuracy into GRN models would require operons, promoters, and terminators to be treated as dynamic entities.
A systematic evaluation of transcription factor binding site loci (TFBS) for nearly 10% of all TFs in Halobacterium salinarum NRC‐1 via ChIP‐chip demonstrated that a significant fraction of TFBS loci (as many as ~10% of multi‐TFBS loci for 11 TFs) fell within coding regions.
By correlating the dynamic changes in the transcriptome structure (TS) of H. salinarum NRC‐1 during a complex cellular response with genome‐wide binding locations of TFs and peptides from proteomics experiments, we have (i) characterized transcription start sites and termination sites for ~64% of all genes in this organism; and discovered (ii) new protein coding genes, (iii) 61 novel ncRNA candidates, (iv) 5' and 3' untranslated regions (UTRs) of mRNAs, (v) a large mRNA population with variable 3' end locations, and (vi) transcripts with extensive overlaps in their 3' termini.
By integrating TFBS locations with the TS, we demonstrate that a significant number of TF binding events inside coding regions are indeed functional with important consequences such as in mediating conditional modulation of at least 43% of all investigated operons (p < 10⁻⁹).
These findings suggest that the construction of a mechanistically accurate model of a gene regulatory network would have to consider operons, promoters, and terminators as dynamically changing elements.
Journal Article