Catalogue Search | MBRL

Towards practical, high-capacity, low-maintenance information storage in synthesized DNA

by Bertone, Paul , Birney, Ewan , Chen, Siyuan in 631/114/552 , 639/301/54/992 , 639/705/258

2013

An efficient and scalable strategy with robust error correction is reported for encoding a record amount of information (including images, text and audio files) in DNA strands; a ‘DNA archive’ has been synthesized, shipped from the USA to Germany, sequenced and the information read. Long-term DNA archives make sense. This multidisciplinary study in synthetic biology both proposes and demonstrates a system for the DNA-based storage of digital information. Digital information is being produced at an ever-growing rate, requiring an increasing commitment to ongoing maintenance of digital media in the archives. Surprisingly, this provides a niche for DNA, which can serve as a dense and stable information-storage medium. Nick Goldman et al. report an efficient and scalable strategy with robust error correction for encoding a record amount of information (including images, text and audio files) in DNA strands. A 'DNA archive' was synthesized and shipped from California to Germany, after which the DNA was sequenced and the information read. At the current rate of DNA synthesis cost reduction, DNA-based information storage is expected to become cost effective within a decade for archives likely to be accessed only rarely, after about 50 years. Digital production, transmission and storage have revolutionized how we access and use information but have also made archiving an increasingly complex task that requires active, continuing maintenance of digital media. This challenge has focused some interest on DNA as an attractive target for information storage [1] because of its capacity for high-density information encoding, longevity under easily achieved conditions [2, 3, 4] and proven track record as an information bearer. 
Previous DNA-based information storage approaches have encoded only trivial amounts of information [5, 6, 7] or were not amenable to scaling-up [8]; they also used no robust error correction and lacked examination of their cost-efficiency for large-scale information archival [9]. Here we describe a scalable method that can reliably store more information than has been handled before. We encoded computer files totalling 739 kilobytes of hard-disk storage and with an estimated Shannon information [10] of 5.2 × 10^6 bits into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy. Theoretical analysis indicates that our DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving. In fact, current trends in technological advances are reducing DNA synthesis costs at a pace that should make our scheme cost-effective for sub-50-year archiving within a decade.

Journal Article

Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution

by Stergachis, Andrew B. , Fu, Wenqing , Akey, Joshua M. in amino acid sequences , Amino acids , Binding

2013

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide-resolution TF occupancy across the human exome in 81 diverse cell types. We found that ∼15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.

Journal Article

Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing

by LeProust, Emily M , Brockman, William , Gnirke, Andreas in Agriculture , Baits , Base Composition - genetics

2009

Gnirke et al. present a bead-based method for capturing sequences of interest in the human genome for massively parallel sequencing. Using long, biotinylated RNA probes to pull down PCR-amplified DNA fragments, they demonstrate sequencing of 2.5 Mb of exons in 1,900 genes. Targeting genomic loci by massively parallel sequencing requires new methods to enrich templates to be sequenced. We developed a capture method that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments. The RNA is transcribed from PCR-amplified oligodeoxynucleotides originally synthesized on a microarray, generating sufficient bait for multiple captures at concentrations high enough to drive the hybridization. We tested this method with 170-mer baits that target >15,000 coding exons (2.5 Mb) and four regions (1.7 Mb total) using Illumina sequencing as read-out. About 90% of uniquely aligning bases fell on or near bait sequence; up to 50% lay on exons proper. The uniformity was such that ∼60% of target bases in the exonic 'catch', and ∼80% in the regional catch, had at least half the mean coverage. One lane of Illumina sequence was sufficient to call high-confidence genotypes for 89% of the targeted exon space.

Journal Article

Scalable gene synthesis by selective amplification of DNA pools from high-fidelity microchips

by LeProust, Emily M , Super, Michael , Way, Jeffrey in 631/1647/1888/1890 , 631/1647/2017/2079 , 631/92/95

2010

Long DNA molecules, such as those encoding genes, can be assembled from short oligonucleotides created on a microarray. Kosuri et al. improve the fidelity and scalability of this process, enabling synthesis of 40 antibody fragments having repetitive regions and other challenging sequence features. Development of cheap, high-throughput and reliable gene synthesis methods will broadly stimulate progress in biology and biotechnology [1]. Currently, the reliance on column-synthesized oligonucleotides as a source of DNA limits further cost reductions in gene synthesis [2]. Oligonucleotides from DNA microchips can reduce costs by at least an order of magnitude [3, 4, 5], yet efforts to scale their use have been largely unsuccessful owing to the high error rates and complexity of the oligonucleotide mixtures. Here we use high-fidelity DNA microchips, selective oligonucleotide pool amplification, optimized gene assembly protocols and enzymatic error correction to develop a method for highly parallel gene synthesis. We tested our approach by assembling 47 genes, including 42 challenging therapeutic antibody sequences, encoding a total of ∼35 kilobase pairs of DNA. These assemblies were performed from a complex background containing 13,000 oligonucleotides encoding ∼2.5 megabases of DNA, which is at least 50 times larger than in previously published attempts.

Journal Article

Structure-guided SCHEMA recombination generates diverse chimeric channelrhodopsins

by Bedbrook, Claire N. , Arnold, Frances H. , Chen, Siyuan in Biochemistry , Biological Sciences , Cells

2017

Integral membrane proteins (MPs) are key engineering targets due to their critical roles in regulating cell function. In engineering MPs, it can be extremely challenging to retain membrane localization capability while changing other desired properties. We have used structure-guided SCHEMA recombination to create a large set of functionally diverse chimeras from three sequence-diverse channelrhodopsins (ChRs). We chose 218 ChR chimeras from two SCHEMA libraries and assayed them for expression and plasma membrane localization in human embryonic kidney cells. The majority of the chimeras express, with 89% of the tested chimeras outperforming the lowest-expressing parent; 12% of the tested chimeras express at even higher levels than any of the parents. A significant fraction (23%) also localize to the membrane better than the lowest-performing parent ChR. Most (93%) of these well-localizing chimeras are also functional light-gated channels. Many chimeras have stronger light-activated inward currents than the three parents, and some have unique off-kinetics and spectral properties relative to the parents. An effective method for generating protein sequence and functional diversity, SCHEMA recombination can be used to gain insights into sequence–function relationships in MPs.

Journal Article

Next Steps for Access to Safe, Secure DNA Synthesis

by Diggans, James , Leproust, Emily in Bioengineering and Biotechnology , biosecurity , cyberbiosecurity

2019

The DNA synthesis industry has, since the invention of gene-length synthesis, worked proactively to ensure synthesis is carried out securely and safely. Informed by guidance from the U.S. government, several of these companies have collaborated over the last decade to produce a set of best practices for customer and sequence screening prior to manufacture. Taken together, these practices ensure that synthetic DNA is used to advance research that is designed and intended for public benefit. With increasing scale in the industry and expanding capability in the synthetic biology toolset, it is worth revisiting current practices to evaluate additional measures to ensure the continued safety and wide availability of DNA synthesis. Here we encourage specific steps, in part derived from successes in the cybersecurity community, that can ensure synthesis screening systems stay well ahead of emerging challenges, to continue to enable responsible research advances. Gene synthesis companies, science and technology funders, policymakers, and the scientific community as a whole have a shared duty to continue to minimize risk and maximize the safety and security of DNA synthesis to further power world-changing developments in advanced biological manufacturing, agriculture, drug development, healthcare, and energy.

Journal Article

The DNA-encoded nucleosome organization of a eukaryotic genome

by Kaplan, Noam , Hughes, Timothy R. , Tillo, Desiree in Animals , Base Sequence , Biological and medical sciences

2009

Organizing nucleosomes: The nucleosomes are the basic repeating units of eukaryotic chromatin, and nucleosome organization is critically important for gene regulation. Kaplan et al. tested the importance of the intrinsic DNA sequence preferences of nucleosomes by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map is remarkably similar to in vivo nucleosome maps, indicating that the organization of nucleosomes in vivo is largely governed by the underlying genomic DNA sequence. This study tests the importance of the intrinsic DNA sequence preferences of nucleosomes by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map is similar to in vivo nucleosome maps, indicating that the organization of nucleosomes in vivo is largely governed by the underlying genomic DNA sequence. Nucleosome organization is critical for gene regulation [1]. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers [2], competition with site-specific DNA-binding proteins [3], and the DNA sequence preferences of the nucleosomes themselves [4, 5, 6, 7, 8]. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo [7, 9, 10, 11], because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. 
In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for ∼40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.

Journal Article

Autoantigen discovery with a synthetic human peptidome

by Gakidis, M Angelica Martinez , Solimini, Nicole L , Elledge, Stephen J in 631/45/475/2290 , 631/61/24 , 692/699/249/1313

2011

Larman et al. create a phage library containing >400,000 sequences encoding peptides that cover all open reading frames in the human genome. They then use this synthetic peptidome to discover novel autoantigens targeted by antibodies in the cerebrospinal fluid of individuals with a neurological autoimmune disease. Immune responses targeting self-proteins (autoantigens) can lead to a variety of autoimmune diseases. Identification of these antigens is important for both diagnostic and therapeutic reasons. However, current approaches to characterize autoantigens have, in most cases, met only with limited success. Here we present a synthetic representation of the complete human proteome, the T7 peptidome phage display library (T7-Pep), and demonstrate its application to autoantigen discovery. T7-Pep is composed of >413,000 36-residue, overlapping peptides that cover all open reading frames in the human genome, and can be analyzed using high-throughput DNA sequencing. We developed a phage immunoprecipitation sequencing (PhIP-Seq) methodology to identify known and previously unreported autoantibodies contained in the spinal fluid of three individuals with paraneoplastic neurological syndromes. We also show how T7-Pep can be used more generally to identify peptide-protein interactions, suggesting the broader utility of our approach for proteomic research.

Journal Article

Genome-Wide Identification of Human RNA Editing Sites by Parallel DNA Capturing and Sequencing

by Gao, Yuan , Levanon, Erez Y. , Xie, Bin in adenosine , Adenosine Deaminase - metabolism , Adrenal Glands - metabolism

2009

Adenosine-to-inosine (A-to-I) RNA editing leads to transcriptome diversity and is important for normal brain function. To date, only a handful of functional sites have been identified in mammals. We developed an unbiased assay to screen more than 36,000 computationally predicted nonrepetitive A-to-I sites using massively parallel target capture and DNA sequencing. A comprehensive set of several hundred human RNA editing sites was detected by comparing genomic DNA with RNAs from seven tissues of a single individual. Specificity of our profiling was supported by observations of enrichment with known features of targets of adenosine deaminases acting on RNA (ADAR) and validation by means of capillary sequencing. This efficient approach greatly expands the repertoire of RNA editing targets and can be applied to studies involving RNA editing-related human diseases.

Journal Article

Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming

by Shoemaker, Robert , Gore, Athurva , LeProust, Emily M in Agriculture , Base Sequence , Bioinformatics

2009

Although technically feasible, whole-genome analysis of cytosine methylation using bisulfite sequencing remains prohibitively expensive for large eukaryotic genomes. Deng et al. use 30,000 nondegenerate padlock probes to capture ∼66,000 bisulfite-converted sites in human CpG islands and compare their methylation in fibroblasts, embryonic stem cells and induced pluripotent stem cells. Current DNA methylation assays are limited in the flexibility and efficiency of characterizing a large number of genomic targets. We report a method to specifically capture an arbitrary subset of genomic targets for single-molecule bisulfite sequencing for digital quantification of DNA methylation at single-nucleotide resolution. A set of ∼30,000 padlock probes was designed to assess methylation of ∼66,000 CpG sites within 2,020 CpG islands on human chromosome 12, chromosome 20, and 34 selected regions. To investigate epigenetic differences associated with dedifferentiation, we compared methylation in three human fibroblast lines and eight human pluripotent stem cell lines. Chromosome-wide methylation patterns were similar among all lines studied, but cytosine methylation was slightly more prevalent in the pluripotent cells than in the fibroblasts. Induced pluripotent stem (iPS) cells appeared to display more methylation than embryonic stem cells. We found 288 regions methylated differently in fibroblasts and pluripotent cells. This targeted approach should be particularly useful for analyzing DNA methylation in large genomes.

Journal Article
