Catalogue Search | MBRL

Deep learning enables the atomic structure determination of the Fanconi Anemia core complex from cryoEM

by Lauko, Anna , Baker, David , Anishchenko, Ivan in Anemia , Artificial neural networks , Atomic structure

2020

Cryo-electron microscopy of protein complexes often leads to moderate resolution maps (4–8 Å), with visible secondary-structure elements but poorly resolved loops, making model building challenging. In the absence of high-resolution structures of homologues, only coarse-grained structural features are typically inferred from these maps, and it is often impossible to assign specific regions of density to individual protein subunits. This paper describes a new method for overcoming these difficulties that integrates predicted residue distance distributions from a deep-learned convolutional neural network, computational protein folding using Rosetta, and automated EM-map-guided complex assembly. We apply this method to a 4.6 Å resolution cryoEM map of Fanconi Anemia core complex (FAcc), an E3 ubiquitin ligase required for DNA interstrand crosslink repair, which was previously challenging to interpret as it comprises 6557 residues, only 1897 of which are covered by homology models. In the published model built from this map, only 387 residues could be assigned to the specific subunits with confidence. By building and placing into density 42 deep-learning-guided models containing 4795 residues not included in the previously published structure, we are able to determine an almost-complete atomic model of FAcc, in which 5182 of the 6557 residues were placed. The resulting model is consistent with previously published biochemical data, and facilitates interpretation of disease-related mutational data. We anticipate that our approach will be broadly useful for cryoEM structure determination of large complexes containing many subunits for which there are no homologues of known structure.

Journal Article

De novo design of protein structure and function with RFdiffusion

by Courbet, Alexis , Ragotte, Robert J. , Ovchinnikov, Sergey in 101/28 , 631/114/1305 , 631/114/469

2023

There has been considerable recent progress in designing new proteins using deep-learning methods 1–9. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models 10,11 have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence–structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications. Fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks yields a generative model for protein design that achieves outstanding performance on a wide range of protein structure and function design challenges.

Journal Article

Computational Design of Serine Hydrolases

by Lauko, Anna in Biochemistry , Bioengineering

2024

Nature’s enzymes are exceptionally powerful catalysts, exerting dramatic rate accelerations and exquisite control over a remarkable variety of chemical transformations. Since their initial discovery and characterization, the ability to generate artificial enzymes for chemical reactions involved in industrial processes, chemical synthesis, and therapeutic applications has been of considerable interest. Despite decades of effort, artificial enzymes continue to display lower catalytic activities than their native counterparts, even for well-understood model reactions. Here, we present a novel and general approach to computational enzyme design utilizing recent advances in tailored protein scaffold generation and active site conformational ensemble prediction. As a proof of concept, we have applied this method to the design of esterases that utilize the serine hydrolase enzymatic mechanism. Despite a deep understanding of the mechanism amassed through decades of study, previous attempts to design esterases acting through this mechanism have failed. To our knowledge, the designs made using our approach represent the first examples of accurately designed, de novo serine hydrolases spanning folds not found in natural hydrolases and exhibiting catalytic efficiencies on par with hydrolases in nature that act on similar substrates. We believe our approach will not only enable the design of industrially relevant serine hydrolases but also be broadly applicable to accelerating a wider array of chemical reactions, including ones that do not occur in nature.

Dissertation

Modeling protein-small molecule conformational ensembles with PLACER

by Lauko, Anna , Krishna, Rohith , Baker, David in Biochemistry

2025

Modeling the conformational heterogeneity of protein-small molecule interactions is important for understanding natural systems and evaluating designed systems, but remains an outstanding challenge. We reasoned that while residue level descriptions of biomolecules are efficient for de novo structure prediction, for probing heterogeneity of interactions with small molecules in the folded state an entirely atomic level description could have advantages in speed and generality. We developed a graph neural network called PLACER (Protein-Ligand Atomistic Conformational Ensemble Resolver) trained to recapitulate correct atomic positions from partially corrupted input structures from the Cambridge Structural Database and the Protein Data Bank; the nodes of the graph are the atoms in the system. PLACER accurately generates structures of diverse organic small molecules given knowledge of their atom composition and bonding, and given a description of the larger protein context, builds up structures of small molecules and protein side chains for protein-small molecule docking. Because PLACER is rapid and stochastic, ensembles of predictions can be readily generated to map conformational heterogeneity. In enzyme design efforts described here and elsewhere, we find that using PLACER to assess the accuracy and pre-organization of the designed active sites results in higher success rates and higher activities; we obtain a preorganized retroaldolase with a kcat/KM of 11000 M-1 min-1, considerably higher than any pre-deep learning design for this reaction. We anticipate that PLACER will be widely useful for rapidly generating conformational ensembles of small molecule and small molecule-protein systems, and for designing higher activity preorganized enzymes.

Journal Article

Modeling protein-small molecule conformational ensembles with ChemNet

by Lauko, Anna , Krishna, Rohith , Baker, David

2024

Modeling the conformational heterogeneity of protein-small molecule systems is an outstanding challenge. We reasoned that while residue level descriptions of biomolecules are efficient for de novo structure prediction, for probing heterogeneity of interactions with small molecules in the folded state an entirely atomic level description could have advantages in speed and generality. We developed a graph neural network called ChemNet trained to recapitulate correct atomic positions from partially corrupted input structures from the Cambridge Structural Database and the Protein Data Bank; the nodes of the graph are the atoms in the system. ChemNet accurately generates structures of diverse organic small molecules given knowledge of their atom composition and bonding, and given a description of the larger protein context, builds up structures of small molecules and protein side chains for protein-small molecule docking. Because ChemNet is rapid and stochastic, ensembles of predictions can be readily generated to map conformational heterogeneity. In enzyme design efforts described here and elsewhere, we find that using ChemNet to assess the accuracy and pre-organization of the designed active sites results in higher success rates and higher activities; we obtain a preorganized retroaldolase with a kcat/KM of 11000 M-1 min-1, considerably higher than any pre-deep learning design for this reaction. We anticipate that ChemNet will be widely useful for rapidly generating conformational ensembles of small molecule and small molecule-protein systems, and for designing higher activity preorganized enzymes.

Journal Article

Computational design of serine hydrolases

by Jamieson, Cooper , Baker, David , Norn, Christoffer in Biochemistry

2024

Enzymes that proceed through multistep reaction mechanisms often utilize complex, polar active sites positioned with sub-angstrom precision to mediate distinct chemical steps, which makes their de novo construction extremely challenging. We sought to overcome this challenge using the classic catalytic triad and oxyanion hole of serine hydrolases as a model system. We used RFdiffusion to generate proteins housing catalytic sites of increasing complexity and varying geometry, and a newly developed ensemble generation method called ChemNet to assess active site geometry and preorganization at each step of the reaction. Experimental characterization revealed novel serine hydrolases that catalyze ester hydrolysis with catalytic efficiencies (kcat/KM) up to 3.8 x 10 M s , closely match the design models (Cα RMSDs < 1 Å), and have folds distinct from natural serine hydrolases. In silico selection of designs based on active site preorganization across the reaction coordinate considerably increased success rates, enabling identification of new catalysts in screens of as few as 20 designs. Our de novo buildup approach provides insight into the geometric determinants of catalysis that complements what can be obtained from structural and mutational studies of native enzymes (in which catalytic group geometry and active site makeup cannot be so systematically varied), and provides a roadmap for the design of industrially relevant serine hydrolases and, more generally, for designing complex enzymes that catalyze multi-step transformations.

Journal Article

Comparison of OPLC and other chromatographic methods (TLC, HPLC, and GC) for in-process purity testing of nandrolone

by Laukó, Anna , Mezei, Mária , Bagócsi, Boglárka in Acetic acid , Analysis , Biological and medical sciences

2002

Summary: A semiquantitative OPLC purity test has been developed for in-process control of nandrolone and compared with other chromatographic methods. TLC was not sufficiently selective, a key impurity with low UV absorption could not be detected by HPLC, and nandrolone was slightly degraded during gas chromatography. OPLC proved to be a suitable means of testing for all potential impurities in nandrolone. The separation was performed by multiple development on fine-particle silica gel with cyclohexane—ethyl acetate—chloroform, 50 + 25 + 25 (v/v), as mobile phase. After spraying the chromatograms with sulfuric acid, then heating, the impurities could be sensitively detected by visual inspection in long-wave UV light. Detection limits were ≤0.01 μg.

Journal Article

Computational Design of Metallohydrolases

by Salike, Saman , Lauko, Anna , Kim, Donghyo in Amino acids , Biochemistry , Design

2024

New enzymes can be designed by starting from a description of an ideal active site composed of catalytic residues surrounding the reaction transition state(s) and identifying or generating a protein scaffold that supports the site, but there are a few current limitations. First, the catalytic efficiencies achieved by such efforts have generally been quite low, and considerable optimization by directed evolution has been required to reach activities typical of native enzymes. Second, generative AI methods such as RFdiffusion now enable the direct generation of proteins around active sites, but to date, such scaffolding has required specification of both the position in the sequence and the backbone coordinates of the catalytic residue, which complicates sampling. Here we introduce a generative AI method called RFflow that overcomes these limitations and use it to design zinc metallohydrolases starting from a density functional theory description of active site geometry. Of 96 designs tested experimentally, the most active has a kcat/KM of 23,000 M-1 s-1, orders of magnitude higher than previously designed metallohydrolases. This 148 amino acid protein has a novel fold with an enclosed chamber which positions the reaction substrate nearly perfectly for attack by a catalytic water molecule activated by the bound metal and is predicted by ChemNet to have a highly preorganized active site. The ability to generate high activity catalysts starting from quantum chemistry calculated active site geometries without experimental optimization should open the door to a new generation of potent designer enzymes. Competing Interest Statement The authors have declared no competing interest.

Paper

Deep learning enables the atomic structure determination of the Fanconi Anemia core complex from cryoEM

by Lauko, Anna , Dimaio, Frank , Passmore, Lori A in Anemia , Biochemistry , Computer applications

2020

Cryo-electron microscopy of protein complexes often leads to moderate resolution maps (4-8 Å), with visible secondary structure elements but poorly resolved loops, making model-building challenging. In the absence of high-resolution structures of homologues, only coarse-grained structural features are typically inferred from these maps, and it is often impossible to assign specific regions of density to individual protein subunits. This paper describes a new method for overcoming these difficulties that integrates predicted residue distance distributions from a deep-learned convolutional neural network, computational protein folding using Rosetta, and automated EM-map-guided complex assembly. We apply this method to a 4.6 Å resolution cryoEM map of Fanconi Anemia core complex (FAcc), an E3 ubiquitin ligase required for DNA interstrand crosslink repair, which was previously challenging to interpret as it is comprised of 6557 residues, only 1897 of which are covered by homology models. In the published structure built from this map, only 387 residues could be assigned to specific subunits. By building and placing into density 42 deep-learning guided models containing 4795 residues not included in the previously published structure, we are able to determine an almost-complete atomic model of FAcc, in which 5182 of the 6557 residues were placed. The resulting model is consistent with previously published biochemical data, and facilitates interpretation of disease related mutational data. We anticipate that our approach will be broadly useful for cryoEM structure determination of large complexes containing many subunits for which there are no homologues of known structure. Competing Interest Statement The authors have declared no competing interest.

Paper

Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models

by Courbet, Alexis , Venkatesh, Preetham , Bennett, Nathaniel R in Amino acid sequence , Biochemistry , Deep learning

2022

There has been considerable recent progress in designing new proteins using deep learning methods. Despite this progress, a general deep learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher order symmetric architectures, has yet to be described. Diffusion models have had considerable success in image and language generative modeling but limited success when applied to protein modeling, likely due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding, and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold Diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of new designs. In a manner analogous to networks which produce images from user-specified inputs, RFdiffusion enables the design of diverse, complex, functional proteins from simple molecular specifications. Competing Interest Statement The authors have declared no competing interest.

Paper
