Catalogue Search | MBRL

Fast and accurate protein structure search with Foldseek

by Kim, Stephanie S. , Tumescheit, Charlotte , Steinegger, Martin in 631/114 , 631/114/794 , 631/535

2024

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively. Foldseek speeds up protein structural search by four to five orders of magnitude.

Journal Article

Best practices for single-cell analysis across modalities

in Chromatin , Electrons , Transcriptomics

2023

Recent advances in single-cell technologies have enabled high-throughput molecular profiling of cells across modalities and locations. Single-cell transcriptomics data can now be complemented by chromatin accessibility, surface protein expression, adaptive immune receptor repertoire profiling and spatial information. The increasing availability of single-cell data across modalities has motivated the development of novel computational methods to help analysts derive biological insights. As the field grows, it becomes increasingly difficult to navigate the vast landscape of tools and analysis steps. Here, we summarize independent benchmarking studies of unimodal and multimodal single-cell analysis across modalities to suggest comprehensive best-practice workflows for the most common analysis steps. Where independent benchmarks are not available, we review and contrast popular methods. Our article serves as an entry point for novices in the field of single-cell (multi-)omic analysis and guides advanced users to the most recent best practices. Practitioners in the field of single-cell omics are now faced with diverse options for analytical tools to process and integrate data from various molecular modalities. In an Expert Recommendation article, the authors provide guidance on robust single-cell data analysis, including choices of best-performing tools from benchmarking studies.

Journal Article

CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning

by Parks, Donovan H. , Tyson, Gene W. , Woodcroft, Ben J. in 631/114/1305 , 631/114/2785 , 631/114/794

2023

Advances in sequencing technologies and bioinformatics tools have dramatically increased the recovery rate of microbial genomes from metagenomic data. Assessing the quality of metagenome-assembled genomes (MAGs) is a critical step before downstream analysis. Here, we present CheckM2, an improved method of predicting genome quality of MAGs using machine learning. Using synthetic and experimental data, we demonstrate that CheckM2 outperforms existing tools in both accuracy and computational speed. In addition, CheckM2’s database can be rapidly updated with new high-quality reference genomes, including taxa represented only by a single genome. We also show that CheckM2 accurately predicts genome quality for MAGs from novel lineages, even for those with reduced genome size (for example, Patescibacteria and the DPANN superphylum). CheckM2 provides accurate genome quality predictions across bacterial and archaeal lineages, giving increased confidence when inferring biological conclusions from MAGs. This work presents CheckM2, which is a machine learning-based tool to predict genome quality of isolate, single-cell and metagenome-assembled genomes.

Journal Article

Museum of spatial transcriptomics

by Moses, Lambda , Pachter, Lior in 631/114/794 , 631/1647/2017/1947 , 631/1647/514/1949

2022

The function of many biological systems, such as embryos, liver lobules, intestinal villi, and tumors, depends on the spatial organization of their cells. In the past decade, high-throughput technologies have been developed to quantify gene expression in space, and computational methods have been developed that leverage spatial gene expression data to identify genes with spatial patterns and to delineate neighborhoods within tissues. To comprehensively document spatial gene expression technologies and data-analysis methods, we present a curated review of literature on spatial transcriptomics dating back to 1987, along with a thorough analysis of trends in the field, such as usage of experimental techniques, species, tissues studied, and computational approaches used. Our Review places current methods in a historical context, and we derive insights about the field that can guide current research strategies. A companion supplement offers a more detailed look at the technologies and methods analyzed: https://pachterlab.github.io/LP_2021/ . This work presents an overview of the evolution of spatial transcriptomics and highlights recent efforts in method developments in this space.

Journal Article

Visualizing and interpreting cancer genomics data via the Xena platform

by Zhu, Jingchun , Kristupas, Repečka , McDade, Fran

2020

Journal Article

ColabFold: making protein folding accessible to all

by Moriwaki, Yoshitaka , Schütze, Konstantin , Ovchinnikov, Sergey in 631/114/129/2044 , 631/114/2397 , 631/114/2411

2022

ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold’s 40−60-fold faster search and optimized model utilization enables prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com . ColabFold is a free and accessible platform for protein folding that provides accelerated prediction of protein structures and complexes using AlphaFold2 or RoseTTAFold.

Journal Article

Multi-omics single-cell data integration and regulatory inference with graph-linked embedding

by Gao, Ge , Cao, Zhi-Jie in 631/114/1305 , 631/114/2401 , 631/114/794

2022

Despite the emergence of experimental methods for simultaneous measurement of multiple omics modalities in single cells, most single-cell datasets include only one modality. A major obstacle in integrating omics data from multiple modalities is that different omics layers typically have distinct feature spaces. Here, we propose a computational framework called GLUE (graph-linked unified embedding), which bridges the gap by modeling regulatory interactions across omics layers explicitly. Systematic benchmarking demonstrated that GLUE is more accurate, robust and scalable than state-of-the-art tools for heterogeneous single-cell multi-omics data. We applied GLUE to various challenging tasks, including triple-omics integration, integrative regulatory inference and multi-omics human cell atlas construction over millions of cells, where GLUE was able to correct previous annotations. GLUE features a modular design that can be flexibly extended and enhanced for new analysis tasks. The full package is available online at https://github.com/gao-lab/GLUE . Different single-cell data modalities are integrated at atlas-scale by modeling regulatory interactions.

Journal Article

Haplotype-resolved assembly of diploid genomes without parental data

by Jarvis, Erich D. , Fedrigo, Olivier , Gemmell, Neil J. in 631/114/2785/2302 , 631/114/794 , Agriculture

2022

Routine haplotype-resolved genome assembly from single samples remains an unresolved problem. Here we describe an algorithm that combines PacBio HiFi reads and Hi-C chromatin interaction data to produce a haplotype-resolved assembly without the sequencing of parents. Applied to human and other vertebrate samples, our algorithm consistently outperforms existing single-sample assembly pipelines and generates assemblies of similar quality to the best pedigree-based assemblies. Haplotype-resolved genome assemblies are generated by combining HiFi reads with Hi-C long-range interactions.

Journal Article

Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm

by Concepcion, Gregory T , Cheng, Haoyu , Li, Heng in Accuracy , Algorithms , Assemblies

2021

Haplotype-resolved de novo assembly is the ultimate solution to the study of sequence variations in a genome. However, existing algorithms either collapse heterozygous alleles into one consensus copy or fail to cleanly separate the haplotypes to produce high-quality phased assemblies. Here we describe hifiasm, a de novo assembler that takes advantage of long high-fidelity sequence reads to faithfully represent the haplotype information in a phased assembly graph. Unlike other graph-based assemblers that only aim to maintain the contiguity of one haplotype, hifiasm strives to preserve the contiguity of all haplotypes. This feature enables the development of a graph trio binning algorithm that greatly advances over standard trio binning. On three human and five nonhuman datasets, including California redwood with a ~30-Gb hexaploid genome, we show that hifiasm frequently delivers better assemblies than existing tools and consistently outperforms others on haplotype-resolved assembly. Hifiasm is a haplotype-resolved de novo genome assembler for long-read high-fidelity sequencing data based on phased assembly graphs.

Journal Article

A Python library for probabilistic analysis of single-cell omics data

by Streets, Aaron , Nazaret, Achille , Svensson, Valentine in 631/114/1305 , 631/114/2397 , 631/114/2415

2022

Journal Article
