Catalogue Search | MBRL

Benchmarking single-cell hashtag oligo demultiplexing methods

by Neeland, Melanie , Dawson, Mark A , Vassiliadis, Dane in Algorithms , Bioinformatics , Cells

2023

Sample multiplexing is often used to reduce cost and limit batch effects in single-cell RNA sequencing (scRNA-seq) experiments. A commonly used multiplexing technique involves tagging cells prior to pooling with a hashtag oligo (HTO) that can be sequenced along with the cells’ RNA to determine their sample of origin. Several tools have been developed to demultiplex HTO sequencing data and assign cells to samples. In this study, we critically assess the performance of seven HTO demultiplexing tools: hashedDrops, HTODemux, GMM-Demux, demuxmix, deMULTIplex, BFF (bimodal flexible fitting) and HashSolo. The comparison uses data sets where each sample has also been demultiplexed using genetic variants from the RNA, enabling comparison of HTO demultiplexing techniques against complementary data from the genetic ‘ground truth’. We find that all methods perform similarly where HTO labelling is of high quality, but methods that assume a bimodal count distribution perform poorly on lower quality data. We also suggest heuristic approaches for assessing the quality of HTO counts in an scRNA-seq experiment.

Journal Article

A comparison on predicting functional impact of genomic variants

by Wang, Edwin , Wang, Dong , Wang, Yadong in Accuracy , Bioinformatics , Computer applications

2022

Single-nucleotide polymorphisms (SNPs) may have diverse functional impacts on RNA or protein, changing genotype and phenotype, which may lead to common or complex diseases such as cancers. Accurate prediction of the functional impact of SNPs is crucial for discovering the ‘influential’ (deleterious, pathogenic, disease-causing, and predisposing) variants among the massive background polymorphisms in the human genome. A growing number of computational methods have been developed to predict the functional impact of variants. However, the predictive performance of these methods on massive genomic variants is still unclear. In this regard, we systematically evaluated 14 important computational methods, including specific methods for one type of variant and general methods for multiple types of variants, from several aspects; none of these methods achieved excellent (AUC ≥ 0.9) performance in both data sets. CADD and REVEL achieved excellent performance on multiple types of variants and missense variants, respectively. This comparison aims to assist researchers and clinicians in selecting appropriate methods or developing better predictive methods.

Journal Article

Alternative splicing analysis benchmark with DICAST

by Fenn, Amit , Louadi, Zakaria , Kacprowski, Tim in Alternative splicing , Bioinformatics , Datasets

2023

Alternative splicing is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools have been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools, which arguably provide the most detailed insights into the alternative splicing process. DICAST offers a modular and extensible framework for analysing alternative splicing, integrating eleven splice-aware mapping tools and eight event detection tools. We benchmark all tools extensively on simulated as well as whole blood RNA-seq data. STAR and HISAT2 demonstrated the best balance between performance and run time. The performance of event detection tools varies widely, with no tool outperforming all others. DICAST allows researchers to employ a consensus approach to consider the most successful tools jointly for robust event detection. Furthermore, we propose the first reporting standard to unify existing formats and to guide future tool development.

Journal Article

Benchmarking unsupervised methods for inferring TCR specificity

by Gouge, Kenz Le , Klatzmann, David , Jouannet, Charline in Adaptive immunity , Algorithms , Antigens

2025

Identifying T-cell receptor (TCR) specificity is crucial for advancing the understanding of adaptive immunity. Despite the development of computational methods to infer TCR specificity, their clustering behavior has not been thoroughly compared. We addressed this by curating a unified database of 190 670 human TCRs with known specificities for 2313 epitopes across 121 organisms, combining data from IEDB, McPAS-TCR, and VDJdb. We asked whether widely used TCR clustering methods produce comparable results on the same high-confidence dataset. We hypothesized that shared assumptions about conserved CDR3 motifs would yield similar patterns, with differences reflecting algorithmic design. Nine methods for clustering TCRs based on similarity were benchmarked against this dataset. DeepTCR demonstrated the best retention, while ClusTCR, TCRMatch, and GLIPH2 excelled in cluster purity but had lower retention. GLIPH2, Levenshtein distance, Hamming distance, and clusTCR generated large clusters in contrast to TCRMatch and DeepTCR. Smaller, antigen-specific clusters were produced by GIANA and iSMART. DeepTCR was the most sensitive in capturing antigen-specific TCRs. We confirmed these observations using a larger dataset from 10X Genomics containing antigen-specific labeled TCRs as well as non-labeled cells. This study offers a unified TCR database and a benchmark of specificity inference methods, guiding researchers in selecting appropriate tools.

Journal Article

Benchmarking of methods that identify alternative polyadenylation events in single-/multiple-polyadenylation site genes

by Tian, Qiuxiang , Zou, Quan , Jia, Linpei in 3' Untranslated regions , Algorithms , Benchmarking

2025

Alternative polyadenylation (APA) is a widespread post-transcriptional mechanism that diversifies gene expression by generating messenger RNA isoforms with varying 3′ untranslated regions. Accurate identification and quantification of transcriptome-wide polyadenylation site (PAS) usage are essential for understanding APA-mediated gene regulation and its biological implications. In this review, we first survey the landscape of computational tools developed to identify APA events from RNA sequencing (RNA-seq) data. We then benchmark five PAS prediction tools and seven APA detection algorithms using five RNA-seq datasets derived from clear cell renal cell carcinoma (ccRCC) and adjacent normal tissues. By evaluating tool performance across genes with either single or multiple PASs, we reveal substantial variation in accuracy, sensitivity, and consistency among the tools. Based on this comparative analysis, we offer practical guidelines for tool selection and propose considerations for improving APA detection accuracy. Additionally, our analysis identified CCNL2 as a candidate gene exhibiting significant APA regulation in ccRCC, highlighting its potential as a disease-associated biomarker.

Journal Article

Benchmarking computational methods for B-cell receptor reconstruction from single-cell RNA-seq data

by Slot, Linda M , Olfati-Saber, Reza , Andreani, Tommaso in Antibodies , Antigens , Autoimmune diseases

2022

Multiple methods have recently been developed to reconstruct full-length B-cell receptors (BCRs) from single-cell RNA sequencing (scRNA-seq) data. This need emerged from the expansion of scRNA-seq techniques, the increasing interest in antibody-based drug development and the importance of BCR repertoire changes in cancer and autoimmune disease progression. However, a comprehensive assessment of performance-influencing factors such as the sequencing depth, read length or number of somatic hypermutations (SHMs) as well as guidance regarding the choice of methodology is still lacking. In this work, we evaluated the ability of six available methods to reconstruct full-length BCRs using one simulated and three experimental SMART-seq datasets. In addition, we validated that the BCRs assembled in silico recognize their intended targets when expressed as monoclonal antibodies. We observed that methods such as BALDR, BASIC and BRACER showed the best overall performance across the tested datasets and conditions, whereas only BASIC demonstrated acceptable results on very short read libraries. Furthermore, the de novo assembly-based methods BRACER and BALDR were the most accurate in reconstructing BCRs harboring different degrees of SHMs in the variable domain, while TRUST4, MiXCR and BASIC were the fastest. Finally, we propose guidelines to select the best method based on the given data characteristics.

Journal Article

State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction

by Bernard, Clément , Postic, Guillaume , Ghannay, Sahar in Bioinformatics , Computer applications , Datasets

2024

RNAs are essential molecules involved in numerous biological functions. Understanding RNA functions requires knowledge of their 3D structures. Computational methods have been developed for over two decades to predict 3D conformations from RNA sequences. These computational methods have been widely used and are usually categorised as either ab initio or template-based. Their performance remains to be improved. Recently, the rise of deep learning has changed the outlook for novel approaches. Deep learning methods are promising, but their adaptation to RNA 3D structure prediction remains difficult. In this paper, we give a brief review of the ab initio, template-based and novel deep learning approaches. We highlight the different available tools and provide a benchmark of nine methods using the RNA-Puzzles dataset. We provide an online dashboard that shows the predictions made by the benchmarked methods, freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/state_of_the_rnart/.

Journal Article

Egocentric social networks and social interactions in the Greater Tokyo Area

by Harata Noboru , Takami Kiyoshi , Parady Giancarlos in Benchmark tests , Benchmarks , Interpersonal communication

2021

This article presents the results of a survey on egocentric social networks in the Greater Tokyo Area. This is, together with our preliminary study, the first study on egocentric social networks in Japan that uses an unrestricted name generator to elicit personal networks. It is comparable to previous work conducted in Europe (Switzerland and the Netherlands) and the Americas (Canada and Chile). In addition to a thorough description of the survey design and execution process, basic results regarding network characteristics and social interaction patterns, and estimation results of a multilevel multivariate mixed effect model of social contact frequency by mode, are presented and compared against relevant benchmark data. The information provided in this article and the supplementary documents will allow its use as a new benchmark study in research on social networks and social interactions.

Journal Article

A Survey of Visual SLAM Based on RGB-D Images Using Deep Learning and Comparative Study for VOE

by Nguyen, Thi-Ha-Phuong , Le, Van-Hung in Agricultural production , Annotations , Benchmarks

2025

Visual simultaneous localization and mapping (Visual SLAM) based on RGB-D image data includes two main tasks: one is to build an environment map, and the other is to simultaneously track position and movement through visual odometry estimation (VOE). Visual SLAM and VOE are used in many applications, such as robot systems, autonomous mobile robots, assistance systems for the blind, human–machine interaction, and industry. To solve the computer vision problems in Visual SLAM and VOE from RGB-D images, deep learning (DL) is an approach that gives very convincing results. This manuscript examines the results, advantages, difficulties, and challenges of Visual SLAM and VOE based on DL. A taxonomy is proposed to conduct a complete survey based on three methods of constructing Visual SLAM and VOE from RGB-D images: (1) using DL for the modules of the Visual SLAM and VOE systems; (2) using DL to supplement the modules of Visual SLAM and VOE systems; and (3) using end-to-end DL to build Visual SLAM and VOE systems. A total of 220 scientific publications on Visual SLAM, VOE, and related issues were surveyed. The studies were surveyed in the order of methods, datasets, evaluation measures, and detailed results. In particular, studies on using DL to build Visual SLAM and VOE systems were analyzed for their challenges, advantages, and disadvantages. We also proposed and published the TQU-SLAM benchmark dataset, and a comparative study on fine-tuning the VOE model using a Multi-Layer Fusion network (MLF-VO) framework was performed. The VOE errors on the TQU-SLAM benchmark dataset range from 16.97 m to 57.61 m. This is a huge error compared to the VOE methods on the KITTI, TUM RGB-D SLAM, and ICL-NUIM datasets. Therefore, the dataset we publish is very challenging, especially in the opposite direction (OP-D) when collecting and annotating data. The results of the comparative study are also presented in detail and made available.

Journal Article

Using Calibration Weighting to Adjust for Nonignorable Unit Nonresponse

by Kott, Phillip S. , Chang, Ted in Applications , Benchmark variable , Benchmarks

2010

When calibration weighting is used to adjust for unit nonresponse in a sample survey, the response/nonresponse mechanism is often assumed to be a function of a set of covariates, which we call "model variables." These model variables usually also serve as the benchmark variables in the calibration equation. In principle, however, the model variables do not have to coincide with the benchmark variables. Since the model-variable values need only be known for the respondents, this allows the treatment of what is usually considered nonignorable nonresponse in the prediction approach to survey sampling. One can invoke either a quasi-randomization or prediction approach to justify calibration weighting as a means for adjusting for nonresponse. Both frameworks rely on unverifiable model assumptions, and both require large samples to produce nearly unbiased estimators even when those assumptions hold. We will explore these issues theoretically using a joint framework and with an empirical study.

Journal Article
