Catalogue Search | MBRL

LIVECell—A large-scale dataset for label-free live cell segmentation

by Edlund, Christoffer , Dale, Timothy , Khalid, Nabeel in Artificial neural networks , Benchmarks , Cell culture

2021

Light microscopy combined with well-established protocols of two-dimensional cell culture facilitates high-throughput quantitative imaging to study biological phenomena. Accurate segmentation of individual cells in images enables exploration of complex biological questions, but can require sophisticated image processing pipelines in cases of low contrast and high object density. Deep learning-based methods are considered state-of-the-art for image segmentation but typically require vast amounts of annotated data, for which there is no suitable resource available in the field of label-free cellular imaging. Here, we present LIVECell, a large, high-quality, manually annotated and expert-validated dataset of phase-contrast images, consisting of over 1.6 million cells from a diverse set of cell morphologies and culture densities. To further demonstrate its use, we train convolutional neural network-based models using LIVECell and evaluate model segmentation accuracy with a proposed suite of benchmarks. The LIVECell dataset comprises annotated phase-contrast images of over 1.6 million cells from different cell lines during growth from sparse seeding to confluence for improved training of deep learning-based models of image segmentation.

Journal Article

Haralick texture features from apparent diffusion coefficient (ADC) MRI images depend on imaging and pre-processing parameters

by Brynolfsson, Patrik , Nilsson, David , Karlsson, Camilla Thellenberg in 59/57 , 631/114/1564 , 631/67/2321

2017

In recent years, texture analysis of medical images has become increasingly popular in studies investigating diagnosis, classification and treatment response assessment of cancerous disease. Despite numerous applications in oncology and medical imaging in general, there is no consensus regarding texture analysis workflow, or reporting of parameter settings crucial for replication of results. The aim of this study was to assess how sensitive Haralick texture features of apparent diffusion coefficient (ADC) MR images are to changes in five parameters related to image acquisition and pre-processing: noise, resolution, how the ADC map is constructed, the choice of quantization method, and the number of gray levels in the quantized image. We found that noise, resolution, choice of quantization method and the number of gray levels in the quantized images had a significant influence on most texture features, and that the effect size varied between different features. Different methods for constructing the ADC maps did not have an impact on any texture feature. Based on our results, we recommend using images with similar resolutions and noise levels, using one quantization method, and the same number of gray levels in all quantized images, to make meaningful comparisons of texture feature results between different subjects.
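The dependence on the number of gray levels can be illustrated with a small sketch. It is not the study's code: the uniform min-max quantization, the single horizontal neighbour offset, the random test image and the function names `quantize`, `glcm` and `contrast` are all assumptions chosen for brevity. It computes one Haralick feature (contrast) from a gray-level co-occurrence matrix at several quantization settings.

```python
import numpy as np

def quantize(image, levels):
    """Uniformly quantize an image to integer gray levels 0..levels-1 (assumed scheme)."""
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:
        return np.zeros(image.shape, dtype=int)
    scaled = (image - lo) / (hi - lo)  # map intensities to [0, 1]
    return np.minimum((scaled * levels).astype(int), levels - 1)

def glcm(q, levels):
    """Symmetric, normalized co-occurrence matrix for horizontal neighbours."""
    m = np.zeros((levels, levels), dtype=float)
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    np.add.at(m, (left, right), 1.0)  # count each (left, right) pixel pair
    m = m + m.T                       # make the matrix symmetric
    return m / m.sum()

def contrast(p):
    """Haralick contrast: sum over i, j of p(i, j) * (i - j)^2."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

# Hypothetical noise image standing in for an ADC map.
rng = np.random.default_rng(0)
image = rng.normal(size=(64, 64))
results = {n: contrast(glcm(quantize(image, n), n)) for n in (8, 16, 32)}
for n, value in results.items():
    print(n, round(value, 2))
```

The same image yields a different contrast value at each setting, which is why a fixed number of gray levels is needed before comparing feature values between subjects.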

Journal Article

Multiblock variable influence on orthogonal projections (MB-VIOP) for enhanced interpretation of total, global, local and unique variations in OnPLS models

by Galindo-Prieto, Beatriz , Geladi, Paul , Trygg, Johan in Algorithms , Analysis , Bioinformatics

2021

Background: For multivariate data analysis involving only two input matrices (e.g., X and Y), the previously published methods for variable influence on projection (e.g., VIP OPLS or VIP O2PLS) are widely used for variable selection purposes, including (i) variable importance assessment, (ii) dimensionality reduction of big data and (iii) interpretation enhancement of PLS, OPLS and O2PLS models. For multiblock analysis, the OnPLS models find relationships among multiple data matrices (more than two blocks) by calculating latent variables; however, a method for improving the interpretation of these latent variables (model components) by assessing the importance of the input variables was not available up to now. Results: A method for variable selection in multiblock analysis, called multiblock variable influence on orthogonal projections (MB-VIOP), is explained in this paper. MB-VIOP is a model-based variable selection method that uses the data matrices, the scores and the normalized loadings of an OnPLS model in order to sort the input variables of more than two data matrices according to their importance for both simplification and interpretation of the total multiblock model, and also of the unique, local and global model components separately. MB-VIOP has been tested using three datasets: a synthetic four-block dataset, a real three-block omics dataset related to plant sciences, and a real six-block dataset related to the food industry. Conclusions: We provide evidence for the usefulness and reliability of MB-VIOP by means of three examples (one synthetic and two real-world cases). MB-VIOP assesses in a trustable and efficient way the importance of both isolated and ranges of variables in any type of data. MB-VIOP connects the input variables of different data matrices according to their relevance for the interpretation of each latent variable, yielding enhanced interpretability for each OnPLS model component. Besides, MB-VIOP can deal with strong overlapping of types of variation, as well as with many data blocks with very different dimensionality. The ability of MB-VIOP to generate dimensionality-reduced models with high interpretability makes this method ideal for big data mining, multi-omics data integration and any study that requires exploration and interpretation of large streams of data.

Journal Article

Statistical analysis in metabolic phenotyping

by Blaise, Benjamin J. , Pearce, Jake T. M. , Holmes, Elaine in 631/114/2415 , 631/114/794 , 631/45/320

2021

Metabolic phenotyping is an important tool in translational biomedical research. The advanced analytical technologies commonly used for phenotyping, including mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy, generate complex data requiring tailored statistical analysis methods. Detailed protocols have been published for data acquisition by liquid NMR, solid-state NMR, ultra-performance liquid chromatography (LC-)MS and gas chromatography (GC-)MS on biofluids or tissues and their preprocessing. Here we propose an efficient protocol (guidelines and software) for statistical analysis of metabolic data generated by these methods. Code for all steps is provided, and no prior coding skill is necessary. We offer efficient solutions for the different steps required within the complete phenotyping data analytics workflow: scaling, normalization, outlier detection, multivariate analysis to explore and model study-related effects, selection of candidate biomarkers, validation, multiple testing correction and performance evaluation of statistical models. We also provide a statistical power calculation algorithm and safeguards to ensure robust and meaningful experimental designs that deliver reliable results. We exemplify the protocol with a two-group classification study and data from an epidemiological cohort; however, the protocol can be easily modified to cover a wider range of experimental designs or incorporate different modeling approaches. This protocol describes a minimal set of analyses needed to rigorously investigate typical datasets encountered in metabolic phenotyping. Metabolomics studies using large-scale NMR or mass spectrometry experiments on biofluids or tissues generate complex data. This protocol provides guidelines and software (supplied in Jupyter notebooks) for the statistical analysis of these data.
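As an illustration of the multiple testing correction step named in the workflow, the sketch below adjusts a set of p-values with the Benjamini-Hochberg procedure. The abstract does not say which correction the protocol uses, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg` and the example p-values are assumptions, and this is not the protocol's own notebook code.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices of p-values, ascending
    ranked = p[order] * n / np.arange(1, n + 1)    # p * n / rank
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)      # cap at 1 and restore input order
    return adjusted

# Hypothetical p-values for four candidate metabolites.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

Candidates whose adjusted value falls below the chosen false discovery rate would be retained for validation.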

Journal Article

Metabolic Profiling of Systemic Lupus Erythematosus and Comparison with Primary Sjögren’s Syndrome and Systemic Sclerosis

by Moritz, Thomas , Torell, Frida , Surowiec, Izabella in Adult , Amino acids , Autoimmune diseases

2016

Systemic lupus erythematosus (SLE) is a chronic inflammatory autoimmune disease which can affect most organ systems including skin, joints and the kidney. Clinically, SLE is a heterogeneous disease and shares features of several other rheumatic diseases, in particular primary Sjögren’s syndrome (pSS) and systemic sclerosis (SSc), which makes it difficult to diagnose. The pathogenesis of SLE is not completely understood, partly due to the heterogeneity of the disease. This study demonstrates that metabolomics can be used as a tool for improved diagnosis of SLE compared to other similar autoimmune diseases. We observed differences in metabolic profiles with a classification specificity above 67% in the comparison of SLE with pSS, SSc and a matched group of healthy individuals. Selected metabolites were also significantly different between studied diseases. Biochemical pathway analysis was conducted to gain understanding of underlying pathways involved in the SLE pathogenesis. We found an increased oxidative activity in SLE, supported by increased xanthine oxidase activity and an increased turnover in the urea cycle. The most discriminatory metabolite observed was tryptophan, with decreased levels in SLE patients compared to control groups. Changes of tryptophan levels were related to changes in the activity of the aromatic amino acid decarboxylase (AADC) and/or to activation of the kynurenine pathway.

Journal Article

Metabolite and Lipid Profiling of Biobank Plasma Samples Collected Prior to Onset of Rheumatoid Arthritis

by Ärlestig, Lisbeth , Surowiec, Izabella , Rantapää-Dahlqvist, Solbritt in Adult , Antiarthritic agents , Area Under Curve

2016

The early diagnosis of rheumatoid arthritis (RA) is desirable so that treatment can be started to prevent disease progression and joint destruction. Autoantibodies and immunological markers pre-date the onset of symptoms by years, although not all patients will present these factors, even at disease onset. Additional biomarkers would be of high value to improve early diagnosis and understanding of the process leading to disease development. Plasma samples donated before the onset of RA were identified in the Biobank of Northern Sweden, a collection within national health survey programs. Thirty samples from pre-symptomatic individuals and nineteen from controls were subjected to liquid chromatography-mass spectrometry (LCMS) metabolite and lipid profiling. Lipid and metabolite profiles discriminating samples from pre-symptomatic individuals from controls were identified after univariate and multivariate OPLS-DA based analyses. The OPLS-DA models including pre-symptomatic individuals and controls identified profiles differentiating between the groups that were characterized by lower levels of acyl-carnitines and fatty acids, with higher levels of lysophosphatidylcholines (LPCs) and metabolites from tryptophan metabolism in pre-symptomatic individuals compared with controls. Lipid profiling showed that the majority of phospholipids and sphingomyelins were at higher levels in pre-symptomatic individuals in comparison with controls. Our LCMS based approach demonstrated that there are changes in small molecule and lipid profiles detectable in plasma samples collected from the pre-symptomatic individuals who subsequently developed RA, which point to an up-regulation of levels of lysophosphatidylcholines and of tryptophan metabolism, perturbation of fatty acid beta-oxidation and increased oxidative stress in pre-symptomatic individuals years before onset of symptoms.

Journal Article

Serine Protease Inhibitors Restrict Host Susceptibility to SARS-CoV-2 Infections

by Forsell, Mattias , Lenman, Annasara , Das, Debojyoti in A1AT , alpha 1-Antitrypsin , antithrombin III

2022

Identification of host factors affecting individual SARS-CoV-2 susceptibility will provide a better understanding of the large variations in disease severity and will identify potential factors that can be used, or targeted, in antiviral drug development. With the use of an advanced lung cell model established from several human donors, we identified cellular protease inhibitors, serpins, as host factors that restrict SARS-CoV-2 infection. The coronavirus disease 2019, COVID-19, is a complex disease with a wide range of symptoms from asymptomatic infections to severe acute respiratory syndrome with lethal outcome. Individual factors such as age, sex, and comorbidities increase the risk for severe infections, but other aspects, such as genetic variations, are also likely to affect the susceptibility to SARS-CoV-2 infection and disease severity. Here, we used a human 3D lung cell model based on primary cells derived from multiple donors to identify host factors that regulate SARS-CoV-2 infection. With a transcriptomics-based approach, we found that less susceptible donors show a higher expression level of serine protease inhibitors SERPINA1, SERPINE1, and SERPINE2, identifying variation in cellular serpin levels as restricting host factors for SARS-CoV-2 infection. We pinpoint their antiviral mechanism of action to inhibition of the cellular serine protease, TMPRSS2, thereby preventing cleavage of the viral spike protein and TMPRSS2-mediated entry into the target cells. By means of single-cell RNA sequencing, we further locate the expression of the individual serpins to basal, ciliated, club, and goblet cells. Our results add to the importance of genetic variations as determinants for SARS-CoV-2 susceptibility and suggest that genetic deficiencies of cellular serpins might represent risk factors for severe COVID-19. 
Our study further highlights TMPRSS2 as a promising target for antiviral intervention and opens the door for the usage of locally administered serpins as a treatment against COVID-19. IMPORTANCE Identification of host factors affecting individual SARS-CoV-2 susceptibility will provide a better understanding of the large variations in disease severity and will identify potential factors that can be used, or targeted, in antiviral drug development. With the use of an advanced lung cell model established from several human donors, we identified cellular protease inhibitors, serpins, as host factors that restrict SARS-CoV-2 infection. The antiviral mechanism was found to be mediated by the inhibition of a serine protease, TMPRSS2, which results in a blockage of viral entry into target cells. Potential treatments with these serpins would not only reduce the overall viral burden in the patients, but also block the infection at an early time point, reducing the risk for the hyperactive immune response common in patients with severe COVID-19.

Journal Article

Metabolic profiling of zebrafish embryo development from blastula period to early larval stages

by Lundstedt-Enkel, Katrin , Rännar, Stefan , Bennett, Kate in Amino acids , Anatomy & physiology , Animals

2019

The zebrafish embryo is a popular model for drug screening, disease modelling and molecular genetics. In this study, samples were obtained from zebrafish at different developmental stages. The stages that were chosen were 3/4, 4/5, 24, 48, 72 and 96 hours post fertilization (hpf). Each sample included fifty embryos. The samples were analysed using gas chromatography time-of-flight mass spectrometry (GC-TOF-MS). Principal component analysis (PCA) was applied to get an overview of the data and orthogonal projection to latent structure discriminant analysis (OPLS-DA) was utilised to discriminate between the developmental stages. In this way, changes in metabolite profiles during vertebrate development could be identified. Using a GC-TOF-MS metabolomics approach it was found that nucleotides and metabolic fuel (glucose) were elevated at early stages of embryogenesis, whereas at later stages amino acids and intermediates in the Krebs cycle were abundant. This agrees with zebrafish developmental biology, as organs such as the liver and pancreas develop at later stages. Thus, metabolomics of zebrafish embryos offers a unique opportunity to investigate large scale changes in metabolic processes during important developmental stages in vertebrate development. In terms of stability of the metabolic profile and viability of the embryos, it was concluded that 72 hpf was a suitable time point for the use of zebrafish as a model system in numerous scientific applications.

Journal Article

BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA–miRNA interaction prediction

by Dengel, Andreas , Ibrahim, Muhammad Ali , Asim, Muhammad Nabeel in Accuracy , Benchmarks , Correlation coefficient

2022

Background and objective: Interactions of long non-coding ribonucleic acids (lncRNAs) with micro-ribonucleic acids (miRNAs) play an essential role in gene regulation, cellular metabolic, and pathological processes. Existing purely sequence based computational approaches lack robustness and efficiency mainly due to the high length variability of lncRNA sequences. Hence, the prime focus of the current study is to find optimal length trade-offs between highly flexible length lncRNA sequences. Method: The paper at hand performs in-depth exploration of diverse copy padding, sequence truncation approaches, and presents a novel idea of utilizing only subregions of lncRNA sequences to generate fixed-length lncRNA sequences. Furthermore, it presents a novel bag of tricks-based deep learning approach “BoT-Net” which leverages a single layer long-short-term memory network regularized through DropConnect to capture higher order residue dependencies, pooling to retain most salient features, normalization to prevent exploding and vanishing gradient issues, learning rate decay, and dropout to regularize precise neural network for lncRNA–miRNA interaction prediction. Results: BoT-Net outperforms the state-of-the-art lncRNA–miRNA interaction prediction approach by 2%, 8%, and 4% in terms of accuracy, specificity, and Matthews correlation coefficient. Furthermore, a case study analysis indicates that BoT-Net also outperforms state-of-the-art lncRNA–protein interaction predictor on a benchmark dataset by accuracy of 10%, sensitivity of 19%, specificity of 6%, precision of 14%, and Matthews correlation coefficient of 26%. Conclusion: In the benchmark lncRNA–miRNA interaction prediction dataset, the length of the lncRNA sequence varies from 213 residues to 22,743 residues and in the benchmark lncRNA–protein interaction prediction dataset, lncRNA sequences vary from 15 residues to 1504 residues. For such highly flexible length sequences, fixed length generation using copy padding introduces a significant level of bias which makes a large number of lncRNA sequences nearly identical to each other and eventually derails classifier generalizability. Empirical evaluation reveals that within 50 residues of only the starting region of long lncRNA sequences, a highly informative distribution for lncRNA–miRNA interaction prediction is contained, a crucial finding exploited by the proposed BoT-Net approach to optimize the lncRNA fixed length generation process. Availability: BoT-Net web server can be accessed at https://sds_genetic_analysis.opendfki.de/lncmiRNA/.
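The two fixed-length strategies contrasted in the abstract, copy padding and keeping only the starting region, can be sketched as below. This is a minimal illustration and not the BoT-Net implementation: the function names, the rule for combining the two strategies in `fixed_length`, and the toy sequences are assumptions; only the window of 50 residues comes from the abstract.

```python
def copy_pad(seq, length):
    """Repeat the sequence until it reaches `length` residues (copy padding)."""
    if not seq:
        raise ValueError("cannot pad an empty sequence")
    repeats = -(-length // len(seq))  # ceiling division
    return (seq * repeats)[:length]

def truncate_start(seq, length):
    """Keep only the first `length` residues (the starting region)."""
    return seq[:length]

def fixed_length(seq, length=50):
    """Hypothetical rule: truncate long sequences, copy-pad short ones."""
    if len(seq) >= length:
        return truncate_start(seq, length)
    return copy_pad(seq, length)

# Toy sequences; real lncRNA inputs range from hundreds to thousands of residues.
short_seq = "ACGU"
long_seq = "ACGUUGCA" * 100
print(fixed_length(short_seq, 10))
print(len(fixed_length(long_seq)))
```

Copy padding a very short sequence to a long target length produces mostly repeated content, which is the source of the bias the abstract describes.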

Journal Article

doepipeline: a systematic approach to optimizing multi-level and multi-step data processing workflows

by Sjödin, Andreas , Svensson, Daniel , Sundell, David in Algorithms , Analysis , Assembly

2019

Background: Selecting the proper parameter settings for bioinformatic software tools is challenging. Not only will each parameter have an individual effect on the outcome, but there are also potential interaction effects between parameters. Both of these effects may be difficult to predict. To make the situation even more complex, multiple tools may be run in a sequential pipeline where the final output depends on the parameter configuration for each tool in the pipeline. Because of the complexity and difficulty of predicting outcomes, in practice parameters are often left at default settings or set based on personal or peer experience obtained in a trial-and-error fashion. To allow for the reliable and efficient selection of parameters for bioinformatic pipelines, a systematic approach is needed. Results: We present doepipeline, a novel approach to optimizing bioinformatic software parameters, based on core concepts of the Design of Experiments methodology and recent advances in subset designs. Optimal parameter settings are first approximated in a screening phase using a subset design that efficiently spans the entire search space, then optimized in the subsequent phase using response surface designs and OLS modeling. Doepipeline was used to optimize parameters in four use cases: 1) de-novo assembly, 2) scaffolding of a fragmented genome assembly, 3) k-mer taxonomic classification of Oxford Nanopore Technologies MinION reads, and 4) genetic variant calling. In all four cases, doepipeline found parameter settings that produced a better outcome with respect to the characteristic measured when compared to using default values. Our approach is implemented and available in the Python package doepipeline. Conclusions: Our proposed methodology provides a systematic and robust framework for optimizing software parameter settings, in contrast to labor- and time-intensive manual parameter tweaking. Implementation in doepipeline makes our methodology accessible and user-friendly, and allows for automatic optimization of tools in a wide range of cases. The source code of doepipeline is available at https://github.com/clicumu/doepipeline and it can be installed through conda-forge.
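The two-phase idea, a coarse screening pass followed by a response surface fitted with OLS, can be sketched with NumPy alone. This does not use or reproduce the doepipeline API: the `response` function is a hypothetical stand-in for running a real pipeline, a plain coarse grid replaces the subset design used in the paper, and all names and parameter ranges are assumptions.

```python
import itertools
import numpy as np

def response(x1, x2):
    """Hypothetical quality score; stands in for running the real tools."""
    return -(x1 - 3.0) ** 2 - 2.0 * (x2 - 1.0) ** 2 + 10.0

def quadratic_design_matrix(points):
    """Model terms: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = points[:, 0], points[:, 1]
    return np.column_stack([np.ones(len(points)), x1, x2, x1 * x2, x1**2, x2**2])

# Phase 1, screening: evaluate a coarse grid and keep the best point.
coarse = np.array(list(itertools.product([0.0, 4.0, 8.0], [0.0, 1.5, 3.0])))
scores = np.array([response(a, b) for a, b in coarse])
centre = coarse[np.argmax(scores)]

# Phase 2, optimization: three-level factorial design around the best point.
offsets = np.array(list(itertools.product([-1.0, 0.0, 1.0], repeat=2)))
design = centre + offsets
y = np.array([response(a, b) for a, b in design])

# Fit the quadratic response surface by ordinary least squares.
coef, *_ = np.linalg.lstsq(quadratic_design_matrix(design), y, rcond=None)
b0, b1, b2, b12, b11, b22 = coef

# The stationary point of the fitted surface solves gradient = 0.
hessian = np.array([[2.0 * b11, b12], [b12, 2.0 * b22]])
optimum = np.linalg.solve(hessian, -np.array([b1, b2]))
print(centre, optimum)
```

In a real setting each call to `response` would be a full pipeline run, so the value of the design lies in how few runs are needed before the model can propose improved settings.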

Journal Article
