Catalogue Search | MBRL
9,026 result(s) for "analysis workflow"
ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data
2020
Background
Metagenomics studies provide valuable insight into the composition and function of microbial populations from diverse environments; however, the data processing pipelines that rely on mapping reads to gene catalogs or genome databases for cultured strains yield results that underrepresent the genes and functional potential of uncultured microbes. Recent improvements in sequence assembly methods have eased the reliance on genome databases, thereby allowing the recovery of genomes from uncultured microbes. However, configuring these tools, linking them with advanced binning and annotation tools, and maintaining provenance of the processing continues to be challenging for researchers.
Results
Here we present ATLAS, a software package for customizable data processing from raw sequence reads to functional and taxonomic annotations, using state-of-the-art tools to assemble, annotate, quantify, and bin metagenome data. Abundance estimates at genome resolution are provided for each sample in a dataset. ATLAS is written in Python, and the workflow is implemented in Snakemake; it operates in a Linux environment and is compatible with Python 3.5+ and Anaconda 3+. The source code for ATLAS is freely available, distributed under a BSD-3 license.
Conclusions
ATLAS provides a user-friendly, modular, and customizable Snakemake workflow for metagenome data processing; it is easily installable with conda and maintained as open source on GitHub at https://github.com/metagenome-atlas/atlas.
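Because each processing stage in a Snakemake workflow is declared as a rule whose inputs and outputs form a dependency graph, a minimal two-rule sketch conveys the pattern. This is a hypothetical illustration, not ATLAS's actual rules; the tools (fastp, MEGAHIT), paths, and sample names are placeholders.

```python
# Snakefile (Snakemake's rule DSL is a Python superset) — a hypothetical
# sketch of chaining read QC into assembly; not ATLAS's actual rules.
SAMPLES = ["sampleA", "sampleB"]  # placeholder sample IDs

rule all:
    input:
        expand("assembly/{sample}/contigs.fasta", sample=SAMPLES)

rule qc:
    input:
        r1="raw/{sample}_R1.fastq.gz",
        r2="raw/{sample}_R2.fastq.gz",
    output:
        r1="qc/{sample}_R1.fastq.gz",
        r2="qc/{sample}_R2.fastq.gz",
    shell:
        "fastp -i {input.r1} -I {input.r2} -o {output.r1} -O {output.r2}"

rule assemble:
    input:
        r1="qc/{sample}_R1.fastq.gz",
        r2="qc/{sample}_R2.fastq.gz",
    output:
        "assembly/{sample}/contigs.fasta"
    shell:
        # MEGAHIT refuses to write into an existing directory, so assemble
        # into a scratch folder and copy the contigs to the declared output.
        "megahit -1 {input.r1} -2 {input.r2} -o megahit/{wildcards.sample} "
        "&& cp megahit/{wildcards.sample}/final.contigs.fa {output}"
```

Running `snakemake --use-conda --cores 8` would then resolve the dependency graph and execute only out-of-date steps, which is what makes such workflows reproducible and resumable.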
Journal Article
Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants [version 1; peer review: 2 approved]
by Garcia, Maxime; Juhos, Szilveszter; Wirta, Valtteri
in access to information; Analysis workflow; bioinformatics
2020
Whole-genome sequencing (WGS) is a fundamental technology for research to advance precision medicine, but the limited availability of portable and user-friendly workflows for WGS analyses poses a major challenge for many research groups and hampers scientific progress. Here we present Sarek, an open-source workflow to detect germline variants and somatic mutations based on sequencing data from WGS, whole-exome sequencing (WES), or gene panels. Sarek features (i) easy installation, (ii) robust portability across different computer environments, (iii) comprehensive documentation, (iv) transparent and easy-to-read code, and (v) extensive quality metrics reporting. Sarek is implemented in the Nextflow workflow language and supports both Docker and Singularity containers as well as Conda environments, making it easy to deploy on any POSIX-compatible computer and in cloud compute environments. Sarek follows the GATK best-practice recommendations for read alignment and pre-processing, and includes a wide range of software for the identification and annotation of germline and somatic single-nucleotide variants, insertion and deletion variants, structural variants, tumour sample purity, and variations in ploidy and copy number. Sarek offers easy, efficient, and reproducible WGS analyses, and can readily be used both as a production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. The Sarek source code, documentation and installation instructions are freely available at https://github.com/nf-core/sarek and at https://nf-co.re/sarek/.
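As an nf-core pipeline, Sarek is launched through Nextflow. The sketch below shows the standard nf-core launch pattern driven from Python, assuming Nextflow and Docker are installed; the sample sheet and output directory are hypothetical placeholders.

```python
# A minimal sketch of launching nf-core/sarek; assumes Nextflow and Docker
# are installed and on PATH. Paths below are hypothetical placeholders.
import subprocess

subprocess.run(
    [
        "nextflow", "run", "nf-core/sarek",
        "-profile", "docker",          # or "singularity" / "conda"
        "--input", "samplesheet.csv",  # placeholder sample sheet
        "--outdir", "results",         # placeholder output directory
    ],
    check=True,  # raise if the pipeline exits non-zero
)
```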
Journal Article
systemPipeR: NGS workflow and report generation environment
by Backman, Tyler W. H.; Girke, Thomas
in Algorithms; Bioinformatics; Biomedical and Life Sciences
2016
Background
Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data, demanding considerable computational expertise from users. While substantial effort has been invested in the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R/Bioconductor environment into automated workflows capable of running the analysis of most types of NGS applications from start to finish in a time-efficient and reproducible manner.
Results
To address this need, we have developed the R/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq, and Ribo-Seq. Additional workflow templates will be provided in the future.
Conclusions
systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments. systemPipeR is freely available for all common operating systems from Bioconductor (http://bioconductor.org/packages/devel/systemPipeR).
Journal Article
Inferring and analyzing gene regulatory networks from multi-factorial expression data: a complete and interactive suite
by Cassan, Océane; Martin, Antoine; Lèbre, Sophie
in Analysis; Analysis workflow; Animal Genetics and Genomics
2021
Background
High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed that allow a broad range of users, even those with little programming experience, to conduct standard analyses of RNA-seq data. Although existing solutions usually provide adequate procedures for normalization, exploration, or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies.
Results
We developed a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data), designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. DIANE's interactive workflow provides normalization, dimensionality reduction, differential expression, and ontology enrichment. Gene clustering can be performed and explored via configurable mixture models, and random forests are used to infer gene regulatory networks. DIANE also includes a novel procedure, based on permutations, to assess the statistical significance of regulator-target influence measures derived from Random Forest importance metrics. Throughout the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses.
Conclusions
We demonstrate the value and benefits of DIANE using a recently published dataset describing the transcriptional response of Arabidopsis thaliana to combined temperature, drought, and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures on RNA-Seq data, perform model-based clustering of gene expression profiles, and go further into gene network reconstruction, providing relevant candidate genes and signalling pathways to explore. DIANE is available as a web service (https://diane.bpmp.inrae.fr) or can be installed and launched locally as a complete R package.
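The permutation idea behind the significance procedure described in the Results can be sketched in a few lines: refit the forest on shuffled target values to build a null distribution for each importance score. This is a minimal illustration under generic assumptions, not DIANE's implementation; the function name and parameters are hypothetical.

```python
# A minimal sketch of permutation-based significance for Random Forest
# importances (illustrative only; not DIANE's code).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def importance_pvalues(X, y, n_permutations=100, seed=0):
    """Empirical p-values for each regulator's importance for a target gene.

    X: (samples x regulators) expression matrix; y: target gene expression.
    """
    rng = np.random.default_rng(seed)
    rf = RandomForestRegressor(n_estimators=200, random_state=seed).fit(X, y)
    observed = rf.feature_importances_
    null = np.empty((n_permutations, X.shape[1]))
    for b in range(n_permutations):
        rf_b = RandomForestRegressor(n_estimators=200, random_state=b)
        null[b] = rf_b.fit(X, rng.permutation(y)).feature_importances_
    # Fraction of permuted importances at least as large as the observed one,
    # with add-one smoothing so p-values are never exactly zero.
    return (1 + (null >= observed).sum(axis=0)) / (n_permutations + 1)
```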
Journal Article
Workflow analysis of data science code in public GitHub repositories
by Sarasua, Cristina; Bernstein, Abraham; Ramasamy, Dhivyabharathi
in Coding; Data analysis; Data science
2023
Despite the ubiquity of data science, we are far from rigorously understanding how coding in data science is performed. Even though the scientific literature has hinted at the iterative and explorative nature of data science coding, we need further empirical evidence to understand this practice and its workflows in detail. Such understanding is critical to recognise the needs of data scientists and, for instance, inform tooling support. To obtain a deeper understanding of the iterative and explorative nature of data science coding, we analysed 470 Jupyter notebooks publicly available in GitHub repositories. We focused on the extent to which data scientists transition between different types of data science activities, or steps (such as data preprocessing and modelling), as well as the frequency and co-occurrence of such transitions. For our analysis, we developed a dataset with the help of five data science experts, who manually annotated the data science steps for each code cell within the aforementioned 470 notebooks. Using a first-order Markov chain model, we extracted the transitions and analysed the transition probabilities between the different steps. In addition to providing deeper insights into the implementation practices of data science coding, our results provide evidence that the steps in a data science workflow are indeed iterative and reveal specific patterns. We also evaluated the use of the annotated dataset to train machine-learning classifiers to predict the data science step(s) of a given code cell. We investigated the representativeness of the classification by comparing the workflow analysis applied to (a) the predicted dataset and (b) the dataset labelled by experts, finding an F1-score of about 71% for the 10-class data science step prediction problem.
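The core computation, estimating first-order transition probabilities between annotated steps, can be sketched as follows; the step labels and toy sequences are hypothetical, not the study's data.

```python
# A minimal sketch of first-order Markov transition estimation over
# annotated notebook steps (hypothetical labels and sequences).
from collections import Counter
from itertools import pairwise  # Python 3.10+

def transition_probabilities(sequences):
    """Estimate P(next step | current step) from per-notebook label sequences."""
    pair_counts = Counter(t for seq in sequences for t in pairwise(seq))
    totals = Counter()
    for (src, _dst), n in pair_counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in pair_counts.items()}

notebooks = [
    ["load", "preprocess", "model", "preprocess", "model", "evaluate"],
    ["load", "preprocess", "preprocess", "model", "evaluate"],
]
for (src, dst), p in sorted(transition_probabilities(notebooks).items()):
    print(f"{src} -> {dst}: {p:.2f}")  # e.g. preprocess -> model: 0.75
```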
Journal Article
wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data
2020
Background
Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. While many algorithms have been developed for individual tasks such as alignment, comprehensive end-to-end pipelines are still scarce. Furthermore, previous pipelines lack features or show technical deficiencies, thus impeding analyses.
Results
We developed wg-blimp (whole genome bisulfite sequencing methylation analysis pipeline) as an end-to-end pipeline to ease whole genome bisulfite sequencing data analysis. It integrates established algorithms for alignment, quality control, methylation calling, detection of differentially methylated regions, and methylome segmentation, requiring only a reference genome and raw sequencing data as input. Comparing wg-blimp to previous end-to-end pipelines reveals similar setups for common sequence processing tasks, but shows differences for post-alignment analyses. We improve on previous pipelines by providing a more comprehensive analysis workflow as well as an interactive user interface. To demonstrate wg-blimp's ability to produce correct results, we used it to call differentially methylated regions for two publicly available datasets. We were able to replicate 112 of 114 previously published regions, and found results to be consistent with previous findings. We further applied wg-blimp to a publicly available sample of embryonic stem cells to showcase methylome segmentation. As expected, unmethylated regions were in close proximity to transcription start sites. Segmentation results were consistent with previous analyses, despite different reference genomes and sequencing techniques.
Conclusions
wg-blimp provides a comprehensive analysis pipeline for whole genome bisulfite sequencing data as well as a user interface for simplified result inspection. We demonstrated its applicability by analysing multiple publicly available datasets. Thus, wg-blimp is a relevant alternative to previous analysis pipelines and may facilitate future epigenetic research.
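As a generic illustration of the differential methylation testing that such pipelines automate, one candidate region can be tested by comparing pooled methylated/unmethylated read counts between two conditions. This is a minimal sketch, not wg-blimp's actual method (which integrates established DMR callers); the counts are hypothetical.

```python
# A minimal, generic sketch of testing one candidate region for differential
# methylation between two conditions (not wg-blimp's method).
from scipy.stats import fisher_exact

def dmr_test(meth_a, total_a, meth_b, total_b):
    """Fisher's exact test on pooled methylated/unmethylated read counts
    for one region in condition A vs. condition B."""
    table = [
        [meth_a, total_a - meth_a],
        [meth_b, total_b - meth_b],
    ]
    odds_ratio, p_value = fisher_exact(table)
    return odds_ratio, p_value

# Hypothetical pooled counts for one region: 120/200 methylated reads in A,
# 45/180 methylated reads in B.
print(dmr_test(120, 200, 45, 180))
```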
Journal Article
Sensor-based machine learning for workflow detection and as key to detect expert level in laparoscopic suturing and knot-tying
by Garrow, Carly R; Schmidt, Mona W; Kowalewski, Karl-Friedrich
in Accuracy; Algorithms; Artificial intelligence
2019
Introduction
The most common way of assessing surgical performance is for expert raters to view a surgical task and rate a trainee's performance. However, there is huge potential for automated skill assessment and workflow analysis using modern technology. The aim of the present study was to evaluate machine learning (ML) algorithms using data from a Myo armband as a sensor device for skill level assessment and phase detection in laparoscopic training.
Materials and methods
Participants of three experience levels in laparoscopy performed a suturing and knot-tying task on silicone models. Experts rated performance using the Objective Structured Assessment of Technical Skills (OSATS). Participants wore Myo armbands (Thalmic Labs™, Ontario, Canada) to record acceleration, angular velocity, orientation, and Euler orientation. ML algorithms (decision forest, neural networks, boosted decision tree) were compared for skill level assessment and phase detection.
Results
28 participants (8 beginners, 10 intermediates, 10 experts) were included, and 99 knots were available for analysis. A neural network regression model had the lowest mean absolute error in predicting OSATS scores (3.7 ± 0.6 points; r² = 0.03 ± 0.81; OSATS range: 4–37 points). An ensemble of binary-class neural networks yielded the highest accuracy in predicting skill level (beginners: 82.2% correctly identified; intermediates: 3.0%; experts: 79.5%), whereas standard statistical analysis failed to discriminate between skill levels. Phase detection on raw data showed the best results with a multi-class decision jungle (on average 16% correctly identified), but improved to 43% average accuracy with two-class boosted decision trees after applying dynamic time warping (DTW).
Conclusion
Modern machine learning algorithms aid in interpreting complex surgical motion data, even when standard analysis fails. Dynamic time warping offers the potential to process and compare surgical motion data in order to allow automated surgical workflow detection. However, further research is needed to interpret and standardize available data and to improve sensor accuracy.
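Dynamic time warping itself is a standard dynamic-programming alignment that tolerates variable execution speed across motion recordings. The sketch below is a textbook 1-D implementation for illustration, not the study's code.

```python
# A textbook dynamic-time-warping distance for 1-D signals (illustrative
# only; not the study's implementation).
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two 1-D sequences a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of insertion, deletion, or match.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two recordings of the "same" motion at different speeds align cheaply.
print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 1, 2, 1, 0]))  # -> 0.0
```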
Journal Article
A roadmap for implementation of kV‐CBCT online adaptive radiation therapy and initial first year experiences
by Belliveau, Jean-Guy; Fiveash, John A.; Marcrom, Samuel R.
in commissioning; Dose Fractionation, Radiation; Ethos
2023
Purpose
Online Adaptive Radiation Therapy (oART) follows a different treatment paradigm than conventional radiotherapy, and because of this, the resources, implementation, and workflows needed are unique. The purpose of this report is to outline our institution's experience establishing, organizing, and implementing an oART program using the Ethos therapy system.
Methods
We describe the resources used, operational models utilized, and program creation timelines, along with our institutional experience implementing and operating an oART program. Additionally, we provide a detailed summary of our first year's clinical experience, during which we delivered over 1000 daily adaptive fractions. For all treatments, the different stages of online adaptation were analyzed: primary patient set-up, initial kV-CBCT acquisition, contouring review and editing of influencer structures, target review and edits, plan evaluation and selection, Mobius3D second check and adaptive QA, second kV-CBCT for positional verification, treatment delivery, and the patient leaving the room.
Results
We retrospectively analyzed data from 97 patients treated from August 2021 to August 2022. A total of 1677 individual fractions were treated and analyzed; 632 (38%) were non-adaptive and 1045 (62%) were adaptive. Seventy-four of the 97 patients (76%) were treated with standard fractionation and 23 (24%) received stereotactic treatments. For the adaptive treatments, the generated adaptive plan was selected in 92% of treatments. On average (± SD), adaptive sessions took 34.52 ± 11.42 min from start to finish. The adaptive component (from the start of contour generation to the verification CBCT), performed by the physicist (and physician on select days), took 19.84 ± 8.21 min.
Conclusion
We present our institution's experience commissioning an oART program using the Ethos therapy system. It took us 12 months from project inception to the treatment of our first patient, and a further 12 months to treat 1000 adaptive fractions. Retrospective analysis of delivered fractions showed that the average overall treatment time was approximately 35 min, and the average time for the adaptive component of treatment was approximately 20 min.
Journal Article
Deep learning for surgical workflow analysis: a survey of progresses, limitations, and trends
2024
Automatic surgical workflow analysis, which aims to recognize the ongoing surgical events in videos, is fundamental for developing context-aware computer-assisted systems. This paper reviews representative surgical workflow recognition algorithms based on deep learning, outlining their merits, limitations, and future research directions. The literature survey was performed on three large bibliographic databases, covering 67 sources, which were comparatively analyzed in terms of spatial feature modeling, spatio-temporal feature modeling, input pre-processing, regularization and post-processing algorithms, as well as learning strategies. Common public datasets and evaluation metrics for surgical workflow recognition are also described in detail. Finally, we discuss the literature from different perspectives and point out challenges, possible solutions, and future trends. The need for more diverse and larger datasets, the potential of unsupervised and semi-supervised learning approaches, comprehensive and equitable metrics, complete regulatory and data standards, and interoperability will be key challenges in translating models to clinical operating rooms. We also propose surgical activity anticipation and the use of large language models as training assistants as promising research directions in surgical workflow analysis.
Journal Article