Harmonizing and integrating the NCI Genomic Data Commons through accessible, interactive, and cloud-enabled workflows
by Yeung, Ka Yee; Hoang, Varik; Schmitz, Robert; Lloyd, Wes; Hung, Ling-Hong; Fukuda, Bryce
in Bioinformatics / Datasets / DNA sequencing / Genomics / Genotypes / mRNA
2024
Paper
Overview
Cancer data is widely available in repositories such as the National Cancer Institute (NCI) Genomic Data Commons (GDC). These datasets could serve as controls or comparisons in compendium analyses with user data, avoiding the expense and time of generating additional datasets. However, these comparisons are useful only if users can process their new data in the same manner, which can be non-trivial. Although the executables themselves are usually available in repositories, the GDC pipelines that describe the entire analysis workflow are currently published as text-based standard operating procedures (SOPs). It is difficult to document a computational workflow to the level of detail and accuracy required to reproduce the results. Version discrepancies and omitted details accumulate as the documentation inevitably lags behind code revisions. We address this problem by converting the SOPs into a downloadable and executable format. Specifically, we converted the GDC DNA sequencing (DNA-Seq) and the GDC mRNA sequencing (mRNA-Seq) SOPs into reproducible, self-installing, containerized, and interactive graphical workflows. These can be applied to reproducibly process user data and to harmonize datasets across repositories. Using our publicly available graphical workflows, we harmonize raw RNA-Seq datasets from the GDC and the Genotype-Tissue Expression (GTEx) project that were originally processed using different methodologies, illustrating the importance of uniform processing of control and treatment data for accurate inference of differentially expressed genes. By disseminating the analytical methodology in a reproducible and easily executed form, we greatly increase the utility of the GDC by enabling researchers to uniformly process custom data and datasets across multiple repositories to enhance data interpretation. Our approach of making the analytical process as readily available as the data, together with our open-source executable workflows, can be applied to other data repositories to increase their impact on scientific research.
Competing Interest Statement
LHH and KYY have equity interest in Biodepot LLC, which receives compensation from NCI SBIR contract numbers 75N91020C00009 and 75N91021C00022. The terms of this arrangement have been reviewed and approved by the University of Washington in accordance with its policies governing outside work and financial conflicts of interest in research.
Footnotes
* In this revision, we updated the content to reflect the latest data releases from the NCI Genomic Data Commons. We also made our contributions in this work clearer by revising the title, abstract, and introduction. In addition, we re-tested our workflows, cleaned up the GitHub repository, added documentation, and included only the workflows that work.
Publisher
Cold Spring Harbor Laboratory Press, Cold Spring Harbor Laboratory