Predictive Big Data Analytics: A Study of Parkinson’s Disease Using Large, Complex, Heterogeneous, Incongruent, Multi-Source and Incomplete Observations

by Van Horn, John D.; Heavner, Ben; Hood, Leroy; Kesselman, Carl; Tang, Ming; Clark, Kristi; Foster, Ian; Dauer, William; Glusman, Gustavo; Hampstead, Benjamin M.; Toga, Arthur W.; Deutsch, Eric W.; Pa, Judy; Madduri, Ravi; Spino, Cathie; Price, Nathan D.; Chard, Kyle; Darcy, Mike; Ames, Joseph; Dinov, Ivo D.

in Aged / Algorithms / Alzheimer's disease / Alzheimers disease / Amyotrophic lateral sclerosis / Analysis / Analytics / Artificial intelligence / Big Data / Biology / Biology and Life Sciences / Biomarkers / Cerebellum / Classification / Complexity / Computer and Information Sciences / Data analysis / Data management / Data processing / Databases, Factual / Datasets / Demographics / Diagnosis / Diagnostic systems / Disease control / Disease Progression / Female / Forecasting / Genetics / Health risks / Heterogeneity / Humans / Informatics / Information management / Information science / Laboratories / Learning algorithms / Logistic Models / Machine learning / Male / Mathematical models / Medical diagnosis / Medical imaging / Medicine and Health Sciences / Model accuracy / Movement disorders / Neural networks / Neurodegeneration / Neurodegenerative diseases / Neuroimaging / Neurology / NMR / Nuclear magnetic resonance / Nursing schools / Parkinson disease / Parkinson Disease - diagnosis / Parkinson Disease - genetics / Parkinson Disease - pathology / Parkinson's disease / Parkinsons disease / People and Places / Physical Sciences / Predictions / Principal components analysis / Proteins / Research and Analysis Methods / Science / Science & Technology - Other Topics / Statistics / Support Vector Machine / Support vector machines / Trauma

2016

Journal Article

Overview

A unique archive of Big Data on Parkinson's disease (PD) is collected, managed and disseminated by the Parkinson's Progression Markers Initiative (PPMI). Integrating such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of PD risk to trauma, genetics, environment, co-morbidities, or lifestyle. The defining characteristics of Big Data (large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources) all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction.

To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data. Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning-based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning-based classification methods indicated significant power to predict Parkinson's disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson's Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting.

Model-free Big Data machine-learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson's disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine-learning-based classification is over 80%. The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer's, Huntington's, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications.
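
The overview mentions cohort rebalancing, n-fold cross-validation, and reporting of accuracy, sensitivity, and specificity. The following is a minimal Python sketch of those three ideas, not the authors' protocol. Random oversampling stands in for whichever rebalancing method the study used, which this overview does not specify. The function names `oversample_minority`, `kfold_indices`, and `diagnostic_metrics` are hypothetical, as is the coding of the positive (PD) class as `1`.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match
    the largest class in size (an assumed stand-in rebalancing method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def kfold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true-positive rate
        "specificity": ratio(tn, tn + fp),  # true-negative rate
    }
```

In a setup like this, rebalancing would normally be applied to the training indices of each fold only, so that duplicated rows do not leak into the held-out fold and inflate the reported metrics.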

Publisher

Public Library of Science (PLoS)

Subject

Aged

/ Algorithms

/ Alzheimer's disease

/ Alzheimers disease

/ Amyotrophic lateral sclerosis

/ Analysis

/ Analytics