Catalogue Search | MBRL

Master data management and data governance

by Berson, Alex , Dubov, Lawrence in Customer relations Data processing. , Data warehousing. , Data integration (Computer science)

Book

A practical guide for combining data to model species distributions

by Fletcher, Robert J. , Robertson, Ellen P. , Hefley, Trevor J. in Animals , Bias , Biological evolution

2019

Understanding and accurately modeling species distributions lies at the heart of many problems in ecology, evolution, and conservation. Multiple sources of data are increasingly available for modeling species distributions, such as data from citizen science programs, atlases, museums, and planned surveys. Yet reliably combining data sources can be challenging because data sources can vary considerably in their design, gradients covered, and potential sampling biases. We review, synthesize, and illustrate recent developments in combining multiple sources of data for species distribution modeling. We identify five ways in which multiple sources of data are typically combined for modeling species distributions. These approaches vary in their ability to accommodate sampling design, bias, and uncertainty when quantifying environmental relationships in species distribution models. Many of the challenges for combining data are solved through the prudent use of integrated species distribution models: models that simultaneously combine different data sources on species locations to quantify environmental relationships for explaining species distribution. We illustrate these approaches using planned survey data on 24 species of birds coupled with opportunistically collected eBird data in the southeastern United States. This example illustrates some of the benefits of data integration, such as increased precision in environmental relationships, greater predictive accuracy, and accounting for sample bias. Yet it also illustrates challenges of combining data sources with vastly different sampling methodologies and amounts of data. We provide one solution to this challenge through the use of weighted joint likelihoods. Weighted joint likelihoods provide a means to emphasize data sources based on different criteria (e.g., sample size), and we find that weighting improves predictions for all species considered. We conclude by providing practical guidance on combining multiple sources of data for modeling species distributions.

Journal Article

Oracle data integration : tools for harnessing data

by Malcher, Michelle, author , Curtis, Bobby L., author in Oracle (Computer file) , Data integration (Computer science) , Data mining.

"Deliver continuous access to timely and accurate BI across your enterprise using the detailed information in this Oracle Press guide. Through clear explanations and practical examples, a team of Oracle experts shows how to assimilate data from disparate sources into a single, unified view. Find out how to transform data in real time, handle replication and migration, and deploy Oracle Data Integrator and Oracle GoldenGate. Oracle Data Integration : Tools for Harnessing Data offers complete coverage of the latest Big Data hardware and software solutions"--Back cover.

Book

Resolving misaligned spatial data with integrated species distribution models

by Reich, Brian J. , Miller, David A. W. , Pacifici, Krishna in Bias , black‐throated blue warbler , change of support

2019

Advances in species distribution modeling continue to be driven by a need to predict species responses to environmental change coupled with increasing data availability. Recent work has focused on development of methods that integrate multiple streams of data to model species distributions. Combining sources of information increases spatial coverage and can improve accuracy in estimates of species distributions. However, when fusing multiple streams of data, the temporal and spatial resolutions of data sources may be mismatched. This occurs when data sources have fluctuating geographic coverage, varying spatial scales and resolutions, and differing sources of bias and sparsity. It is well documented in the spatial statistics literature that ignoring the misalignment of different data sources will result in bias in both the point estimates and uncertainty. This will ultimately lead to inaccurate predictions of species distributions. Here, we examine the issue of misaligned data as it relates specifically to integrated species distribution models. We then provide a general solution that builds off work in the statistical literature for the change-of-support problem. Specifically, we leverage spatial correlation and repeat observations at multiple scales to make statistically valid predictions at the ecologically relevant scale of inference. An added feature of the approach is that addressing differences in spatial resolution between data sets can allow for the evaluation and calibration of lesser-quality sources in many instances. Using both simulations and data examples, we highlight the utility of this modeling approach and the consequences of not reconciling misaligned spatial data. We conclude with a brief discussion of the upcoming challenges and obstacles for species distribution modeling via data fusion.

Journal Article

Network medicine : complex systems in human disease and therapeutics

by Loscalzo, Joseph, editor , Barabási, Albert-László, editor , Silverman, Edwin K., editor in Medical informatics. , Data integration (Computer science) , Diseases Causes and theories of causation Data processing.

"Network medicine, a new field which developed from the application of systems biology approaches to human disease, embraces the complexity of multifactorial influences on disease, which can be driven by non-linear effects and molecular and statistical interactions. The development of comprehensive and affordable Omics platforms provides the data types for network medicine, and graph theory and statistical physics provide the theoretical framework to analyze networks. While network medicine provides a fundamentally different approach to understanding disease etiology, it will also lead to key differences in how diseases are treated--with multiple molecular targets that may require manipulation in a coordinated, dynamic fashion. Much remains to be learned regarding the optimal approaches to integrate different Omics data types and to perform network analyses; this book provides an overview of the progress that has been made and the challenges that remain."-- Provided by publisher

Book

Disentangling data discrepancies with integrated population models

by Saunders, Sarah P. , Zipkin, Elise F. , Rossman, Sam in American Woodcock , Animals , Animals, Wild

2019

A common challenge for studying wildlife populations occurs when different survey methods provide inconsistent or incomplete inference on the trend, dynamics, or viability of a population. A potential solution to the challenge of conflicting or piecemeal data relies on the integration of multiple data types into a unified modeling framework, such as integrated population models (IPMs). IPMs are a powerful approach for species that inhabit spatially and seasonally complex environments. We provide guidance on exploiting the capabilities of IPMs to address inferential discrepancies that stem from spatiotemporal data mismatches. We illustrate this issue with analysis of a migratory species, the American Woodcock (Scolopax minor), in which individual monitoring programs suggest differing population trends. To address this discrepancy, we synthesized several long-term data sets (1963–2015) within an IPM to estimate continental-scale population trends, and link dynamic drivers across the full annual cycle and complete extent of the woodcock’s geographic range in eastern North America. Our analysis reveals the limiting portions of the life cycle by identifying time periods and regions where vital rates are lowest and most variable, as well as which demographic parameters constitute the main drivers of population change. We conclude by providing recommendations for resolving conflicting population estimates within an integrated modeling approach, and discuss how strategies (e.g., data thinning, expert opinion elicitation) from other disciplines could be incorporated into ecological analyses when attempting to combine multiple, incongruent data types.

Journal Article

In-memory data management : an inflection point for enterprise applications

by Plattner, Hasso, 1944- , Zeier, Alexander, 1969- in Database management. , Enterprise application integration (Computer systems) , Business Data processing.

Book

Integrating omics datasets with the OmicsPLS package

by Houwing-Duistermaat, Jeanine , Kiełbasa, Szymon M. , Uh, Hae-Won in Algorithms , Analysis , Bioinformatics

2018

Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS. Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and faster alternative cross-validation methods are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").

Journal Article

Analytics in healthcare : an introduction

by Gensinger, Raymond A., Jr., editor , Healthcare Information and Management Systems Society, issuing body in Health services administration Data processing. , Delivery of Health Care organization & administration. , Information Storage and Retrieval methods.

Book

The Role of Snowmelt and Subsurface Heterogeneity in Headwater Hydrology of a Mountainous Catchment in Colorado: A Model‐Data Integration Approach

by Ulrich, Craig , Uhlemann, Sebastian , Dafflon, Baptiste in Base flow , Coniferous forests , Creeks & streams

2025

Mountainous headwater streams are sustained by both snowmelt‐driven streamflow and groundwater discharge in the Upper Colorado River Basin. However, predicting headwater stream discharge magnitude and peak flow timing is challenging in mountainous terrains, where snowmelt rates vary with vegetation type and elevation, and heterogeneous subsurface physical properties influence groundwater storage and its release. We used a model‐data integration approach to investigate the roles of snowmelt and subsurface structure in stream discharge and groundwater level. We ran an ensemble of 100 integrated surface‐subsurface hydrologic models for a mountainous headwater catchment near Crested Butte, Colorado, USA. We also evaluated and calibrated these models against observed data sets, including snow depth measurements using distributed temperature probes, stream discharge, and groundwater levels. Calibration with multiple data sources using neural density estimators has further constrained uncertainty in subsurface properties and snowmelt rates. Results indicated that observed slower snowmelt rates in evergreen forests delayed the peak flow and baseflow onset. In upstream areas with lower subsurface permeability, water was stored within the subsurface but was not released as interflow or shallow groundwater flow, and thereby not contributing to downstream streamflow during recession limb periods. Double peaks in groundwater occurred in areas with spatial subsurface heterogeneity, in our case due to the contrast between granodiorite and Mancos shale. These process‐based insights into groundwater and snowmelt dynamics in mountainous headwaters will help improve predictions of headwater hydrology.

Journal Article
