Catalogue Search | MBRL
15,042 result(s) for "DATA REQUIREMENTS"
Disruptive analytics : charting your strategy for next-generation business analytics
Learn all you need to know about seven key innovations disrupting business analytics today. These innovations (the open source business model, cloud analytics, the Hadoop ecosystem, Spark and in-memory analytics, streaming analytics, deep learning, and self-service analytics) are radically changing how businesses use data for competitive advantage. Taken together, they are disrupting the business analytics value chain and creating new opportunities. Enterprises who seize the opportunity will thrive and prosper, while others will struggle and decline: disrupt or be disrupted. Disruptive Analytics provides strategies to profit from disruption. It shows you how to organize for insight, how to build and provision an open source stack, how to practice lean data warehousing, and how to assimilate disruptive innovations into an organization. Through a short history of business analytics and a detailed survey of products and services, analytics authority Thomas W. Dinsmore provides a practical explanation of the most compelling innovations available today. -- Provided by publisher.
Identifying and managing data quality requirements: a design science study in the field of automated driving
by Knauss, Eric; Pradhan, Shameer Kumar; Heyn, Hans-Martin
in Advanced driver assistance systems; Autonomous vehicles; Data
2024
Good data quality is crucial for any data-driven system's effective and safe operation. For safety-critical systems, the significance of data quality is even higher, since incorrect or low-quality data may cause fatal faults. However, there are challenges in identifying and managing data quality. In particular, there is no accepted process to define and continuously test data quality against what is necessary for operating the system. This gap is problematic because even safety-critical systems are becoming increasingly dependent on data. Here, we propose a Candidate Framework for Data Quality Assessment and Maintenance (CaFDaQAM), based on design science research, to systematically manage data quality and related requirements. The framework was constructed through an advanced driver assistance system (ADAS) case study, drawing on empirical data from a literature review, focus groups, and design workshops. The proposed framework consists of four components: a Data Quality Workflow, a List of Data Quality Challenges, a List of Data Quality Attributes, and Solution Candidates. Together, the components act as tools for data quality assessment and maintenance. The candidate framework and its components were validated in a focus group.
Journal Article
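The List of Data Quality Attributes at the heart of CaFDaQAM invites a concrete illustration. The Python sketch below checks a batch of ADAS sensor records against three commonly cited quality attributes (completeness, validity, timeliness); the record fields, attribute choices, and thresholds are illustrative assumptions, not the paper's actual lists.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ADAS sensor record; the fields are illustrative, not from the paper.
@dataclass
class SensorRecord:
    timestamp_s: float
    speed_mps: Optional[float]   # ego speed, metres per second
    range_m: Optional[float]     # distance to lead vehicle, metres

def check_quality(records, now_s, max_age_s=1.0):
    """Evaluate three illustrative data quality attributes per record:
    completeness (no missing fields), validity (plausible value ranges),
    and timeliness (record is fresh enough to act on)."""
    report = []
    for r in records:
        complete = r.speed_mps is not None and r.range_m is not None
        valid = complete and 0.0 <= r.speed_mps <= 70.0 and 0.0 <= r.range_m <= 250.0
        timely = (now_s - r.timestamp_s) <= max_age_s
        report.append({"complete": complete, "valid": valid, "timely": timely})
    return report

records = [SensorRecord(9.8, 22.4, 31.0), SensorRecord(9.9, None, 28.5)]
print(check_quality(records, now_s=10.0))
# [{'complete': True, 'valid': True, 'timely': True},
#  {'complete': False, 'valid': False, 'timely': True}]
```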
Evaluating the data quality of iNaturalist termite records
2020
Citizen science (CS) contributes to the knowledge about species distributions, which is a critical foundation in the studies of invasive species, biological conservation, and response to climatic change. In this study, we assessed the value of CS for termites worldwide. First, we compared the abundance and species diversity of geo-tagged termite records in iNaturalist to those of the University of Florida termite collection (UFTC) and the Global Biodiversity Information Facility (GBIF). Second, we quantified how the combination of these data sources affected the number of genera that satisfy data requirements for ecological niche modeling. Third, we assessed the taxonomic correctness of iNaturalist termite records in the Americas at the genus and family level through expert review based on photo identification. Results showed that iNaturalist records were less abundant than those in the UFTC and in GBIF, although they complemented the latter two in selected world regions. A combination of GBIF and the UFTC led to a significant increase in the number of termite genera satisfying the abundance criterion for niche modeling compared to either of those two sources alone, whereas adding iNaturalist observations as a third source had only a moderate effect on the number of termite genera satisfying that criterion. Although research grade observations in iNaturalist require a community-supported and agreed-upon identification (ID) below the family taxonomic rank, our results indicated that iNaturalist data do not exhibit a higher taxonomic classification accuracy when they are designated research grade. This means that non-research grade observations can be used to more completely map termite presence in certain geographic regions without significantly jeopardizing data quality. We concluded that CS termite observation records can, to some extent, complement expert termite collections in terms of geographic coverage and species diversity. Based on recent contribution patterns in CS data, the role of CS termite contributions is expected to grow significantly in the near future.
Journal Article
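The abundance criterion discussed above (a minimum number of records per genus before that genus qualifies for ecological niche modeling) reduces to a simple aggregation over the combined sources. The Python sketch below illustrates the idea; the record layout and the threshold are invented for illustration and are not taken from the study.

```python
from collections import Counter

# Illustrative geo-tagged records as (genus, source) pairs; the data are made up.
records = [
    ("Coptotermes", "GBIF"), ("Coptotermes", "UFTC"), ("Coptotermes", "iNaturalist"),
    ("Reticulitermes", "GBIF"), ("Nasutitermes", "UFTC"), ("Nasutitermes", "GBIF"),
]

def genera_meeting_criterion(records, sources, min_records):
    """Count records per genus across the chosen sources and return the
    genera with at least `min_records` occurrences (assumed threshold)."""
    counts = Counter(genus for genus, src in records if src in sources)
    return {g for g, n in counts.items() if n >= min_records}

# Combining sources can only grow the per-genus counts, which is why
# GBIF + UFTC qualified more genera than either source alone.
print(genera_meeting_criterion(records, {"GBIF", "UFTC"}, min_records=2))
# {'Coptotermes', 'Nasutitermes'}
```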
Recommendations on benchmarks for numerical air quality model applications in China – Part 1: PM2.5 and chemical species
2021
Numerical air quality models (AQMs) have been applied more frequently over the past decade to address diverse scientific and regulatory issues associated with deteriorated air quality in China. Thorough evaluation of a model's ability to replicate monitored conditions (i.e., a model performance evaluation or MPE) helps to illuminate the robustness and reliability of the baseline modeling results and subsequent analyses. However, with numerous input data requirements, diverse model configurations, and the scientific evolution of the models themselves, no two AQM applications are the same, and their performance results should be expected to differ. MPE procedures have been developed for Europe and North America, but there is currently no uniform set of MPE procedures and associated benchmarks for China. Here we present an extensive review of model performance for fine particulate matter (PM2.5) AQM applications in China and, from this context, propose a set of statistical benchmarks that can be used to objectively evaluate model performance for PM2.5 AQM applications in China. We compiled MPE results from 307 peer-reviewed articles published between 2006 and 2019, which applied five of the most frequently used AQMs in China. We analyze influences on the range of reported statistics from different model configurations, including modeling regions and seasons, spatial resolution of modeling grids, and temporal resolution of the MPE, among other factors. Analysis using a random forest method shows that the choices of emission inventory, grid resolution, and aerosol- and gas-phase chemistry are the top three factors affecting model performance for PM2.5. We propose benchmarks for six frequently used evaluation metrics for AQM applications in China, organized in two tiers, "goals" and "criteria", where goals represent the best model performance that a model is currently expected to achieve and criteria represent the model performance that the majority of studies can meet. Our results form a benchmark framework for the modeling performance of PM2.5 and its chemical species in China. For instance, to meet the goal and criteria respectively, the normalized mean bias (NMB) for total PM2.5 should be within ±10 % and ±20 %, while the normalized mean error (NME) should be within 35 % and 45 %. The goal and criteria values of correlation coefficients for evaluating hourly and daily PM2.5 are 0.70 and 0.60, respectively; corresponding values are higher when the index of agreement (IOA) is used (0.80 for the goal and 0.70 for the criteria). Results from this study will support the ever-growing modeling community in China by providing a more objective assessment of, and context for, how well their results compare with previous studies, and by helping them demonstrate the credibility and robustness of their AQM applications prior to subsequent regulatory assessments.
Journal Article
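The benchmark check described above is straightforward to reproduce from the standard metric definitions. The Python sketch below implements NMB, NME, and IOA and grades a PM2.5 series against the goal/criteria values quoted in the abstract; the data are invented, and the tiering logic is a plain reading of the abstract rather than the authors' evaluation code.

```python
import numpy as np

def nmb(obs, mod):
    """Normalized mean bias: sum(M - O) / sum(O)."""
    return np.sum(mod - obs) / np.sum(obs)

def nme(obs, mod):
    """Normalized mean error: sum(|M - O|) / sum(O)."""
    return np.sum(np.abs(mod - obs)) / np.sum(obs)

def ioa(obs, mod):
    """Willmott index of agreement:
    1 - sum((M-O)^2) / sum((|M - mean(O)| + |O - mean(O)|)^2)."""
    ob = obs.mean()
    return 1.0 - np.sum((mod - obs) ** 2) / np.sum((np.abs(mod - ob) + np.abs(obs - ob)) ** 2)

# Benchmarks for total PM2.5 quoted in the abstract: (goal, criteria).
BENCHMARKS = {"NMB": (0.10, 0.20), "NME": (0.35, 0.45), "IOA": (0.80, 0.70)}

def tier(metric, value):
    goal, crit = BENCHMARKS[metric]
    if metric == "IOA":   # higher is better for IOA
        return "goal" if value >= goal else "criteria" if value >= crit else "fail"
    v = abs(value)        # NMB/NME benchmarks bound the magnitude
    return "goal" if v <= goal else "criteria" if v <= crit else "fail"

obs = np.array([35.0, 48.0, 60.0, 42.0])   # made-up daily PM2.5, ug/m3
mod = np.array([30.0, 52.0, 55.0, 47.0])
for name, fn in [("NMB", nmb), ("NME", nme), ("IOA", ioa)]:
    val = fn(obs, mod)
    print(f"{name} = {val:+.3f} -> {tier(name, val)}")
```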
The Land Use Model Intercomparison Project (LUMIP) contribution to CMIP6: rationale and experimental design
by Seneviratne, Sonia I; Jones, Chris D; Jones, Andrew D
in 20th century; Atmospheric models; Biogeochemical cycles
2016
Human land-use activities have resulted in large changes to the Earth's surface, with resulting implications for climate. In the future, land-use activities are likely to expand and intensify further to meet growing demands for food, fiber, and energy. The Land Use Model Intercomparison Project (LUMIP) aims to further advance understanding of the impacts of land-use and land-cover change (LULCC) on climate, specifically addressing the following questions: (1) What are the effects of LULCC on climate and biogeochemical cycling (past-future)? (2) What are the impacts of land management on surface fluxes of carbon, water, and energy, and are there regional land-management strategies with the promise to help mitigate climate change? In addressing these questions, LUMIP will also address a range of more detailed science questions to probe process-level attribution, uncertainty, data requirements, and other related issues in more depth and sophistication than has been possible in a multi-model context to date. There will be particular focus on the separation and quantification of the effects on climate from LULCC relative to all forcings, separation of biogeochemical from biogeophysical effects of land use, the unique impacts of land-cover change vs. land-management change, modulation of land-use impact on climate by land-atmosphere coupling strength, and the extent to which impacts of enhanced CO2 concentrations on plant photosynthesis are modulated by past and future land use. LUMIP involves three major sets of science activities: (1) development of an updated and expanded historical and future land-use data set, (2) an experimental protocol for specific LUMIP experiments for CMIP6, and (3) definition of metrics and diagnostic protocols that quantify model performance, and related sensitivities, with respect to LULCC. In this paper, we describe LUMIP activity (2), i.e., the LUMIP simulations that will formally be part of CMIP6. These experiments are explicitly designed to be complementary to simulations requested in the CMIP6 DECK and historical simulations and in other CMIP6 MIPs including ScenarioMIP, C4MIP, LS3MIP, and DAMIP. LUMIP includes a two-phase experimental design. Phase one features idealized coupled and land-only model simulations designed to advance process-level understanding of LULCC impacts on climate, as well as to quantify model sensitivity to potential land-cover and land-use change. Phase two experiments focus on quantification of the historic impact of land use and the potential for future land-management decisions to aid in mitigation of climate change. This paper documents these simulations in detail, explains their rationale, outlines plans for analysis, and describes a new subgrid land-use tile data request for selected variables (reporting model output data separately for primary and secondary land, crops, pasture, and urban land-use types). It is essential that modeling groups participating in LUMIP adhere to the experimental design as closely as possible and clearly report how the model experiments were executed.
Journal Article
Status and future of numerical atmospheric aerosol prediction with a focus on data requirements
by Benedetti, Angela; Laj, Paolo; Wiedensohler, Alfred
in Aerosol chemistry; Aerosol observations; Aerosol particles
2018
Numerical prediction of aerosol particle properties has become an important activity at many research and operational weather centers. This development is due to growing interest from a diverse set of stakeholders, such as air quality regulatory bodies, aviation and military authorities, solar energy plant managers, climate services providers, and health professionals. Owing to the complexity of atmospheric aerosol processes and their sensitivity to the underlying meteorological conditions, the prediction of aerosol particle concentrations and properties in the numerical weather prediction (NWP) framework faces a number of challenges. The modeling of numerous aerosol-related parameters increases computational expense. Errors in aerosol prediction concern all processes involved in the aerosol life cycle, including (a) errors in the source terms (for both anthropogenic and natural emissions), (b) errors directly dependent on the meteorology (e.g., mixing, transport, scavenging by precipitation), and (c) errors related to aerosol chemistry (e.g., nucleation, gas-aerosol partitioning, chemical transformation and growth, hygroscopicity). Finally, there are fundamental uncertainties and significant processing overhead in the diverse observations used for verification and assimilation within these systems. Indeed, a significant component of aerosol forecast development consists of streamlining aerosol-related observations and reducing the most important errors through model development and data assimilation. Aerosol particle observations from satellite- and ground-based platforms have been crucial in guiding model development in recent years and have been made more readily available for model evaluation and assimilation. However, for the sustainability of aerosol particle prediction activities around the globe, it is crucial that quality aerosol observations continue to be made available from different platforms (space, near surface, and aircraft) and freely shared. This paper reviews current requirements for aerosol observations in the context of the operational activities carried out at various global and regional centers. While some of the requirements apply equally to aerosol-climate modeling, the focus here is on global operational prediction of aerosol properties such as mass concentrations and optical parameters. It is also recognized that the term "requirements" is used loosely here, given the diversity of global aerosol observing systems and the fact that the data utilized are typically not from operational sources. Most operational models are based on bulk schemes that do not predict the size distribution of the aerosol particles. Others are based on a mix of "bin" and bulk schemes with limited capability of simulating the size information. However, the next generation of aerosol operational models will output both mass and number density concentration to provide a more complete description of the aerosol population. A brief overview of the state of the art is provided, with an introduction on the importance of aerosol prediction activities. The criteria on which the requirements for aerosol observations are based are also outlined. Assimilation and evaluation aspects are discussed from the perspective of the user requirements.
Journal Article
Derived Optimal Linear Combination Evapotranspiration (DOLCE): a global gridded synthesis ET estimate
2018
Accurate global gridded estimates of evapotranspiration (ET) are key to understanding water and energy budgets, in addition to being required for model evaluation. Several gridded ET products have already been developed which differ in their data requirements, the approaches used to derive them, and their estimates, yet it is not clear which provides the most reliable estimates. This paper presents a new global ET dataset and associated uncertainty with monthly temporal resolution for 2000-2009. Six existing gridded ET products are combined using a weighting approach trained by observational datasets from 159 FLUXNET sites. The weighting method is based on a technique that provides an analytically optimal linear combination of ET products compared to site data and accounts for both the performance differences and error covariance between the participating ET products. We examine the performance of the weighting approach in several in-sample and out-of-sample tests that confirm that point-based estimates from flux towers provide information at the grid scale of these products. We also provide evidence that the weighted product performs better than its six constituent ET product members on four common metrics. Uncertainty in the ET estimate is derived by rescaling the spread of the participating ET products so that it reflects the ability of the weighted mean estimate to match flux tower data. While issues in observational data and any common biases in the participating ET datasets limit the success of this approach, future datasets can easily be incorporated to enhance the derived product.
Journal Article
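The analytically optimal linear combination referred to above is, at its core, a bias-corrected weighted mean whose weights minimize the mean squared error against site observations, subject to the weights summing to one; this has a closed form involving the inverse of the inter-product error covariance matrix. The Python sketch below demonstrates that general technique on synthetic data; it is not DOLCE's actual implementation, and the bias handling in particular is simplified.

```python
import numpy as np

def optimal_weights(products, obs):
    """MSE-optimal combination weights under the sum-to-one constraint:
    w = A^-1 1 / (1^T A^-1 1), where A is the error covariance matrix
    of the products against the observations (np.cov centers the errors)."""
    errors = products - obs            # shape (n_products, n_samples)
    a = np.cov(errors)                 # inter-product error covariance
    ones = np.ones(a.shape[0])
    w = np.linalg.solve(a, ones)
    return w / w.sum()

rng = np.random.default_rng(0)
truth = rng.normal(2.0, 0.5, size=200)   # synthetic "flux tower" ET series
# Three synthetic products sharing a bias of 0.3 but with different noise levels.
products = np.stack([truth + rng.normal(0.3, s, size=200) for s in (0.2, 0.4, 0.6)])

w = optimal_weights(products, truth)
# Bias-correct each product (here using the training truth), then blend.
bias = (products - truth).mean(axis=1, keepdims=True)
blended = w @ (products - bias)
print("weights:", np.round(w, 3))   # the low-noise product dominates
print("RMSE best single:", np.sqrt(((products - truth) ** 2).mean(axis=1)).min().round(3))
print("RMSE blended:    ", np.sqrt(((blended - truth) ** 2).mean()).round(3))
```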
Model Inputs and Data Requirements for Process‐Based Stream Temperature Modeling in Regulated Peri‐Alpine Rivers
by Dorthe, David; Pfister, Michael; Lane, Stuart N
in Calibration; Climate change; Creeks & streams
2025
Regulated rivers can experience sharp temperature variations induced by intermittent hydropower production (thermopeaking). To mitigate ecological impacts, dam operators need to assess the impacts of hydropeaking on stream temperature and to test scenarios that might reduce them. While stream temperature modeling has been investigated in numerous studies, few have systematically assessed how the processes included and their representation affect model performance, and developing models capable of capturing both sub-hourly variations and long-term thermal dynamics remains a challenge. Herein, a stream temperature model within the HEC-RAS platform was used to model the thermal regime of a regulated river in Switzerland, with a 10-min timestep, over the annual time-scale, and for a 22-km-long reach along which we had installed a network of stream temperature sensors. While the initial model demonstrated acceptable performance at the yearly scale (Mean Absolute Error: 0.78-2.10°C and Kling-Gupta Efficiency: 0.55-0.85), this was not the case at the daily or seasonal time-scales. Two model corrections were found to be crucial: (a) the correction of potential incoming solar radiation for local shading, and (b) the representation of the heat flux linked to water-sediment exchanges. With these two corrections, the annual performance improved (MAE: 0.48-0.83°C and KGE: 0.85-0.93), as did the daily and seasonal performance. Although physically based, the model required calibration, underscoring the importance of high-quality in situ temperature data. The resulting model proves effective for practical applications in hydropower mitigation and river temperature management under complex flow regimes.
Journal Article
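The two scores used to judge the model above, Mean Absolute Error (MAE) and Kling-Gupta Efficiency (KGE), have standard definitions that are easy to restate in code. The Python sketch below computes both for a pair of observed and simulated temperature series; the numbers are invented, and the functions follow the textbook formulas rather than the authors' toolchain.

```python
import numpy as np

def mae(obs, sim):
    """Mean absolute error, in the units of the series (here deg C)."""
    return np.mean(np.abs(sim - obs))

def kge(obs, sim):
    """Kling-Gupta Efficiency (Gupta et al., 2009):
    KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2),
    with r the correlation, alpha the std ratio, beta the mean ratio."""
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Invented 10-min stream temperatures (deg C) around a thermopeaking event.
obs = np.array([8.2, 8.1, 8.3, 9.4, 10.1, 9.0, 8.5, 8.4])
sim = np.array([8.0, 8.2, 8.5, 9.0, 9.8, 9.2, 8.6, 8.3])
print(f"MAE = {mae(obs, sim):.2f} degC, KGE = {kge(obs, sim):.2f}")
```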
Sorting out assortativity: When can we assess the contributions of different population groups to epidemic transmission?
by Geismar, Cyril; Jombart, Thibaut; Cori, Anne
in Biology and life sciences; Computer Simulation; Control
2024
Characterising the transmission dynamics between various population groups is critical for implementing effective outbreak control measures whilst minimising financial costs and societal disruption. While recent technological and methodological advances have made individual-level transmission chain data increasingly available, it remains unclear how effectively this data can inform group-level transmission patterns, particularly in small, rapidly saturating outbreak settings. We introduce a novel framework that leverages transmission chain data to estimate group transmission assortativity; this quantifies the extent to which individuals transmit within their own group compared to others. Through extensive simulations mimicking nosocomial outbreaks, we assessed the conditions under which our estimator performs effectively and established guidelines for minimal data requirements in small outbreak settings where saturation may occur rapidly. Notably, we demonstrate that detecting and quantifying transmission assortativity is most reliable when at least 30 cases have been observed in each group, before reaching their respective epidemic peaks.
Journal Article
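A transmission chain reduces to (infector group, infectee group) pairs, and the simplest measure of assortativity is the fraction of each group's onward transmissions that stay within the group. The Python sketch below computes that naive fraction on invented data; the paper's estimator is more sophisticated (it must account for group sizes and the rapid saturation the abstract mentions), so this is only a starting point.

```python
from collections import Counter

# Invented transmission pairs: (infector's group, infectee's group).
pairs = [
    ("patients", "patients"), ("patients", "staff"), ("staff", "staff"),
    ("staff", "staff"), ("staff", "patients"), ("patients", "patients"),
]

def within_group_fraction(pairs):
    """For each source group, the fraction of its onward transmissions that
    stayed within the group: 1.0 is fully assortative, values near zero are
    disassortative. (Naive: unlike the paper's estimator, this ignores
    group sizes and depletion of susceptibles.)"""
    total, within = Counter(), Counter()
    for src, dst in pairs:
        total[src] += 1
        within[src] += src == dst
    return {g: within[g] / total[g] for g in total}

print(within_group_fraction(pairs))
# {'patients': 0.666..., 'staff': 0.666...}
```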
Technical note: Hydrology modelling R packages – a unified analysis of models and practicalities from a user perspective
by Thirel, Guillaume; Brauer, Claudia C.; Buytaert, Wouter
in Archives & records; Climate change; Comparative analysis
2021
Following the rise of R as a scientific programming language, the increasing requirement for more transferable research, and the growth of data availability in hydrology, R packages containing hydrological models have become more and more available as an open-source resource to hydrologists. Because these packages correspond to the core of the hydrological study workflow, their reliability is increasingly important to the trustworthiness of methods and results. Despite package and model distinctiveness, no study has yet provided a comparison of R packages for conceptual rainfall-runoff modelling from a user perspective by contrasting their philosophy, model characteristics, and ease of use. We selected eight packages based on our ability to consistently run their models on simple hydrology modelling examples. We uniformly analysed the exact structure of seven of the hydrological models integrated into these R packages in terms of conceptual storages and fluxes, spatial discretisation, data requirements, and output provided. The analysis showed that very different modelling choices are associated with these packages, which emphasise various hydrological concepts. These specificities are not always sufficiently well explained by the package documentation. Therefore, a synthesis of the package functionalities was performed from a user perspective. This synthesis helps to inform the selection of which packages could or should be used depending on the problem at hand. In this regard, the technical features, documentation, R implementations, and computational times were investigated. Moreover, by providing a framework for package comparison, this study is a step forward towards supporting more transferable and reusable methods and results for hydrological modelling in R.
Journal Article
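The "conceptual storages and fluxes" along which the paper compares models can be illustrated with the smallest member of the model class: a single-bucket, linear-reservoir rainfall-runoff model. The sketch below is written in Python to keep this page's examples in one language; it is a generic illustration of the model class, not a port of any of the eight R packages.

```python
def bucket_model(precip, pet, s0=50.0, smax=200.0, k=0.05):
    """Single-storage conceptual rainfall-runoff model (illustrative only).
    The storage fills with precipitation, loses evapotranspiration capped by
    availability, spills when above capacity, and drains as a linear
    reservoir: q = k * s. Units: mm per timestep."""
    s, flows = s0, []
    for p, e in zip(precip, pet):
        s += p
        s -= min(e, s)            # actual ET limited by available storage
        spill = max(0.0, s - smax)
        s -= spill                # saturation excess overflow
        q = k * s                 # linear-reservoir baseflow
        s -= q
        flows.append(q + spill)
    return flows

precip = [0, 12, 30, 5, 0, 0, 18, 2]   # invented daily forcing, mm
pet    = [2, 2, 1, 2, 3, 3, 2, 2]
print([round(q, 1) for q in bucket_model(precip, pet)])
```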