2,041 result(s) for "Data-driven modelling"
Fleet‐Based Degradation State Quantification for Industrial Water Electrolyzers
A reliable and continuous assessment of the degradation state of industrial water electrolyzers is crucial for maintenance planning and dispatch optimization, thus facilitating risk management for both suppliers and operators. Although voltage is a widely used and easily measurable degradation indicator, its effectiveness is compromised in industrial settings due to the impact of arbitrary operating conditions. Existing methods to correct the impact of operating conditions often rely on measuring characteristic curves, which typically only provide a single‐dimensional correction and do not allow varying corrections over time. We propose a data‐driven method for degradation state quantification that adjusts the measured voltage under arbitrary operating conditions to a reference condition, using an empirical voltage model and degradation history from a fleet of electrolyzers. This method involves fitting the empirical voltage model for each time series segment and calculating the voltage under the reference condition. To assist model fitting under limited data coverage, the method utilizes a Bayesian approach to incorporate fleet knowledge, an aggregation of the degradation trajectories of the electrolyzer fleet. This method was validated using both synthetic data and operation data from 12 industrial electrolyzers with 1–3 years of operation history, including in‐depth sensitivity analyses on the data coverage, fleet–target discrepancy, and fleet size. The results demonstrated the superiority of the proposed fleet‐based method over a benchmark method that does not use fleet knowledge.
Harnessing Data for Flood Prediction: A Systematic Review of Forecasting Techniques and Information Sources
Flood forecasting has undergone significant development over the past two decades, with researchers exploring a range of innovative approaches to quantify both the hazards and risks associated with flood events across a range of magnitudes and frequencies. Recent advances in computing power and data availability have enhanced prediction capabilities, shifting the field towards more sophisticated, data‐driven techniques. This paper reviews these methodologies, tracing their evolution from classical models to modern data‐driven approaches, while highlighting recent progress in data sources and data integration. Our bibliometric analysis indicates that the number of data‐driven flood modelling approaches reported in the literature has increased by approximately fourfold since 2015, underscoring the accelerating trend in flood forecasting research. The focus is on data‐centric techniques, innovations, and ongoing challenges, with special emphasis on unconventional data sources (e.g., commercial microwave links) and their role in flood forecasting. By reviewing traditional and cutting‐edge methods, the discussion identifies underexplored gaps and emerging opportunities within the existing literature, highlighting the trajectory that this fast‐emerging field is following.
Analog Forecasting of Extreme‐Causing Weather Patterns Using Deep Learning
Numerical weather prediction models require ever‐growing computing time and resources, yet they still sometimes have difficulty predicting weather extremes. We introduce a data‐driven framework that is based on analog forecasting (prediction using past similar patterns) and employs a novel deep learning pattern‐recognition technique (capsule neural networks, CapsNets) and an impact‐based autolabeling strategy. Using data from a large‐ensemble fully coupled Earth system model, CapsNets are trained on midtropospheric large‐scale circulation patterns (Z500) labeled 0–4 depending on the existence and geographical region of surface temperature extremes over North America several days ahead. The trained networks predict the occurrence/region of cold or heat waves, only using Z500, with accuracies (recalls) of 69–45% (77–48%) or 62–41% (73–47%) 1–5 days ahead. Using both surface temperature and Z500, accuracies (recalls) with CapsNets increase to ∼80% (88%). In both cases, CapsNets outperform simpler techniques such as convolutional neural networks and logistic regression, and their accuracy is least affected as the size of the training set is reduced. The results show the promise of multivariate data‐driven frameworks for accurate and fast extreme weather predictions, which can potentially augment numerical weather prediction efforts in providing early warnings.
Key Points:
  • A data‐driven extreme weather prediction framework based on analog forecasting and deep learning pattern‐recognition methods is proposed
  • Extreme surface temperature events over North America are skillfully predicted using only midtropospheric large‐scale circulation patterns
  • More advanced deep learning methods are found to yield better forecasts, encouraging novel methods tailored for climate/weather data
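The idea of analog forecasting described in this abstract (prediction using past similar patterns) can be illustrated with a minimal nearest-neighbour sketch. This is not the CapsNet method of the paper: the function name, the toy pattern vectors, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def nearest_analog_forecast(query, past_patterns, past_outcomes):
    """Return the outcome that followed the past pattern closest to `query`.

    Hypothetical helper: patterns are flat lists of floats and similarity is
    plain Euclidean distance, which is an assumption of this sketch.
    """
    if not past_patterns or len(past_patterns) != len(past_outcomes):
        raise ValueError("need one outcome per past pattern")
    best = min(
        range(len(past_patterns)),
        key=lambda i: math.dist(query, past_patterns[i]),
    )
    return past_outcomes[best]


# Toy library of past circulation patterns and the label observed afterwards
# (labels 0-4 mirror the labelling scheme mentioned in the abstract).
past_patterns = [[0.0, 0.1, 0.0], [1.0, 0.9, 1.1], [-1.0, -1.2, -0.8]]
past_outcomes = [0, 2, 4]

print(nearest_analog_forecast([0.9, 1.0, 1.0], past_patterns, past_outcomes))  # prints 2
```

A learned classifier such as the one in the paper replaces the explicit distance search with a trained pattern-recognition model, but the input/output contract (pattern in, event label out) is the same.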
A perspective‐driven and technical evaluation of machine learning in bioreactor scale‐up: A case‐study for potential model developments
Bioreactor scale‐up and scale‐down have always been a topical issue for the biopharmaceutical industry and, despite considerable effort, the identification of a fail‐safe strategy for bioprocess development across scales remains a challenge. With the ubiquitous growth of digital transformation technologies, new scaling methods based on computer models may enable more effective scaling. This study aimed to evaluate the potential application of machine learning (ML) algorithms for bioreactor scale‐up, with a specific focus on the prediction of scaling parameters. Factors critical to the development of such models were identified, and data for bioreactor scale‐up studies involving CHO cell‐generated mAb products were collated from the literature and public sources for the development of unsupervised and supervised ML models. Comparison of bioreactor performance across scales identified similarities between the different processes and primary differences between small‐ and large‐scale bioreactors. A series of three case studies was developed to assess the relationship between cell growth and scale‐sensitive bioreactor features. An embedding layer improved the capability of artificial neural network models to predict cell growth at large scale, as this approach captured similarities between the processes. Further models constructed to predict scaling parameters demonstrated how ML models may be applied to assist the scaling process. The development of data sets that include more characterization data with greater variability under different gassing and agitation regimes will also assist the future development of ML tools for bioreactor scaling.
Lay summary: This study examined the potential of machine learning to assist in bioreactor scale‐up. The findings demonstrated the capability of these algorithms to uncover complex non‐linear relationships among scale‐sensitive features, transfer knowledge, and predict process performance across scales. A method for predicting scaling factors for equivalent performance across scales was also developed, and the characteristics of ideal datasets for future application of machine learning to scaling were described.
Automatic multimode identification of complex industrial processes based on network community detection with manifold similarity
Complex industrial processes usually exhibit multimode characteristics, meaning that statistical features of process data, such as mean, variance, and correlation, vary across different modes. Extracting critical information from these distinct modes can significantly enhance the accuracy and robustness of data‐driven models in process monitoring, condition evaluation, and quality improvement. Consequently, the multimode identification of industrial data becomes a paramount concern in data‐driven modelling. However, existing methods for multimode identification require prior knowledge to predetermine the number of modes and struggle to describe the similarity between high‐dimensional samples effectively. To address this issue, this study introduces an automatic multimode identification method based on complex network community detection. In this approach, each data sample is considered as a node, and manifold similarity is calculated to construct the complex network model. The method leverages weighted geodesic distances to capture the data's manifold structure and potential density, enabling better distinction between high‐dimensional samples in different modes. The greedy search algorithm with modularity maximisation is employed to partition nodes into modes without manual selection of the number of modes. Furthermore, a node degree‐based indicator is developed for online mode monitoring. Experimental studies on two examples demonstrate the effectiveness of the proposed method in uncovering multimode characteristics of complex industrial processes, highlighting its promising application potential. Extracting critical information from different modes can significantly improve the accuracy and robustness of data‐driven models in process monitoring, condition evaluation, and quality improvement. 
However, existing multimode identification methods rely on prior knowledge to determine the number of modes in advance and cannot describe the similarity between high‐dimensional samples well. Therefore, a novel mode identification method based on a complex network is proposed to overcome the difficulty of selecting the number of modes.
Assessment and deployment of a LSTM-based virtual sensor in an industrial process control loop
Measurement of certain variables within the industrial sector remains a challenge due to the prohibitive costs of sensors, the intricate installation processes, or the continuous nature of production demands. Moreover, if a backup sensor is required in case the main sensor fails, the installation and maintenance difficulties are further increased. One way to address this issue is the indirect estimation of the desired variable by leveraging other correlated measures within the operational process. Data-driven techniques are well-suited for this aim, given their capacity to model potentially complex industrial processes. This paper proposes the implementation of a virtual flow sensor for integration into the control loop of an industrial process. More specifically, four different data-driven methods have been tested to obtain the virtual sensor: multiple linear regression (MLR), multilayer perceptron (MLP), long short-term memory (LSTM) and deep long short-term memory (DeepLSTM). MAE, RMSE and R² have been chosen as evaluation metrics for model selection and testing. Furthermore, the robustness of the virtual flow sensor is evaluated not only under ideal operating conditions but also under adverse conditions with various noise levels added to the measured signals. Additionally, the performance of the flow control loop using the real and virtual sensors is evaluated in both ideal and adverse conditions. IAE, ITAE, and IAVU indices are used to assess the control performance. The results prove the robustness of the LSTM-based virtual flow sensor and the effectiveness of the control loop using it, which avoids both modifying the controller and interrupting the process when the real flow sensor fails.
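The evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. The discrete forms of IAE and ITAE assume a uniformly sampled error signal with sample time `dt` and a rectangle-rule approximation of the integrals; those choices, the function names, and the toy numbers are assumptions of this sketch, and IAVU is left out because the abstract does not define it.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def iae(error, dt):
    """Integral of absolute error, rectangle-rule approximation."""
    return sum(abs(e) for e in error) * dt


def itae(error, dt):
    """Integral of time-weighted absolute error, with t_k = k * dt."""
    return sum(k * dt * abs(e) for k, e in enumerate(error)) * dt


# Toy example: measured flow versus a hypothetical virtual-sensor estimate.
flow_real = [1.0, 2.0, 3.0, 4.0]
flow_virtual = [1.0, 2.0, 3.0, 5.0]
print(mae(flow_real, flow_virtual), rmse(flow_real, flow_virtual), r2(flow_real, flow_virtual))
```

MAE, RMSE and R² score the sensor model itself, while IAE and ITAE score the closed loop: ITAE weights late errors more heavily, so it penalises slow settling.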
Novel wavelet-LSTM approach for time series prediction
Time series prediction often faces challenges due to hidden patterns and noise within the data. This paper presents a novel algorithm that combines wavelet decomposition with long short-term memory (LSTM) networks, providing a distinct method for handling these challenges. The study considered the monthly rainfall data (mm) of India from January 1901 to December 2021. The series was denoised using the maximum overlap discrete wavelet transform (MODWT), followed by LSTM modeling on each denoised series. The hyperband search algorithm was employed to identify the optimal hyperparameter combination for each LSTM model, aiding in further fine-tuning the model. The algorithm concluded by applying the inverse wavelet transform to the final predictions. To evaluate the efficacy of the proposed approach, it was benchmarked against established models, namely LSTM, recurrent neural network (RNN), and artificial neural network (ANN). The results showed that the proposed model (MODWT-LSTM) significantly outperformed these benchmark models in terms of forecast accuracy. Specifically, in terms of root-mean-square error (RMSE), the proposed algorithm achieved a gain in prediction accuracy of 18.5%, 32.8%, and 36.47% relative to the LSTM, RNN, and ANN models, respectively. The superiority of the proposed model is further confirmed by the Diebold–Mariano (DM) test, which establishes the hierarchy of model effectiveness as MODWT-LSTM > LSTM > RNN = ANN in terms of predictive performance.
Sparse Identification of Nonlinear Dynamics‐Based Model Predictive Control for Multirotor Collision Avoidance
This article proposes a data‐driven model predictive control (MPC) method for multirotor collision avoidance, considering uncertainties and the unknown dynamics caused by a payload. To address this challenge, sparse identification of nonlinear dynamics (SINDy) is employed to derive the governing equations of the multirotor system. SINDy is capable of discovering the equations of target systems from limited data, under the assumption that a few dominant functions primarily characterize the system's behavior. In addition, a data collection framework that combines a baseline controller with MPC is proposed to generate diverse trajectories for model identification. A candidate function library, informed by prior knowledge of multirotor dynamics, along with a normalization technique, is utilized to enhance the accuracy of the SINDy‐based model. Using the data‐driven model from SINDy, MPC is used to achieve accurate trajectory tracking while satisfying state and input constraints, including those for obstacle avoidance. Simulation results demonstrate that SINDy can successfully identify the governing equations of the multirotor system, accounting for mass parameter uncertainties and aerodynamic effects. Furthermore, the results confirm that the proposed method outperforms conventional MPC, which suffers from parameter uncertainty and an unknown aerodynamic model, in both obstacle avoidance and trajectory tracking performance. Sparse identification of nonlinear dynamics‐based model predictive control is proposed for multirotor collision avoidance.
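The sparsity assumption behind SINDy (a few dominant library functions describe the dynamics) is commonly implemented with sequentially thresholded least squares. The sketch below applies that idea to a toy scalar system dx/dt = -2x with the candidate library [1, x, x²]. The system, library, threshold, and all function names are assumptions chosen for illustration; this is not the multirotor model or library used in the article.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(theta, y, active):
    """Least-squares fit of y on the active library columns (normal equations)."""
    ata = [[sum(row[i] * row[j] for row in theta) for j in active] for i in active]
    aty = [sum(row[i] * yk for row, yk in zip(theta, y)) for i in active]
    return solve_linear(ata, aty)


def stlsq(theta, y, threshold=0.1, iterations=10):
    """Sequentially thresholded least squares: fit, drop small terms, refit."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(iterations):
        if not active:
            break
        fitted = least_squares(theta, y, active)
        coef = [0.0] * n_terms
        for idx, value in zip(active, fitted):
            coef[idx] = value
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if kept == active:
            break
        active = kept
    # Zero any coefficient still below the threshold after the last fit.
    return [c if abs(c) >= threshold else 0.0 for c in coef]


# Toy data from dx/dt = -2x; library columns are [1, x, x^2].
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
dxdt = [-2.0 * v for v in xs]
library = [[1.0, v, v * v] for v in xs]
print(stlsq(library, dxdt))
```

Only the coefficient on x survives the thresholding, recovering the single active term. In a control setting such as the one described, the identified sparse model would then serve as the prediction model inside the MPC optimisation.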
Parsimonious and Transferrable Parameterization of Reservoir Operations: A Modular Approach for Large‐Scale Modeling
Accurately representing daily reservoir operations in large‐scale hydrological and water resource modeling remains challenging due to both the complex and unclear nature of real‐world operations and very limited availability of operation records for many reservoirs worldwide. To address this gap, this study introduces MODROM (MOdular Data‐driven Reservoir Operation Model), a parsimonious reservoir parameterization scheme that conceptualizes reservoir operations through simple operation modules and their seasonal transitions. These operation modules are designed to be simple and parsimonious for easier generalizing from data‐rich to data‐scarce reservoirs. MODROM is calibrated using high‐quality long‐term operation records from 411 data‐rich reservoirs across the contiguous United States (CONUS), and a Random Forest model is developed to provide calibrated parameters for data‐scarce reservoirs based on a suite of static reservoir characteristics. Results demonstrate MODROM's strong and robust performance when calibrating using all available data for each reservoir, though the performance generally declines for reservoirs with larger regulation capacity. The generalization performance is strong under favorable sampling conditions but varies with different train‐test splits due to the limited reservoir data set. Benchmarking against existing models shows that MODROM achieves enhanced performance, with a median Kling‐Gupta Efficiency of approximately 0.5, compared to 0.4 for the best storage‐based model and 0.2 for data‐inferred model; this demonstrates distinct advantages in generalizing parameters to data‐scarce reservoirs using readily available static reservoir characteristics, though the performance can be affected by train‐test split due to limited reservoirs in the sample. Reservoir operations significantly impact river systems, making it important to accurately represent this human dimension in large‐scale hydrological and water resource models. 
However, this critical human influence has long been underrepresented due to the complex nature of reservoir management and the lack of operation records. In this study, we developed MODROM (MOdular Data‐driven Reservoir Operation Model), which simplifies reservoir operations into basic operation modules and their seasonal changes. We calibrated this model using extensive operation data for over 400 large reservoirs across the United States and developed a machine learning model to predict operations for reservoirs lacking historical records. Our results demonstrate that MODROM performs well when calibrated with comprehensive operation data, though accuracy decreased for larger reservoirs with more complex regulation capabilities. By testing the model's predictive power on a subset of reservoirs held out as data‐scarce, our approach demonstrated strong potential, though the limited size of our data set introduced some uncertainty. Compared to existing approaches, MODROM provides a significant advantage in predicting operations for data‐scarce reservoirs, making it potentially useful for hydrological and water resource modeling at large scale.
Key Points:
  • A reservoir operation model is developed using simple modules with seasonal transitions for hydrological and water resource modeling
  • A machine learning model is trained to transfer calibrated parameters to data‐scarce reservoirs using static reservoir characteristics
  • The model performs strongly both with calibrated parameters and transferred parameters, though performance varies with train‐test splits
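The Kling‐Gupta Efficiency used in the benchmarking for this entry combines correlation, variability ratio, and bias ratio into one score, with 1 as the perfect value. The sketch below uses the original 2009 formulation, KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²); whether the study used this or a later variant is not stated in the abstract, so that choice and the toy series are assumptions.

```python
import math


def kge(sim, obs):
    """Kling-Gupta Efficiency (2009 form) of simulated versus observed series."""
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mean_s / mean_o    # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Toy reservoir release series (arbitrary units, hypothetical values).
observed = [1.0, 2.0, 3.0]
print(kge(observed, observed))         # perfect match
print(kge([2.0, 4.0, 6.0], observed))  # right timing, doubled magnitude
```

The second call shows why KGE can be low even when correlation is perfect: doubling every value leaves r at 1 but pushes both α and β to 2.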
A Comparison of Data‐Driven Approaches to Build Low‐Dimensional Ocean Models
We present a comprehensive inter‐comparison of linear regression (LR), stochastic, and deep‐learning approaches for reduced‐order statistical emulation of ocean circulation. The reference data set is provided by an idealized, eddy‐resolving, double‐gyre ocean circulation model. Our goal is to conduct a systematic and comprehensive assessment and comparison of skill, cost, and complexity of statistical models from the three methodological classes. The model based on LR is considered as a baseline. Additionally, we investigate its additive white noise augmentation and a multi‐level stochastic approach, deep‐learning methods, hybrid frameworks (LR plus deep‐learning), and simple stochastic extensions of deep‐learning and hybrid methods. The assessment metrics considered are: root mean squared error, anomaly cross‐correlation, climatology, variance, frequency map, forecast horizon, and computational cost. We found that the multi‐level linear stochastic approach performs the best for both short‐ and long‐timescale forecasts. The deep‐learning hybrid models augmented by additive state‐dependent white noise came second, while their deterministic counterparts failed to reproduce the characteristic frequencies in climate‐range forecasts. Pure deep learning implementations performed worse than LR and its simple white noise augmentation. Skills of LR and its white noise extension were similar on short timescales, but the latter performed better on long timescales, while LR‐only outputs decay to zero for long simulations. Overall, our analysis promotes multi‐level LR stochastic models with memory effects, and hybrid models with linear dynamical core augmented by additive stochastic terms learned via deep learning, as a more practical, accurate, and cost‐effective option for ocean emulation than pure deep‐learning solutions. 
Plain Language Summary: In weather and climate predictions, scientists use comprehensive ocean circulation models for representing the effects of the oceans on the atmosphere. These models simulate the three‐dimensional ocean dynamics using millions of variables and, thus, require significant computational resources and running time. Therefore, there is a need for low‐cost, data‐driven ocean models with fewer variables that can reproduce essential oceanic circulations with reasonable accuracy. There are several popular data‐driven approaches to build these models, but singling out the best one is difficult and significantly understudied. We have systematically assessed and compared the accuracy, stability, and computational cost of various data‐driven models against linear regression, a fundamental and easy‐to‐implement deterministic model (that is, one that provides a fixed output for a fixed input). We considered several stochastic and deep‐learning models for comparison; stochastic models combine a deterministic model with customized noise, whereas deep‐learning models train a complex network of neurons similar to the human brain. We found that the stochastic models that properly include the core dynamics, time‐delay effects, and model errors perform the best. The core dynamics provides the essential changes, time‐delay effects are the changes due to correlation between successive ocean states, and model errors provide other possible causes of changes.
Key Points:
  • The multi‐level stochastic approach produces the most stable, accurate, and low‐cost emulator of a double‐gyre ocean model solution
  • Artificial neural networks and long short term memory work better in a hybrid form with linear regression, providing the core dynamics, than in their standalone application
  • Emulators incorporating memory effects and state‐dependent noise show enhanced performance and deep learning can learn these effects
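The observation in this entry that linear-regression-only outputs decay to zero in long simulations, while a white-noise extension does not, can be reproduced with a scalar toy model x[t+1] = a·x[t] + σ·ε[t]. The coefficient a = 0.9, the noise level, and the function names are assumptions for illustration only; the study's emulators are multivariate and are not reproduced here.

```python
import random


def simulate(a, sigma, steps, x0=1.0, seed=0):
    """Iterate x[t+1] = a * x[t] + sigma * eps[t] with standard normal eps."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        x = a * x + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


deterministic = simulate(a=0.9, sigma=0.0, steps=500)  # linear map only
stochastic = simulate(a=0.9, sigma=1.0, steps=500)     # additive white noise

print(abs(deterministic[-1]))     # collapses towards zero because |a| < 1
print(variance(stochastic[100:])) # noise keeps the variability alive
```

With a stable linear map (|a| < 1) the deterministic trajectory forgets its initial condition and flattens out, whereas the noise term keeps resupplying variance, which is the qualitative behaviour the abstract reports on long timescales.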