Catalogue Search | MBRL

U-Net Inspired Transformer Architecture for Multivariate Time Series Synthesis

by Jeng, Shyr-Long in Accuracy, Architecture, attention

2025

This study introduces a Multiscale Dual-Attention U-Net (TS-MSDA U-Net) model for long-term time series synthesis. By integrating multiscale temporal feature extraction and dual-attention mechanisms into the U-Net backbone, the model captures complex temporal dependencies more effectively. The model was evaluated in two distinct applications. In the first, using multivariate datasets from 70 real-world electric vehicle (EV) trips, TS-MSDA U-Net achieved a mean absolute error below 1% across key parameters, including battery state of charge, voltage, acceleration, and torque—representing a two-fold improvement over the baseline TS-p2pGAN. While dual-attention modules provided only modest gains over the basic U-Net, the multiscale design enhanced overall performance. In the second application, the model was used to reconstruct high-resolution signals from low-speed analog-to-digital converter data in a prototype resonant CLLC half-bridge converter. TS-MSDA U-Net successfully learned nonlinear mappings and improved signal resolution by a factor of 36, outperforming the basic U-Net, which failed to recover essential waveform details. These results underscore the effectiveness of transformer-inspired U-Net architectures for high-fidelity multivariate time series modeling in both EV analytics and power electronics.

Journal Article

PVS-GEN: Systematic Approach for Universal Synthetic Data Generation Involving Parameterization, Verification, and Segmentation

by Kwak, Jong Wook; Kim, Kyung-Min in Artificial intelligence, Automation, Benchmarks

2024

Synthetic data generation addresses the challenges of obtaining extensive empirical datasets, offering benefits such as cost-effectiveness, time efficiency, and robust model development. Nonetheless, synthetic data-generation methodologies still encounter significant difficulties, including a lack of standardized metrics for modeling different data types and comparing generated results. This study introduces PVS-GEN, an automated, general-purpose process for synthetic data generation and verification. The PVS-GEN method parameterizes time-series data with minimal human intervention and verifies model construction using a specific metric derived from extracted parameters. For complex data, the process iteratively segments the empirical dataset until an extracted parameter can reproduce synthetic data that reflects the empirical characteristics, irrespective of the sensor data type. Moreover, we introduce the PoR metric to quantify the quality of the generated data by evaluating its time-series characteristics. Consequently, the proposed method can automatically generate diverse time-series data that covers a wide range of sensor types. We compared PVS-GEN with existing synthetic data-generation methodologies, and PVS-GEN demonstrated superior performance. Under the proposed metric, it improved the similarity of the generated data by up to 37.1% across multiple data types and by 19.6% on average, irrespective of the data type.

Journal Article

Generative Adversarial Network for Synthesizing Multivariate Time-Series Data in Electric Vehicle Driving Scenarios

by Jeng, Shyr-Long in Accuracy, Architecture, Artificial intelligence

2025

This paper presents a time-series point-to-point generative adversarial network (TS-p2pGAN) for synthesizing realistic electric vehicle (EV) driving data. The model accurately generates four critical operational parameters—battery state of charge (SOC), battery voltage, mechanical acceleration, and vehicle torque—as multivariate time-series data. Evaluation on 70 real-world driving trips from an open battery dataset reveals the model’s exceptional accuracy in estimating SOC values, particularly under complex stop-and-restart scenarios and across diverse initial SOC levels. The model delivers high accuracy, with root mean square error (RMSE), mean absolute error (MAE), and dynamic time warping (DTW) consistently below 3%, 1.5%, and 2.0%, respectively. Qualitative analysis using principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) demonstrates the model’s ability to preserve both feature distributions and temporal dynamics of the original data. This data augmentation framework offers significant potential for advancing EV technology, digital energy management of lithium-ion batteries (LIBs), and autonomous vehicle comfort system development.
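The three error measures named in this abstract (RMSE, MAE, and DTW) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names are hypothetical, the DTW shown is the textbook dynamic-programming form with an absolute-difference local cost, and the abstract does not say how the scores were normalized to percentages, so no normalization is applied here.

```python
import math

def rmse(real, synth):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(real, synth)) / len(real))

def mae(real, synth):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(r - s) for r, s in zip(real, synth)) / len(real)

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: absolute difference as the local cost and no warping
    window; the paper may use a different variant.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Unlike RMSE and MAE, DTW tolerates local time shifts: a synthetic trace that holds one value for an extra sample can still align with the real trace at zero cost.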

Journal Article

GAN-Based Generation of Synthetic Data for Vehicle Driving Events

by Hernández-Álvarez, Myriam; Valdivieso Caraguay, Ángel Leonardo; Sanchez-Gordon, Sandra in Accuracy, Algorithms, Computational linguistics

2024

Developing solutions to reduce traffic accidents requires experimentation and much data. However, due to confidentiality issues, not all datasets used in previous research are publicly available, and those that are available may be insufficient for research. Building datasets with real data is costly. Given this reality, this paper proposes a procedure to generate synthetic data sequences of driving events using the Time series GAN (TimeGAN) and Real-world time series GAN (RTSGAN) frameworks. First, a 15-feature driving event dataset is constructed with real data, which forms the basis for generating datasets using the two mentioned frameworks. The generated datasets are evaluated using the qualitative metrics PCA and t-SNE, as well as the discriminative and predictive score quantitative metrics defined in TimeGAN. The generated synthetic data are then used in an unsupervised algorithm to identify clusters representing vehicle crash risk levels. Next, the generated data are used in a supervised classification algorithm to predict risk level categories. Comparison results show that the data generated by RTSGAN achieve better scores than the data generated by TimeGAN. In addition, we demonstrate that a supervised classification model for predicting the level of accident risk, when trained on datasets that contain synthetic data, can reach accuracy comparable to that of models trained only on real data, which shows the usefulness of the generated data.

Journal Article

Task-aware conditional GAN with multi-objective loss for realistic and efficient industrial time series generation

by Li, Yonghua; Lang, Kai in Big Data, Communications Engineering, Computational Science and Engineering

2025

Industrial time-series data generation is critical for addressing data scarcity, improving model robustness, and enabling data-driven decision-making in complex manufacturing systems. However, existing generative models often suffer from poor temporal fidelity, limited statistical consistency, and weak adaptability to downstream tasks. To address these challenges, we propose a novel conditional generative adversarial framework that integrates statistical feature augmentation, multi-scale temporal windows, and a composite loss function combining adversarial, L2, DTW, FID, and statistical constraints. This design ensures that the generated data preserves both local dynamics and global distributional properties. Our method introduces a systematic strategy for loss calibration and architecture tuning, which enhances generation stability without the need for complex temporal signature modeling. Experimental results on three real-world industrial datasets demonstrate that our model achieves superior generation fidelity and efficiency compared to state-of-the-art baselines. Specifically, our method achieves the lowest DTW score and the lowest MMD value, surpassing MTSS-GAN and TCGAN by 17.3% and 21.7%, respectively. In terms of training cost, our model achieves a training time reduction of 11.4% compared to MTSS-GAN. These results validate the effectiveness and efficiency of our proposed framework in real-world time-series generation scenarios.

Journal Article

How Much Information Does Dependence Between Wavelet Coefficients Contain?

by Jentsch, Carsten; Kirch, Claudia in Analysis of covariance, autocorrelation, Bootstrap inconsistency

2016

This article is motivated by several articles that propose statistical inference where the independence of wavelet coefficients is assumed for both short- and long-range dependent time series. We focus on the sample variance and investigate the influence of the dependence between wavelet coefficients on this statistic. To this end, we derive asymptotic distributional properties of the sample variance for a time series that is synthesized, ignoring some or all dependence between wavelet coefficients. We show that the second-order properties differ from those of the true time series whose wavelet coefficients have the same marginal distribution, except in the independent Gaussian case. This holds true even if the dependency is correct within each level and only the dependence between levels is ignored. In the case of sample autocovariances and sample autocorrelations at lag one, we indicate that first-order properties are erroneous. In a second step, we investigate several nonparametric bootstrap schemes in the wavelet domain, which take more and more dependence into account until finally the full dependency is mimicked. We obtain very similar results, where only a bootstrap mimicking the full covariance structure correctly can be valid asymptotically. A simulation study supports our theoretical findings for the wavelet domain bootstraps. For long-range-dependent time series with long-memory parameter d > 1/4, we show that some additional problems occur, which cannot be solved easily without using additional information for the bootstrap. Supplementary materials for this article are available online.

Journal Article

SPOT: Testing Stream Processing Programs with Symbolic Execution and Stream Synthesizing

by Lu, Minyan; Ye, Qian in Algorithms, Big Data, Case studies

2021

Adoption of distributed stream processing (DSP) systems such as Apache Flink in real-time big data processing is increasing. However, DSP programs are prone to bugs, especially when a programmer neglects some DSP features (e.g., source data reordering), which motivates the development of approaches for testing and verification. In this paper, we focus on the test data generation problem for DSP programs. Currently, no approach generates test data for DSP programs that both achieves high path coverage and covers different stream reordering situations. We present a novel solution, SPOT (i.e., Stream Processing Program Test), to achieve these two goals simultaneously. First, SPOT generates a set of individual test data representing each path of a DSP program through symbolic execution. Then, SPOT composes these independent data into various time series data (a.k.a. streams) in diverse orderings. Finally, we can perform a test by feeding the DSP program with these streams continuously. To automatically support symbolic analysis, we also developed JPF-Flink, a JPF (i.e., Java Pathfinder) extension to coordinate the execution of Flink programs. We present four case studies to illustrate that: (1) SPOT can support symbolic analysis for the commonly used DSP operators; (2) test data generated by SPOT can more efficiently achieve high JDU (i.e., Joint Dataflow and UDF) path coverage than two recent DSP testing approaches; (3) test data generated by SPOT can more easily trigger software failure than those two DSP testing approaches; and (4) the data randomly generated by those two test techniques are highly skewed in terms of stream reordering, as measured by the entropy metric, whereas the test data from SPOT are evenly distributed.

Journal Article

The intrinsic predictability of ecological time series and its potential to guide forecasting

by Brose, Ulrich; Williams, Richard; Ward, Colette in Complexity, Computer simulation, CONCEPTS & SYNTHESIS

2019

Successfully predicting the future states of systems that are complex, stochastic, and potentially chaotic is a major challenge. Model forecasting error (FE) is the usual measure of success; however, model predictions provide no insights into the potential for improvement. In short, the realized predictability of a specific model is uninformative about whether the system is inherently predictable or whether the chosen model is a poor match for the system and our observations thereof. Ideally, model proficiency would be judged with respect to the system's intrinsic predictability, the highest achievable predictability given the degree to which system dynamics are the result of deterministic vs. stochastic processes. Intrinsic predictability may be quantified with permutation entropy (PE), a model-free, information-theoretic measure of the complexity of a time series. By means of simulations, we show that a correlation exists between estimated PE and FE and show how stochasticity, process error, and chaotic dynamics affect the relationship. This relationship is verified for a data set of 461 empirical ecological time series. We show how deviations from the expected PE–FE relationship are related to covariates of data quality and the nonlinearity of ecological dynamics. These results demonstrate a theoretically grounded basis for a model-free evaluation of a system's intrinsic predictability. Identifying the gap between the intrinsic and realized predictability of time series will enable researchers to understand whether forecasting proficiency is limited by the quality and quantity of their data or the ability of the chosen forecasting model to explain the data. Intrinsic predictability also provides a model-free baseline of forecasting proficiency against which modeling efforts can be evaluated.
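Permutation entropy, as named in this abstract, can be sketched in a few lines. The version below is the standard ordinal-pattern formulation (Shannon entropy of the distribution of rank orderings in sliding windows), not the authors' code; the function name, the default `order` and `delay`, and the tie-breaking rule are assumptions made for illustration.

```python
import math
from collections import Counter

def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (in bits) of ordinal patterns in a time series.

    order: number of points per window (assumed default, not from the paper).
    delay: spacing between points within a window (assumed default).
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: the indices that sort the window.
        # Python's sort is stable, so ties are broken by position.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    entropy = sum(
        (count / n_windows) * math.log2(n_windows / count)
        for count in patterns.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy
```

With normalization, values near 0 indicate a highly regular series (few ordinal patterns occur), and values near 1 indicate that all patterns are about equally likely, as in white noise.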

Journal Article

A practical guide to selecting models for exploration, inference, and prediction in ecology

by Adler, Peter B.; Ellner, Stephen P.; Hooker, Giles in Best practice, butterflies, Concepts & Synthesis

2021

Selecting among competing statistical models is a core challenge in science. However, the many possible approaches and techniques for model selection, and the conflicting recommendations for their use, can be confusing. We contend that much confusion surrounding statistical model selection results from failing to first clearly specify the purpose of the analysis. We argue that there are three distinct goals for statistical modeling in ecology: data exploration, inference, and prediction. Once the modeling goal is clearly articulated, an appropriate model selection procedure is easier to identify. We review model selection approaches and highlight their strengths and weaknesses relative to each of the three modeling goals. We then present examples of modeling for exploration, inference, and prediction using a time series of butterfly population counts. These show how a model selection approach flows naturally from the modeling goal, leading to different models selected for different purposes, even with exactly the same data set. This review illustrates best practices for ecologists and should serve as a reminder that statistical recipes cannot substitute for critical thinking or for the use of independent data to test hypotheses and validate predictions.

Journal Article

The basis function approach for modeling autocorrelation in ecological data

by Kay, Shannon L.; Buderman, Frances E.; Hooten, Mevin B. in Autocorrelation, Basis functions, Bayesian model

2017

Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data.

Journal Article
