Catalogue Search | MBRL
Explore the vast range of titles available.
394 result(s) for "probabilistic principal component analysis"
Robust Probabilistic Multivariate Calibration Model
by Jeong, Myong K; Fang, Yi
in Applied sciences, Calibration, Computer science; control theory; systems
2008
In this article we propose a robust probabilistic multivariate calibration (RPMC) model in an attempt to identify linear relationships between two sets of observed variables contaminated with outliers. Instead of the Gaussian assumptions that predominate in classical statistical models, RPMC is closely related to the multivariate Student t-distribution over noises and latent variables. Thus RPMC diminishes the effect of outlying data points by regulating the thickness of the distribution tails. RPMC is essentially a robustified version of the supervised probabilistic principal component analysis (SPPCA) that has emerged recently. We show that RPMC encompasses probabilistic principal component analysis and SPPCA as limiting cases. We also derive an efficient EM algorithm for parameter estimation in RPMC. Based on a probabilistic description of latent variables, we present a procedure for the detection of outliers. The experimental results from both simulated examples and real-life data sets demonstrate the effectiveness and robustness of our proposed approach.
Journal Article
On estimation of the noise variance in high dimensional probabilistic principal component analysis
2017
We develop new statistical theory for probabilistic principal component analysis models in high dimensions. The focus is the estimation of the noise variance, which is an important and unresolved issue when the number of variables is large in comparison with the sample size. We first unveil the reasons for an observed downward bias of the maximum likelihood estimator of the noise variance when the data dimension is high. We then propose a bias-corrected estimator by using random-matrix theory and establish its asymptotic normality. The superiority of the new and bias-corrected estimator over existing alternatives is checked by Monte Carlo experiments with various combinations of (p, n) (the dimension and sample size). Next, we construct a new criterion based on the bias-corrected estimator to determine the number of the principal components, and a consistent estimator is obtained. Its good performance is confirmed by a simulation study and real data analysis. The bias-corrected estimator is also used to derive new asymptotics for the related goodness-of-fit statistic under the high dimensional scheme.
Journal Article
Effective application of biosensor analytical techniques in drug testing
2024
This study explores biosensor technology, focusing on its application in drug detection through advanced quantitative analysis methods: partial least squares (PLS) and probabilistic principal component analysis (PPCA). We developed a rapid quantitative calibration model using azure A, B, and C (metabolites of pefloxacin mesylate and methylene blue), demonstrated through surface-enhanced Raman spectroscopy. The findings highlight the superior accuracy of PLS and PPCA in predicting drug concentrations, with pefloxacin mesylate detection deviations maintained between 0.24%-0.98% and 0.35%-1.02%, respectively. PLS proved to be slightly more effective. This study confirms the potential of biosensor technology in ensuring drug safety, offering substantial support for public health protection and regulatory compliance.
Journal Article
Deep learning for rapid crop damage assessment after cyclones
2025
In the wake of devastating cyclones, rapid and accurate assessment of crop damage is crucial for timely intervention and resource allocation. Acquiring high-quality, up-to-date satellite or aerial imagery immediately following a cyclone is often difficult due to adverse weather conditions and limited access to affected areas. The objective of this study is to develop and validate an advanced deep-learning framework capable of rapidly and accurately assessing crop damage caused by cyclones using high-resolution satellite and aerial imagery. The Grey-Level Co-occurrence Matrix (GLCM) is applied to standardize and enhance the contrast of satellite images, improving their suitability. Feature extraction techniques, such as Mixture of Probabilistic Principal Component Analysis (MPPCA), are employed to identify key characteristics indicative of crop damage. The next step is to develop a CNN model to classify crop damage levels based on the features extracted from satellite imagery. The two approaches, Unsupervised Anomaly Detection (UAD) with the GOA-CNN-BiGRU-Attention (GCBA) framework and Generative Adversarial Networks (GANs) for synthetic pre-cyclone imagery, offer innovative solutions for rapid and accurate crop damage assessment after cyclones. The findings show that the proposed model attains an accuracy of 99%. Future developments in deep learning to quickly estimate agricultural damage following storms could focus on integrating multispectral and hyperspectral imagery for enhanced detection accuracy and incorporating real-time data streams from drones or IoT devices to provide timely and precise assessments during and immediately after cyclonic events.
Journal Article
Missing traffic data: comparison of imputation methods
2014
Many traffic management and control applications require highly complete and accurate traffic flow data. However, for various reasons such as sensor failure or transmission error, it is common for some traffic flow data to be lost. As a result, various methods using a wide spectrum of techniques have been proposed over the last two decades to estimate missing traffic data. Generally, these missing data imputation methods can be categorised into three kinds: prediction methods, interpolation methods and statistical learning methods. To assess their performance, these methods are compared from different aspects in this paper, including reconstruction errors, statistical behaviours and running speeds. Results show that statistical learning methods are more effective than the other two kinds of imputation methods when data from a single detector are utilised. Among the various methods, probabilistic principal component analysis (PPCA) yields the best performance in all aspects. Numerical tests demonstrate that PPCA can be used to impute data online before further analysis (e.g. traffic prediction) and is robust to weather changes.
Journal Article
Torus Probabilistic Principal Component Analysis
by Golalizadeh, Mousa; Maadooliat, Mehdi; Nodehi, Anahita
in Algorithms, Bioinformatics, Biology
2025
Analyzing data in non-Euclidean spaces, which arise in fields such as bioinformatics, biology, and geology where variables represent directions or angles, poses unique challenges. This type of data is known as circular data in univariate cases and can be termed spherical or toroidal in multivariate contexts. In this paper, we introduce a novel extension of probabilistic principal component analysis (PPCA) designed for toroidal (or torus) data, termed torus probabilistic PCA (TPPCA). We provide detailed algorithms for implementing TPPCA and demonstrate its applicability to torus data. To assess the efficacy of TPPCA, we perform comparative analyses using a simulation study and three real datasets. Our findings highlight the advantages and limitations of TPPCA in handling torus data. Furthermore, we propose statistical tests based on likelihood ratio statistics to determine the optimal number of components, enhancing the practical utility of TPPCA for real-world applications.
Journal Article
Research on Arch Dam Deformation Safety Early Warning Method Based on Effect Separation of Regional Environmental Variables and Knowledge-Driven Approach
2025
There are significant differences in the deformation patterns of different parts of arch dams, and periodic data loss is common. To accurately analyze the deformation behavior of arch dams, this paper proposes a safety warning and anomaly diagnosis method for arch dam deformation based on the separation of environmental variable effects in different partitions and a knowledge-driven approach. This method combines various techniques such as an optimized ISODATA clustering method, probabilistic principal component analysis (PPCA), squared prediction error (SPE) norm control chart, and contribution chart. By defining data forms and rules, existing engineering specifications and experience are transformed into “knowledge” and applied to the operation and management of arch dams, achieving accurate monitoring of arch dam deformation status and timely diagnosis of outliers. Through monitoring data verification of horizontal displacement in a certain arch dam partition, the results show that this method can accurately identify deformation anomalies in the arch dam and effectively separate the influence of environmental variables and noise interference, providing strong support for the safe operation of the arch dam. Accurate deformation monitoring of arch dams is essential for ensuring structural safety and optimizing operational management. However, conventional early warning indicators and empirical models often fail to capture the spatial heterogeneity of deformation and the complex coupling between environmental variables and structural responses. To overcome these limitations, this study proposes a knowledge-driven safety early warning and anomaly diagnosis model for arch dam deformation, based on spatiotemporal clustering and partitioned environmental variable separation.
The method integrates the optimized ISODATA clustering algorithm, probabilistic principal component analysis (PPCA), squared prediction error (SPE) control chart, and contribution chart to establish a comprehensive monitoring framework. The optimized ISODATA identifies deformation zones with similar mechanical behavior, PPCA separates environmental influences such as temperature and reservoir level from structural responses, and the SPE and contribution charts quantify abnormal variations and locate potential risk regions. Application of the proposed method to long-term deformation monitoring data demonstrates that the PPCA-based framework effectively separates environmental effects, improves the interpretability of zoned deformation characteristics, and enhances the accuracy and reliability of anomaly identification compared with conventional approaches. These findings indicate that the proposed knowledge-driven model provides a robust and interpretable framework for precise deformation safety evaluation of arch dams.
Journal Article
A penalization method to estimate the intrinsic dimensionality of data
by Rodriguez, Daniela; Sued, Mariela; Forzani, Liliana
in Computer science, Eigenvalues, Estimation
2025
We propose a novel penalization method for estimating the intrinsic dimensionality of data within a Probabilistic Principal Components Model, extending beyond the Gaussian case. Unlike existing approaches, our method is designed to handle non-normal data, providing a flexible alternative to traditional factor models. Our procedure identifies the dimension at which the eigenvalues of a scatter matrix stabilize. We establish the consistency of the procedure under mild conditions and demonstrate its robustness across a range of data distributions. A comparative analysis highlights its advantages over existing techniques, making it a valuable tool for dimensionality estimation without relying on distributional assumptions.
Journal Article
Survey on Probabilistic Models of Low-Rank Matrix Factorizations
2017
Low-rank matrix factorizations such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD) and Non-negative Matrix Factorization (NMF) are a large class of methods for pursuing the low-rank approximation of a given data matrix. The conventional factorization models are based on the assumption that the data matrices are contaminated stochastically by some type of noise. Thus the point estimates of low-rank components can be obtained by Maximum Likelihood (ML) or Maximum a Posteriori (MAP) estimation. In the past decade, a variety of probabilistic models of low-rank matrix factorizations have emerged. The most significant difference between low-rank matrix factorizations and their corresponding probabilistic models is that the latter treat the low-rank components as random variables. This paper surveys the probabilistic models of low-rank matrix factorizations. First, we review some probability distributions commonly used in probabilistic models of low-rank matrix factorizations and introduce the conjugate priors of some probability distributions to simplify the Bayesian inference. Then we present two main inference methods for probabilistic low-rank matrix factorizations, i.e., Gibbs sampling and variational Bayesian inference. Next, we roughly classify the important probabilistic models of low-rank matrix factorizations into several categories and review each in turn. The categories are defined by different matrix factorization formulations, which mainly include PCA, matrix factorizations, robust PCA, NMF and tensor factorizations. Finally, we discuss the research issues that need to be studied in the future.
Journal Article