2,227 result(s) for "dimensionality reduction"
Simplifying the representation of complex free-energy landscapes using sketch-map
A new scheme, sketch-map, for obtaining a low-dimensional representation of the region of phase space explored during an enhanced dynamics simulation is proposed. We show evidence, from an examination of the distribution of pairwise distances between frames, that some features of the free-energy surface are inherently high-dimensional. This makes dimensionality reduction problematic because the data does not satisfy the assumptions made in conventional manifold learning algorithms. We therefore propose that when dimensionality reduction is performed on trajectory data, one should think of the resultant embedding as a quickly sketched set of directions rather than a road map. In other words, the embedding tells one about the connectivity between states but does not provide the vectors that correspond to the slow degrees of freedom. This realization informs the development of sketch-map, which endeavors to reproduce the proximity information from the high-dimensionality description in a space of lower dimensionality even when a faithful embedding is not possible.
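As a hedged illustration of the pairwise-distance diagnostic this abstract describes (not the authors' sketch-map code), the sketch below histograms the distances between all pairs of trajectory frames; the array `frames` and its shape are assumptions standing in for real simulation data.

```python
# Minimal sketch, assuming trajectory data in an (n_frames, n_features)
# array: examine the distribution of pairwise distances between frames,
# the diagnostic used to argue that parts of the free-energy surface
# are inherently high-dimensional.
import numpy as np
from scipy.spatial.distance import pdist
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 30))        # hypothetical stand-in for trajectory frames

dists = pdist(frames, metric="euclidean")  # condensed vector of all pairwise distances
plt.hist(dists, bins=100, density=True)
plt.xlabel("pairwise distance between frames")
plt.ylabel("density")
plt.show()
```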
Demixed principal component analysis of neural population data
Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure.

Many neuroscience experiments today involve using electrodes to record from the brain of an animal, such as a mouse or a monkey, while the animal performs a task. The goal of such experiments is to understand how a particular brain region works. However, modern experimental techniques allow the activity of hundreds of neurons to be recorded simultaneously. Analysing such large amounts of data then becomes a challenge in itself. This is particularly true for brain regions such as the prefrontal cortex that are involved in the cognitive processes that allow an animal to acquire knowledge. Individual neurons in the prefrontal cortex encode many different types of information relevant to a given task. Imagine, for example, that an animal has to select one of two objects to obtain a reward. The same group of prefrontal cortex neurons will encode the object presented to the animal, the animal’s decision and its confidence in that decision. This simultaneous representation of different elements of a task is called a ‘mixed’ representation, and is difficult to analyse. Kobak, Brendel et al. have now developed a data analysis tool that can ‘demix’ neural activity. The tool breaks down the activity of a population of neurons into its individual components. Each of these relates to only a single aspect of the task and is thus easier to interpret. Information about stimuli, for example, is distinguished from information about the animal’s confidence levels. Kobak, Brendel et al. used the demixing tool to reanalyse existing datasets recorded from several different animals, tasks and brain regions. In each case, the tool provided a complete, concise and transparent summary of the data. The next steps will be to apply the analysis tool to new datasets to see how well it performs in practice. At a technical level, the tool could also be extended in a number of different directions to enable it to deal with more complicated experimental designs in future.
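The core demixing idea can be sketched without the reference dPCA implementation: marginalize trial-averaged population activity over all task parameters except one, then apply ordinary PCA to each marginalization. The tensor shape and names below are assumptions, and the shared reconstruction objective that distinguishes dPCA proper from plain PCA is omitted.

```python
# Hedged sketch of the marginalization step behind demixing
# (not the published dPCA algorithm). Hypothetical shape:
# X has axes (neurons, stimuli, decisions, time).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4, 2, 50))   # 100 neurons, 4 stimuli, 2 decisions, 50 time bins

X_stim = X.mean(axis=(2, 3))           # stimulus marginalization: average out decisions, time
X_dec = X.mean(axis=(1, 3))            # decision marginalization: average out stimuli, time

# PCA on each marginalization yields components tuned mainly to one
# task parameter; dPCA couples these analyses, which is not shown here.
stim_axes = PCA(n_components=2).fit(X_stim.T).components_   # (2, neurons)
dec_axes = PCA(n_components=1).fit(X_dec.T).components_     # (1, neurons)
print(stim_axes.shape, dec_axes.shape)
```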
Non-linear dimensionality reduction on extracellular waveforms reveals cell type diversity in premotor cortex
Cortical circuits are thought to contain a large number of cell types that coordinate to produce behavior. Current in vivo methods rely on clustering of specified features of extracellular waveforms to identify putative cell types, but these capture only a small amount of variation. Here, we develop a new method (WaveMAP) that combines non-linear dimensionality reduction with graph clustering to identify putative cell types. We apply WaveMAP to extracellular waveforms recorded from dorsal premotor cortex of macaque monkeys performing a decision-making task. Using WaveMAP, we robustly establish eight waveform clusters and show that these clusters recapitulate previously identified narrow- and broad-spiking types while revealing previously unknown diversity within these subtypes. The eight clusters exhibited distinct laminar distributions, characteristic firing rate patterns, and decision-related dynamics. Such insights were weaker when using feature-based approaches. WaveMAP therefore provides a more nuanced understanding of the dynamics of cell types in cortical circuits.
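A hedged sketch in the spirit of this pipeline is below; WaveMAP itself clusters UMAP's graph with the Louvain algorithm, whereas this simpler stand-in runs DBSCAN on the 2-D embedding, and the waveform array is an assumption.

```python
# Sketch only: UMAP for non-linear dimensionality reduction of
# extracellular waveforms, followed by density-based clustering.
# (WaveMAP uses Louvain graph clustering; DBSCAN is a stand-in.)
import numpy as np
import umap                             # pip install umap-learn
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
waveforms = rng.normal(size=(1000, 48)) # hypothetical: 1000 spikes x 48 waveform samples

embedding = umap.UMAP(n_neighbors=15, min_dist=0.1,
                      random_state=0).fit_transform(waveforms)
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(embedding)
print("putative waveform clusters:", len(set(labels) - {-1}))
```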
Robust Methods for Data Reduction
This book gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis. Using real examples, the authors show how to implement the procedures in R. The code and data for the examples are available on the book's CRC Press web page.
Shape-aware stochastic neighbor embedding for robust data visualisations
Background The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm has emerged as one of the leading methods for visualising high-dimensional (HD) data in a wide variety of fields, especially for revealing cluster structure in HD single-cell transcriptomics data. However, t-SNE often fails to correctly represent hierarchical relationships between clusters and creates spurious patterns in the embedding. In this work we generalised t-SNE using shape-aware graph distances to mitigate some of its limitations. Although many methods have been recently proposed to circumvent the shortcomings of t-SNE, notably Uniform manifold approximation (UMAP) and Potential of heat diffusion for affinity-based transition embedding (PHATE), we see a clear advantage of the proposed graph-based method. Results The superior performance of the proposed method is first demonstrated on simulated data, where a significant improvement over t-SNE, UMAP and PHATE, based on quantitative validation indices, is observed when visualising imbalanced, nonlinear, continuous and hierarchically structured data. Thereafter, the ability of the proposed method to create faithful low-dimensional embeddings, compared to the competing methods, is shown on two real-world data sets: the single-cell transcriptomics data and the MNIST image data. In addition, the only hyper-parameter of the method can be chosen automatically in a data-driven way, and this choice is consistently optimal across all test cases in this study. Conclusions In this work we show that the proposed shape-aware stochastic neighbor embedding method creates low-dimensional visualisations that robustly and accurately reveal key structures of high-dimensional data.
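The general recipe of replacing Euclidean distances with graph distances in t-SNE can be sketched as follows; the paper's shape-aware distance itself is not reproduced, so plain shortest-path distances on a k-NN graph stand in for it, and the data array is an assumption.

```python
# Hedged sketch: feed graph geodesic distances to t-SNE as a
# precomputed metric (a stand-in for the shape-aware distances).
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import TSNE

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 20))                       # hypothetical HD data

knn = kneighbors_graph(X, n_neighbors=10, mode="distance")
D = shortest_path(knn, directed=False)               # geodesic distances along the graph
D[~np.isfinite(D)] = np.max(D[np.isfinite(D)])       # guard against disconnected components

emb = TSNE(n_components=2, metric="precomputed",
           init="random", random_state=0).fit_transform(D)
print(emb.shape)
```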
Improving IoT Security: The Impact of Dimensionality and Size Reduction on Intrusion Detection Performance
Intrusion detection in the Internet of Things (IoT) environments is essential to guarantee computer network security. Machine learning (ML) models are widely used to improve the efficiency of detection systems. Meanwhile, with the increasing complexity and size of intrusion detection data, analyzing vast datasets using ML models is becoming more challenging and demanding in terms of computational resources. Datasets related to IoT environments usually come in very large sizes. This study investigates the impact of dataset reduction techniques on machine learning-based Intrusion Detection Systems (IDS) performance and efficiency. We propose a two-stage framework incorporating deep autoencoder-based feature reduction with stratified sampling to reduce the dimensionality and size of six publicly available IDS datasets, including BoT-IoT, CSE-CIC-IDS2018, and others. Multiple machine learning models, such as Random Forest, XGBoost, K-Nearest Neighbors, SVM, and AdaBoost, were evaluated using default parameters. Our results show that dataset reduction can decrease training time by up to 99% with minimal loss in F1-score, typically less than 1%. It is recognized that excessive size reduction can compromise detection accuracy for minority attack classes. However, employing a stratified sampling method can effectively maintain class distributions. The study highlights significant feature redundancy, particularly high correlation among features, across multiple IoT security-related datasets, motivating the use of dimensionality reduction techniques. These findings support the feasibility of efficient, scalable IDS implementations for real-world environments, especially in resource-constrained or real-time settings. [JJCIT 2025; 11(3): 351-368]
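A minimal sketch of the two-stage framework described above (not the authors' code) might look like the following; the layer sizes, dataset shape, and 10% sampling fraction are assumptions.

```python
# Hedged sketch: stage 1 reduces feature dimensionality with a small
# autoencoder; stage 2 reduces dataset size with stratified sampling
# so minority attack classes keep their share of rows.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(20000, 60)).astype("float32")  # hypothetical IDS feature matrix
y = rng.integers(0, 5, size=20000)                  # hypothetical attack-class labels

# Stage 1: autoencoder-based feature reduction (60 -> 10 dimensions).
inp = tf.keras.Input(shape=(60,))
code = tf.keras.layers.Dense(10, activation="relu")(inp)
out = tf.keras.layers.Dense(60)(code)
auto = tf.keras.Model(inp, out)
auto.compile(optimizer="adam", loss="mse")
auto.fit(X, X, epochs=5, batch_size=256, verbose=0)
X_reduced = tf.keras.Model(inp, code).predict(X, verbose=0)

# Stage 2: stratified size reduction (keep 10% of rows,
# class distribution preserved).
X_small, _, y_small, _ = train_test_split(
    X_reduced, y, train_size=0.1, stratify=y, random_state=0)
print(X_small.shape)
```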
A Clustering-Based Dimensionality Reduction Method Guided by POD Structures and Its Application to Convective Flow Problems
Proper orthogonal decomposition (POD) is a widely used linear dimensionality reduction technique, but it often fails to capture critical features in complex nonlinear flows. In contrast, clustering methods are effective for nonlinear feature extraction, yet their application to dimensionality reduction is hindered by unstable cluster initialization and inefficient mode sorting. To address these issues, we propose a clustering-based dimensionality reduction method guided by POD structures (C-POD), which uses POD preprocessing to stabilize the selection of cluster centers. Additionally, we introduce an entropy-controlled Euclidean-to-probability mapping (ECEPM) method to improve modal sorting and assess mode importance. The C-POD approach is evaluated using the one-dimensional Burgers’ equation and a two-dimensional cylinder wake flow. Results show that C-POD achieves higher accuracy in dimensionality reduction than POD. Its dominant modes capture more temporal dynamics, while higher-order modes offer better physical interpretability. When solving an inverse problem using sparse sensor data, the Gappy C-POD method improves reconstruction accuracy by 19.75% and raises the lower bound of reconstruction capability by 13.4% compared to Gappy POD. Overall, C-POD demonstrates strong potential for modeling and reconstructing complex nonlinear flow fields, providing a valuable dimensionality reduction tool for fluid dynamics.
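The role POD plays in stabilizing the clustering can be illustrated with a hedged sketch (not the paper's C-POD algorithm): compute POD modes of a snapshot matrix via an SVD, project the snapshots onto the leading modes, and cluster in that low-dimensional coefficient space. The snapshot matrix and mode/cluster counts are assumptions.

```python
# Sketch only: POD via SVD, then k-means on the POD coefficients.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
snapshots = rng.normal(size=(200, 1000))   # hypothetical: 200 time steps x 1000 grid points

mean = snapshots.mean(axis=0)
U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
pod_modes = Vt[:5]                         # leading spatial POD modes
coeffs = (snapshots - mean) @ pod_modes.T  # temporal POD coefficients

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(coeffs)
print(np.bincount(labels))                 # snapshots per cluster
```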
Facial Expression Recognition Based on Local Binary Patterns and Kernel Discriminant Isomap
Facial expression recognition is an interesting and challenging subject. Considering the nonlinear manifold structure of facial images, a new kernel-based manifold learning method, called kernel discriminant isometric mapping (KDIsomap), is proposed. KDIsomap aims to nonlinearly extract the discriminant information by maximizing the interclass scatter while minimizing the intraclass scatter in a reproducing kernel Hilbert space. KDIsomap is used to perform nonlinear dimensionality reduction on the extracted local binary patterns (LBP) facial features, and produces low-dimensional discriminant embedded data representations with striking performance improvements on facial expression recognition tasks. The nearest neighbor classifier with the Euclidean metric is used for facial expression classification. Facial expression recognition experiments are performed on two popular facial expression databases, i.e., the JAFFE database and the Cohn-Kanade database. Experimental results indicate that KDIsomap obtains the best accuracy of 81.59% on the JAFFE database, and 94.88% on the Cohn-Kanade database. KDIsomap outperforms the other evaluated methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), kernel principal component analysis (KPCA), kernel linear discriminant analysis (KLDA) and kernel isometric mapping (KIsomap).
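A hedged sketch of the LBP-plus-manifold-learning pipeline is below; KDIsomap itself is not reproduced, so plain Isomap stands in for the kernel discriminant variant, and the image data, label counts, and LBP parameters are assumptions.

```python
# Sketch only: LBP histograms -> non-linear embedding -> 1-NN
# classification with the Euclidean metric.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.manifold import Isomap
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(6)
images = rng.integers(0, 256, size=(120, 64, 64), dtype=np.uint8)  # hypothetical faces
labels = rng.integers(0, 6, size=120)                              # hypothetical expression labels

def lbp_hist(img, P=8, R=1.0):
    lbp = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

features = np.array([lbp_hist(im) for im in images])
embedded = Isomap(n_neighbors=10, n_components=5).fit_transform(features)

clf = KNeighborsClassifier(n_neighbors=1, metric="euclidean")
clf.fit(embedded[:100], labels[:100])
print(clf.score(embedded[100:], labels[100:]))   # toy accuracy on held-out rows
```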
Ultralow‐Dimensionality Reduction for Identifying Critical Transitions by Spatial‐Temporal PCA
Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems, which demand interpretable data representations to facilitate comprehension of both spatial and temporal information within the original data space. This study proposes a general and analytical ultralow-dimensionality reduction method for dynamical systems, named spatial-temporal principal component analysis (stPCA), to fully represent the dynamics of a high-dimensional time-series by only a single latent variable without distortion, which transforms high-dimensional spatial information into one-dimensional temporal information based on nonlinear delay-embedding theory. The dynamics of this single variable is analytically solved and theoretically preserves the temporal properties of the original high-dimensional time-series, thereby accurately and reliably identifying the tipping point before an upcoming critical transition. Its applications to real-world datasets such as individual-specific heterogeneous ICU records demonstrate the effectiveness of stPCA, which quantitatively and robustly provides early-warning signals of the critical/tipping state for each patient. The proposed spatial-temporal principal component analysis (stPCA) method analytically reduces high-dimensional time-series data to a single latent variable by transforming spatial information into temporal dynamics. By preserving the temporal properties of the original data, stPCA effectively identifies critical transitions and tipping points. It provides robust early-warning signals, demonstrating effectiveness on both simulated and real-world datasets.
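The delay-embedding intuition behind stPCA can be sketched in a few lines (this is a stand-in illustration, not the authors' analytical method); the window length and synthetic signal are assumptions.

```python
# Sketch only: stack delayed copies of a scalar signal into a
# Hankel-like matrix and reduce it to a single latent variable.
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from sklearn.decomposition import PCA

t = np.linspace(0, 20, 2000)
series = np.sin(t) + 0.1 * np.random.default_rng(7).normal(size=t.size)

L = 50                                  # hypothetical delay-embedding window
H = sliding_window_view(series, L)      # (2000 - L + 1, L) delay-embedding matrix

latent = PCA(n_components=1).fit_transform(H).ravel()
print(latent.shape)                     # one latent variable tracking the dynamics
```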
Stochastic Neighbor Embedding Feature-Based Hyperspectral Image Classification Using 3D Convolutional Neural Network
The ample amount of information in hyperspectral image (HSI) bands allows the non-destructive detection and recognition of earth objects. However, dimensionality reduction (DR) of HSI data is required before classification, as the classifier may otherwise suffer from the curse of dimensionality. Dimensionality reduction therefore plays a significant role in HSI data analysis (e.g., effective processing and seamless interpretation). In this article, t-Distributed Stochastic Neighbor Embedding (tSNE) was implemented for dimensionality reduction, together with a blended CNN, to improve the visualization and characterization of HSI. In the procedure, we first employed principal component analysis (PCA) to reduce the HSI dimensions and remove non-linear consistency features between the wavelengths, projecting them to a smaller scale. We then applied tSNE to preserve the local and global pixel relationships and to examine the HSI information visually and experimentally. This yielded two-dimensional data, improving the visualization and classification accuracy compared to other standard dimensionality-reduction algorithms. Finally, we employed a deep-learning-based CNN to classify the reduced HSI intra- and inter-band relationship feature vector. An evaluation accuracy of 95.21% and a test loss of 6.2% demonstrated the superiority of the proposed model compared to other state-of-the-art DR algorithms.
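The PCA-then-tSNE preprocessing described above can be sketched as follows (the blended CNN classifier is omitted, and the data cube, band count, and component numbers are assumptions):

```python
# Hedged sketch: PCA to compress the spectral axis, then t-SNE to
# embed the pixels in 2-D for visualisation and downstream use.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(8)
cube = rng.normal(size=(50, 50, 200))       # hypothetical HSI: 50x50 pixels, 200 bands

pixels = cube.reshape(-1, cube.shape[-1])   # (2500, 200) pixel spectra
reduced = PCA(n_components=30).fit_transform(pixels)
embedded = TSNE(n_components=2, random_state=0).fit_transform(reduced)
print(embedded.shape)                       # 2-D features per pixel
```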