Catalogue Search | MBRL

Simplifying the representation of complex free-energy landscapes using sketch-map

by Tribello, Gareth A , Ceriotti, Michele , Parrinello, Michele in Algorithms , Dimensionality , Dimensionality reduction

2011

A new scheme, sketch-map, for obtaining a low-dimensional representation of the region of phase space explored during an enhanced dynamics simulation is proposed. We show evidence, from an examination of the distribution of pairwise distances between frames, that some features of the free-energy surface are inherently high-dimensional. This makes dimensionality reduction problematic because the data does not satisfy the assumptions made in conventional manifold learning algorithms. We therefore propose that when dimensionality reduction is performed on trajectory data one should think of the resultant embedding as a quickly sketched set of directions rather than a road map. In other words, the embedding tells one about the connectivity between states but does not provide the vectors that correspond to the slow degrees of freedom. This realization informs the development of sketch-map, which endeavors to reproduce the proximity information from the high-dimensionality description in a space of lower dimensionality even when a faithful embedding is not possible.

Journal Article

Non-linear dimensionality reduction on extracellular waveforms reveals cell type diversity in premotor cortex

by Lee, Eric Kenji , Shenoy, Krishna V , Anakwe, Stephanie Udochukwu in Animals , Behavior , cell types

2021

Cortical circuits are thought to contain a large number of cell types that coordinate to produce behavior. Current in vivo methods rely on clustering of specified features of extracellular waveforms to identify putative cell types, but these capture only a small amount of variation. Here, we develop a new method (WaveMAP) that combines non-linear dimensionality reduction with graph clustering to identify putative cell types. We apply WaveMAP to extracellular waveforms recorded from dorsal premotor cortex of macaque monkeys performing a decision-making task. Using WaveMAP, we robustly establish eight waveform clusters and show that these clusters recapitulate previously identified narrow- and broad-spiking types while revealing previously unknown diversity within these subtypes. The eight clusters exhibited distinct laminar distributions, characteristic firing rate patterns, and decision-related dynamics. Such insights were weaker when using feature-based approaches. WaveMAP therefore provides a more nuanced understanding of the dynamics of cell types in cortical circuits.

Journal Article

A tractable latent variable model for nonlinear dimensionality reduction

by Saul, Lawrence K. in Applied Mathematics , Embedding , Granulation

2020

We propose a latent variable model to discover faithful low-dimensional representations of high-dimensional data. The model computes a low-dimensional embedding that aims to preserve neighborhood relationships encoded by a sparse graph. The model both leverages and extends current leading approaches to this problem. Like t-distributed Stochastic Neighborhood Embedding, the model can produce two- and three-dimensional embeddings for visualization, but it can also learn higher-dimensional embeddings for other uses. Like LargeVis and Uniform Manifold Approximation and Projection, the model produces embeddings by balancing two goals—pulling nearby examples closer together and pushing distant examples further apart. Unlike these approaches, however, the latent variables in our model provide additional structure that can be exploited for learning. We derive an Expectation–Maximization procedure with closed-form updates that monotonically improve the model’s likelihood: In this procedure, embeddings are iteratively adapted by solving sparse, diagonally dominant systems of linear equations that arise from a discrete graph Laplacian. For large problems, we also develop an approximate coarse-graining procedure that avoids the need for negative sampling of nonadjacent nodes in the graph. We demonstrate the model’s effectiveness on datasets of images and text.

Journal Article

Machine-learning certification of multipartite entanglement for noisy quantum hardware

by Seo, Seungchan , Bae, Joonwoo , Fuchs, Andreas J C in Certification , entanglement certification , Hardware

2025

Entanglement is a fundamental aspect of quantum physics, both conceptually and for its many applications. Classifying an arbitrary multipartite state as entangled or separable, a task referred to as the separability problem, poses a significant challenge, since a state can be entangled with respect to many of its different partitions. We develop a certification pipeline that feeds the statistics of random local measurements into a non-linear dimensionality reduction algorithm to determine with respect to which partitions a given quantum state is entangled. After training a model on randomly generated quantum states, entangled in different partitions and of varying purity, we verify the accuracy of its predictions on simulated test data, and finally apply it to states prepared on IBM quantum computing hardware.

Journal Article

Augmented Human Intelligence and Automated Diagnosis in Flow Cytometry for Hematologic Malignancies

by Zuromski, Lauren M , Ng, David P in Analysis , Automation , B-Lymphocytes - pathology

2021

Objectives: Clinical flow cytometry is laborious, time-consuming, and expensive given the need for data review by highly trained personnel such as technologists and pathologists as well as the significant number of normal cases. Given these issues, automation in analysis and diagnosis holds the key to major efficiency gains. The objective was to design an automated pipeline for the diagnosis of B-cell malignancies in flow cytometry and evaluate its performance against our standard clinical diagnostic flow cytometry process. Methods: Using 3,417 cases of peripheral blood data over 6 months from our 10-color B-cell screening tube, we used a newly described method for feature extraction and dimensionality reduction called UMAP on the raw flow cytometry data, followed by random forest classification, to classify cases without gating on specific populations. Results: Our automated classifier was able to achieve greater than 95% accuracy in diagnosing all B-cell malignancies, and even better performance for specific malignancies for which the panel was designed, such as chronic lymphocytic leukemia. By adjusting classifier cutoffs, 100% sensitivity could be achieved, albeit with a low 14% specificity. Hypothetically, this would allow 11% of the cases to be autoverified without human intervention. Conclusions: These results suggest that a clinical implementation of this pipeline can greatly assist in quality control, improve turnaround time, and decrease staff workloads.

Journal Article

Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment

by Zha, Hongyuan , Zhang, Zhenyue in Algorithms , Error analysis , Geometry

2004

We present a new algorithm for manifold learning and nonlinear dimensionality reduction. Based on a set of unorganized data points sampled with noise from a parameterized manifold, the local geometry of the manifold is learned by constructing an approximation for the tangent space at each data point, and those tangent spaces are then aligned to give the global coordinates of the data points with respect to the underlying manifold. We also present an error analysis of our algorithm showing that reconstruction errors can be quite small in some cases. We illustrate our algorithm using curves and surfaces both in two-dimensional/three-dimensional (2D/3D) Euclidean spaces and in higher-dimensional Euclidean spaces. We also address several theoretical and algorithmic issues for further research and improvements.

Journal Article

Comparison of manifold learning algorithms for identifying geochemical anomalies associated with copper mineralization

by Zou, Siyi , Zhu, Haotian , Min, Yuwen in 704/2151 , 704/445/209 , Algorithms

2025

The Baiyin district, situated within the northern Qilian orogenic belt, hosts the largest concentration of copper mineral resources in Gansu Province, Northwestern China. Geochemical anomaly patterns are crucial indicators for mineral exploration in this region; however, they are frequently concealed within complex high-dimensional geochemical datasets. Moreover, the scarcity of labeled samples often restricts the effectiveness of supervised machine learning methods for accurate geochemical pattern recognition. This study utilizes unsupervised manifold learning algorithms, including Uniform Manifold Approximation and Projection (UMAP), t-Distributed Stochastic Neighbor Embedding (t-SNE), Isometric Mapping (Isomap), and Locally Linear Embedding (LLE) for identifying low-dimensional features closely associated with mineralization from high-dimensional geochemical datasets. The manifold learning algorithms were optimized by adjusting their key parameters through Receiver Operating Characteristic (ROC) test analysis to achieve optimal performance. The analytical results demonstrate that: (1) manifold learning algorithms exhibited superior performance over conventional factor analysis in accurately capturing complex nonlinear geochemical patterns; (2) the ROC curve and Area Under the Curve (AUC) values for the manifold learning algorithms were UMAP (0.711), t-SNE (0.693), Isomap (0.691), and LLE (0.652), indicating that the UMAP algorithm is the most suitable for identifying geochemical anomaly patterns in the study area; the prediction-area (P-A) analysis further confirmed the UMAP-derived anomalies with a relatively higher prediction efficiency; (3) manifold learning-driven high-probability zones exhibit significant spatial correlations with known mineral deposits, fault structures, and ore-bearing volcanic rock formations. These results highlight the superior capability of manifold learning techniques in extracting meaningful non-linear geochemical anomalies for further exploration of mineral resources.

Journal Article

Research on Intelligent Recognition Model of English Translation Based on Nonlinear Dimensionality Reduction IC Features

by You, Shanshan in 97C50 , English translation , IC features

2024

This paper first aligns English translations with texts in other languages to build an English translation corpus, then trains the model on the pending translations to obtain the final target model, then enumerates all possible combinations of source-language and target-language phrases and filters out unsatisfactory phrase translation pairs to achieve phrase extraction. Nonlinear dimensionality reduction is then applied to the translation model to reduce the complexity of the computation. Finally, the dimensionality reduction effect on the data and the translation performance of the model are analyzed. The results show that the cumulative contribution rate of the t-SNE algorithm is over 95%, which can guarantee no loss of translation information. The translation accuracy of this paper’s algorithm on each language block is roughly 85%-90%, the recall rate is above 85%, and the F-value is above 82%. This indicates that the method in this paper is well suited to the requirements of intelligent recognition of English translation.

Journal Article

Crowd‐sourced plant occurrence data provide a reliable description of macroecological gradients

by Mäder, Patrick , Rzanny, Michael , Seeland, Marco in Algorithms , Applications programs , automated species identification

2021

Deep learning algorithms classify plant species with high accuracy, and smartphone applications leverage this technology to enable users to identify plant species in the field. The question we address here is whether such crowd‐sourced data contain substantial macroecological information. In particular, we aim to understand if we can detect known environmental gradients shaping plant co‐occurrences. In this study we analysed 1 million data points collected through the use of the mobile app Flora Incognita between 2018 and 2019 in Germany and compared them with Florkart, containing plant occurrence data collected by more than 5000 floristic experts over a 70‐year period. The direct comparison of the two data sets reveals that the crowd‐sourced data particularly undersample areas of low population density. However, using nonlinear dimensionality reduction we were able to uncover macroecological patterns in both data sets that correspond well to each other. Mean annual temperature, temperature seasonality and wind dynamics as well as soil water content and soil texture represent the most important gradients shaping species composition in both data collections. Our analysis describes one way of how automated species identification could soon enable near real‐time monitoring of macroecological patterns and their changes, but also discusses biases that must be carefully considered before crowd‐sourced biodiversity data can effectively guide conservation measures.

Journal Article

Nonlinear dimensionality reduction method of scheduling frequent information in wireless networks based on multilevel mapping

by Woźniak, Marcin , Sun, Jian-zhao , Yang, Kun in Algorithms , Bit error rate , Deletion

2023

To improve the accuracy of redundant data deletion and reduce the bit error rate of wireless network scheduling information, make the reduced-dimension information uniformly distributed on the manifold, and eliminate noise holes, this study proposes a nonlinear dimensionality reduction method for frequent wireless network scheduling information based on multi-level mapping. A Gaussian mixture model is used to analyze the characteristics of redundant frequent network scheduling information, and the feature coding is quantized and decomposed. On this basis, the fractional Fourier transform is used to delete the redundant data. Multi-level mapping theory is then introduced, and the isometric mapping algorithm is used to perform nonlinear dimensionality reduction on the frequent wireless network scheduling information that remains after redundant data are deleted. To address the degraded performance of nonlinear dimensionality reduction on sparse data, the local linear embedding algorithm is used for a secondary dimensionality reduction. Experimental results show that the proposed method can effectively delete redundant data under both dense and sparse data conditions, and the error rate of redundant data deletion is less than 1.5%. Moreover, damage to the integrity of the data distribution on the manifold in the low-dimensional embedded space can be avoided.

Journal Article
