Catalogue Search | MBRL
Explore the vast range of titles available.
24,026 result(s) for "Data alignment"
DA_2DCHROM — a data alignment tool for applications on real GC × GC–TOF samples
by
Urban, Štěpán
,
Ladislavová, Nikola
,
Pojmanová, Petra
in
Algorithms
,
Alignment
,
Analytical Chemistry
2023
Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC–MS) has great potential for analyses of complicated mixtures and sample matrices, owing to its separation power and potentially high resolution. The mass spectra, the second component of the measurement results, are reproducible. However, the reproducibility of two-dimensional chromatography is affected by many factors, which complicates the evaluation of long-term experiments and cross-laboratory collaborations. This paper presents a new open-source data alignment tool that tackles the problem of retention time shifts, with five algorithms implemented: BiPACE 2D, DISCO, MSort, PAM, and TNT-DA, along with Pearson’s correlation and dot product as optional methods for mass spectra comparison. The implemented data alignment algorithms and their variations were tested on real samples to demonstrate the functionality of the presented tool, and the suitability of each algorithm for significantly and non-significantly shifted data is discussed on the basis of the results obtained. To evaluate the “goodness” of the alignment, Kolmogorov–Smirnov test values were calculated and comparison graphs were generated. DA_2DChrom is fully open-sourced and available online with its documentation, and users can run the tool without uploading their data to external third-party servers.
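The Kolmogorov–Smirnov evaluation mentioned in the abstract can be illustrated with a minimal sketch (this is not DA_2DChrom's own code, and the retention times are made up): the KS statistic between the peak retention times of a reference run and another run shrinks as alignment improves.

```python
# A minimal sketch of evaluating alignment "goodness" with a two-sample
# Kolmogorov–Smirnov statistic over peak retention times.
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum distance between the empirical CDFs."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)

# Hypothetical retention times (minutes) of the same five peaks:
reference = [10.0, 20.0, 30.0, 40.0, 50.0]   # reference run
shifted   = [25.0, 35.0, 45.0, 55.0, 65.0]   # strongly shifted run
aligned   = [10.1, 20.0, 29.9, 40.2, 50.0]   # after a hypothetical alignment step

print(ks_statistic(reference, shifted))   # large distance: poor alignment
print(ks_statistic(reference, aligned))   # small distance: good alignment
```

A lower statistic indicates the two runs' retention-time distributions agree more closely, which is the sense in which the paper uses KS values as an alignment quality score.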
Journal Article
Survey of Datafusion Techniques for Laser and Vision Based Sensor Integration for Autonomous Navigation
2020
This paper focuses on data fusion, which is fundamental to one of the most important modules in any autonomous system: perception. Over the past decade, there has been a surge in the usage of smart/autonomous mobility systems. Such systems can be used in many areas of life, such as safe mobility for the disabled and for senior citizens, and depend on accurate sensor information in order to function optimally. This information may come from a single sensor or from a suite of sensors with the same or different modalities. We review various types of sensors, their data, and the need to fuse the data to produce the best data for the task at hand, which in this case is autonomous navigation. To obtain such accurate data, we need optimal technology to read the sensor data, process the data, eliminate or at least reduce the noise, and then use the data for the required tasks. We present a survey of current data processing techniques that implement data fusion using different sensors: LiDAR, which uses light-scan technology, and stereo/depth, monocular Red Green Blue (RGB), and Time-of-Flight (TOF) cameras, which use optical technology. We also review the efficiency of using fused data from multiple sensors, rather than a single sensor, in autonomous navigation tasks such as mapping, obstacle detection and avoidance, and localization. This survey provides sensor information to researchers who intend to accomplish the task of robot motion control and details the use of LiDAR and cameras for robot navigation.
Journal Article
A Comparative Review of Manifold Learning Techniques for Hyperspectral and Polarimetric SAR Image Fusion
2019
In remote sensing, hyperspectral and polarimetric synthetic aperture radar (PolSAR) images are the two most versatile data sources for a wide range of applications, such as land use land cover classification. However, the fusion of these two data sources receives less attention than many others because of scarce data availability and a relatively challenging fusion task caused by their distinct imaging geometries. Among the existing fusion methods, including manifold learning-based, kernel-based, ensemble-based, and matrix factorization-based approaches, manifold learning is one of the most celebrated techniques for the fusion of heterogeneous data. Therefore, this paper aims to promote research in hyperspectral and PolSAR data fusion by providing a comprehensive comparison of existing manifold learning-based fusion algorithms. We conducted experiments on 16 state-of-the-art manifold learning algorithms that address two important research questions in manifold learning-based fusion of hyperspectral and PolSAR data: (1) in which domain should the data be aligned, the data domain or the manifold domain; and (2) how to make use of existing labeled data when formulating a graph to represent a manifold: supervised, semi-supervised, or unsupervised. The performance of the algorithms was evaluated via multiple accuracy metrics of land use land cover classification over two data sets. Results show that the algorithms based on manifold alignment generally outperform those based on data alignment (data concatenation), and semi-supervised manifold alignment fusion algorithms perform best of all. Experiments using multiple classifiers show that they outperform the benchmark data alignment-based algorithms by ca. 3% in terms of overall classification accuracy.
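The "data alignment" baseline that the review's manifold-alignment methods are compared against is, as the abstract notes, simple feature concatenation of co-registered pixels. A toy sketch (feature values are illustrative, not from the paper's data sets):

```python
# Per-pixel features from the two co-registered sensors (values are made up):
hyperspectral = [[0.21, 0.43, 0.10], [0.80, 0.11, 0.32]]  # spectral bands
polsar        = [[1.5, 0.2], [0.3, 0.9]]                  # polarimetric features

# Data-domain alignment = concatenating each pixel's features into one stacked
# vector; manifold-domain alignment would instead embed both sources into a
# shared low-dimensional space before fusing.
fused = [h + p for h, p in zip(hyperspectral, polsar)]
print(fused[0])  # [0.21, 0.43, 0.1, 1.5, 0.2]
```

The reviewed results suggest this concatenation baseline is the one that manifold-alignment methods beat by roughly 3% overall accuracy.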
Journal Article
A Robot-Driven 3D Shape Measurement System for Automatic Quality Inspection of Thermal Objects on a Forging Production Line
2018
The three-dimensional (3D) geometric evaluation of large thermal forging parts online is critical to quality control and energy conservation. However, this online 3D measurement task is extremely challenging for commercially available 3D sensors because of the enormous amount of heat radiation and complexity of the online environment. To this end, an automatic and accurate 3D shape measurement system integrated with a fringe projection-based 3D scanner and an industrial robot is presented. To resist thermal radiation, a double filter set and an intelligent temperature control loop are employed in the system. In addition, a time-division-multiplexing trigger is implemented in the system to accelerate pattern projection and capture, and an improved multi-frequency phase-shifting method is proposed to reduce the number of patterns required for 3D reconstruction. Thus, the 3D measurement efficiency is drastically improved and the exposure to the thermal environment is reduced. To perform data alignment in a complex online environment, a view integration method is used in the system to align non-overlapping 3D data from different views based on the repeatability of the robot motion. Meanwhile, a robust 3D registration algorithm is used to align 3D data accurately in the presence of irrelevant background data. These components and algorithms were evaluated by experiments. The system was deployed in a forging factory on a production line and performed a stable online 3D quality inspection for thermal axles.
Journal Article
Artificial Neural Network Language Models Predict Human Brain Responses to Language Even After a Developmentally Realistic Amount of Training
2024
Artificial neural networks have emerged as computationally plausible models of human language processing. A major criticism of these models is that the amount of training data they receive far exceeds that of humans during language learning. Here, we use two complementary approaches to ask how the models’ ability to capture human fMRI responses to sentences is affected by the amount of training data. First, we evaluate GPT-2 models trained on 1 million, 10 million, 100 million, or 1 billion words against an fMRI benchmark. We consider the 100-million-word model to be developmentally plausible in terms of the amount of training data given that this amount is similar to what children are estimated to be exposed to during the first 10 years of life. Second, we test the performance of a GPT-2 model trained on a 9-billion-token dataset to reach state-of-the-art next-word prediction performance on the human benchmark at different stages during training. Across both approaches, we find that (i) the models trained on a developmentally plausible amount of data already achieve near-maximal performance in capturing fMRI responses to sentences. Further, (ii) lower perplexity—a measure of next-word prediction performance—is associated with stronger alignment with human data, suggesting that models that have received enough training to achieve sufficiently high next-word prediction performance also acquire representations of sentences that are predictive of human fMRI responses. In tandem, these findings establish that although some training is necessary for the models’ predictive ability, a developmentally realistic amount of training (∼100 million words) may suffice.
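Perplexity, the next-word prediction measure the abstract links to brain alignment, follows the standard definition of exponentiated mean negative log-likelihood. A toy calculation (this is the textbook formula, not the paper's evaluation code; the probabilities are invented):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the observed tokens)."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical probabilities a model assigns to the actual next words:
confident = [0.5, 0.4, 0.6, 0.3]
uncertain = [0.05, 0.10, 0.02, 0.08]
print(perplexity(confident))   # low perplexity: strong next-word prediction
print(perplexity(uncertain))   # high perplexity: weak next-word prediction
```

The paper's finding is that lower perplexity in this sense tends to accompany stronger alignment with human fMRI responses.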
Journal Article
CMOT: Cross-Modality Optimal Transport for multimodal inference
by
Wang, Daifeng
,
Alatkar, Sayali Anil
in
Animal Genetics and Genomics
,
Bioinformatics
,
Biomedical and Life Sciences
2023
Multimodal measurements from single-cell sequencing technologies facilitate a comprehensive understanding of specific cellular and molecular mechanisms. However, simultaneous profiling of multiple modalities of single cells is challenging, and data integration remains elusive due to missing modalities and cell–cell correspondences. To address this, we developed a computational approach, Cross-Modality Optimal Transport (CMOT), which aligns cells within available multi-modal data (source) onto a common latent space and infers missing modalities for cells from another modality (target) of mapped source cells. CMOT outperforms existing methods in various applications, from the developing brain and cancers to immunology, and provides biological interpretations that improve cell-type and cancer classifications.
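The core optimal-transport step behind an approach like CMOT can be sketched with a minimal entropic-regularized Sinkhorn solver (this is generic OT, not the authors' implementation; the "latent coordinates" are invented): cells from two modalities are coupled by a transport plan that concentrates mass on low-cost pairs.

```python
import math

def sinkhorn(cost, reg=0.1, iters=200):
    """Minimal entropic-regularized optimal transport between two uniform
    distributions; returns the coupling (transport plan) matrix."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    a, b = [1.0 / n] * n, [1.0 / m] * m
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy "cells" in two modalities, already embedded in a shared 1-D latent space;
# cost = squared distance between candidate source/target pairs.
source = [0.0, 1.0, 2.0]   # e.g. cells measured in modality A
target = [0.1, 0.9, 2.1]   # e.g. cells measured in modality B
cost = [[(s - t) ** 2 for t in target] for s in source]
plan = sinkhorn(cost)

# Each source cell should place most of its mass on its true counterpart:
matches = [max(range(len(target)), key=lambda j: row[j]) for row in plan]
print(matches)  # [0, 1, 2]
```

Once such a coupling is found, a method in this family can transfer (infer) the missing modality of a target cell from the source cells it is coupled to.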
Journal Article
Eris: efficiently measuring discord in multidimensional sources
2024
Data integration is a classical problem in databases, typically decomposed into schema matching, entity matching and data fusion. To solve the latter, it is mostly assumed that ground truth can be determined. However, in general, the data gathering processes in the different sources are imperfect and cannot provide an accurate merging of values. Thus, in the absence of ways to determine ground truth, it is important to at least quantify how far from being internally consistent a dataset is. Hence, we propose definitions of concordant data and define a discordance metric as a way of measuring disagreement to improve decision-making based on trustworthiness. We define the discord measurement problem of numerical attributes, in which, given a set of uncertain raw observations or aggregate results (such as case/hospitalization/death data relevant to COVID-19) and information on the alignment of different conceptualizations of the same reality (e.g., granularities or units), we wish to assess whether the different sources are concordant or, if not, use the discordance metric to quantify how discordant they are. We also define a set of algebraic operators to describe the alignments of different data sources with correctness guarantees, together with two alternative relational database implementations that reduce the problem to linear or quadratic programming. These are evaluated against both COVID-19 and synthetic data, and our experimental results show that discordance measurement can be performed efficiently in realistic situations.
Journal Article
A Reproducible FPGA–ADC Synchronization Architecture for High-Speed Data Acquisition
2026
High-speed data acquisition systems based on field-programmable gate arrays (FPGAs) often face synchronization challenges when interfacing with commercial analog-to-digital converters (ADCs), particularly under constrained hardware routing conditions and vendor-specific clocking assumptions. This work presents a vendor-independent FPGA–ADC synchronization architecture that enables reliable and repeatable high-speed data acquisition without relying on clock-capable input resources. Clock and frame signals are internally reconstructed and phase-aligned within the FPGA using mixed-mode clock management (MMCM) and input serializer/deserializer (ISERDES) resources, enabling time-sequential phase observation without the need for parallel snapshot or delay-line structures. Rather than targeting absolute metrological limits, the proposed approach emphasizes a reproducible and transparent data acquisition methodology applicable across heterogeneous FPGA–ADC platforms, in which clock synchronization is treated as a system-level design parameter affecting digital interface timing integrity and data reproducibility. Experimental validation using a custom Kintex-7 (XC7K325T) FPGA and an AFE7225 ADC demonstrates stable synchronization at sampling rates of up to 125 MS/s, with frequency-offset tolerance determined by the phase-tracking capability of the internal MMCM-based alignment loop. Consistent signal acquisition is achieved over the 100 kHz–20 MHz frequency range. The measured interface-level timing uncertainty remains below 10 ps RMS, confirming robust clock and frame alignment. Meanwhile, the observed signal-to-noise ratio (SNR) performance, exceeding 80 dB, reflects the phase-noise-limited measurement quality of the system. The proposed architecture provides a cost-effective, scalable, and reproducible solution for experimental and research-oriented FPGA-based data acquisition systems operating under practical hardware constraints.
Journal Article
Improving Geological Remote Sensing Interpretation via Optimal Transport-Based Point–Surface Data Fusion
2024
High-quality geological remote sensing interpretation (GRSI) products play a vital role in a wide range of fields, including the military, meteorology, agriculture, the environment, mapping, etc. Due to the importance of GRSI products, this research aimed to improve their accuracy. Although deep-learning (DL)-based GRSI has reduced dependence on manual interpretation, the limited accuracy of multiple geological element interpretation still poses a challenge. This issue can be attributed to small inter-class differences, the uneven distribution of geological elements, sensor limitations, and the complexity of the environment. Therefore, this paper proposes a point–surface data optimal fusion method (PSDOF) to improve the accuracy of GRSI products based on optimal transport (OT) theory. PSDOF combines geological survey data (which has spatial location and geological element information called point data) with a geological remote sensing DL interpretation product (which has limited accuracy and is called surface data) to improve the quality of the resulting output. The method performs several steps to enhance accuracy. First, it calculates the gray-scale correlation feature information for the pixels adjacent to the geological survey points. Next, it determines the distribution of the feature information for geological elements in the vicinity of the point data. Finally, it incorporates complementary information from the survey points into the geological elements’ interpretation boundary, as well as calculates the optimal energy loss for point–surface fusion, thus resulting in an optimal boundary. The experiments conducted in this study demonstrated the superiority of the proposed model in addressing the problem of the limited accuracy of GRSI products.
Journal Article
A flexible statistical model for alignment of label-free proteomics data - incorporating ion mobility and product ion information
by
Soderblom, Erik J
,
Geromanos, Scott J
,
Moseley, M Arthur
in
Algorithms
,
Bioinformatics
,
Biological samples
2013
Background
The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof across physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method that addresses one of the essential steps in proteomics data processing: the matching of peptide measurements across samples.
Results
We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates.
Conclusions
Based on the results presented here, we argue that incorporating ion mobility drift times and product ion information is a worthy pursuit. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods.
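The benefit of adding the drift-time dimension to cross-run matching can be sketched with a toy matcher (not the paper's statistical model; the features and tolerances are invented): two features that agree in m/z and retention time are disambiguated by their ion mobility drift times.

```python
def match_features(run_a, run_b, mz_tol=0.01, rt_tol=0.5, drift_tol=1.0):
    """Greedy cross-run matching of (m/z, retention time, drift time) features.
    The drift-time tolerance separates features that collide in m/z and RT."""
    matches, used = [], set()
    for i, (mz_a, rt_a, dt_a) in enumerate(run_a):
        for j, (mz_b, rt_b, dt_b) in enumerate(run_b):
            if j in used:
                continue
            if (abs(mz_a - mz_b) <= mz_tol and abs(rt_a - rt_b) <= rt_tol
                    and abs(dt_a - dt_b) <= drift_tol):
                matches.append((i, j))
                used.add(j)
                break
    return matches

# Two features share m/z and retention time but differ in drift time:
run_a = [(500.25, 30.0, 12.0), (500.25, 30.1, 25.0)]
run_b = [(500.25, 30.2, 25.1), (500.25, 30.05, 12.2)]
print(match_features(run_a, run_b))  # [(0, 1), (1, 0)]
```

Without the drift-time check, both features in `run_a` would match the first compatible feature in `run_b`, producing exactly the kind of ambiguous match the paper's extra dimensions are meant to resolve.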
Journal Article