Catalogue Search | MBRL

3-Partition Order-Preserving Pattern Matching

by Kang, Seokchul , Sim, Jeong Seop in Algorithms , approximate order-preserving pattern matching , Data loss

2026

Two strings of equal length are called order-isomorphic if their relative orders are identical at every position. The classical order-preserving pattern matching (OPPM) problem finds all substrings in a text T that are order-isomorphic to a pattern P. However, measurement errors can cause data loss or inaccuracies, making exact pattern detection difficult and motivating the active study of approximate OPPM variants, such as 2-partition OPPM. In this paper, we extend the existing partition-based relaxation of order-isomorphism and define the 3-partition OPPM problem. The 3-partition OPPM problem is to find all substrings in a text T that can be divided into three segments such that each partitioned segment is order-isomorphic to the corresponding segment of a pattern P. We propose an efficient algorithm to solve the problem in O(nm + m^2 log m) time, where n = |T| and m = |P|. We conduct experiments on various time series datasets, comparing the number of occurrences and the runtime efficiency among the OPPM, 2-partition OPPM, and proposed 3-partition OPPM algorithms. Our experimental evaluation shows that the proposed algorithm becomes increasingly cost-effective for longer patterns.

Journal Article
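
The definition of order-isomorphism in the abstract above can be made concrete with a brute-force sketch. This is not the algorithm proposed in the paper; it only illustrates the exact OPPM problem. The function names `sign`, `order_isomorphic`, and `naive_oppm` are hypothetical, and the check runs in O(m^2) time per window.

```python
from typing import List, Sequence


def sign(a, b) -> int:
    """Return -1, 0, or 1 according to whether a < b, a == b, or a > b."""
    return (a > b) - (a < b)


def order_isomorphic(x: Sequence, y: Sequence) -> bool:
    """True if x and y have equal length and every pair of positions
    compares the same way in both sequences."""
    if len(x) != len(y):
        return False
    n = len(x)
    return all(
        sign(x[i], x[j]) == sign(y[i], y[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def naive_oppm(text: Sequence, pattern: Sequence) -> List[int]:
    """Return every start index whose length-m window in text is
    order-isomorphic to pattern (brute force, for illustration only)."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if order_isomorphic(text[i:i + m], pattern)
    ]


# The pattern shape "low, high, middle" occurs at positions 0 and 2.
print(naive_oppm([5, 8, 6, 9, 7, 2], [1, 3, 2]))  # [0, 2]
```

The windows [5, 8, 6] and [6, 9, 7] both rise and then fall back to a middle value, so they match the pattern [1, 3, 2] even though the absolute values differ.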

Order-Preserving Pattern Matching with Partition

by Kang, Seokchul , Kim, Youngjoon , Sim, Jeong Seop in Algorithms , approximate order-preserving pattern matching , Approximation

2024

Order-preserving pattern matching, which considers the relative orders of strings, can be applied to time-series data analysis. To perform a more meaningful analysis of time-series data, approximate criteria for the order-isomorphism are necessary, considering diverse types of errors. In this paper, we introduce a novel approximation criterion for the order-isomorphism, called the partitioned order-isomorphism. We then propose an efficient O(n+sort(m))-time algorithm for the order-preserving pattern matching problem considering the criterion of partition. A comparative experiment demonstrates that the proposed algorithm is more effective than the exact order-preserving pattern matching algorithm.

Journal Article
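
A minimal sketch of what a partition-based relaxation might look like, assuming (as the 3-partition entry above describes it) that two equal-length strings match when they can be cut at the same positions into segments that are pairwise order-isomorphic. The abstract does not state the exact definition, so the non-empty-segment requirement and the names `order_isomorphic` and `partitioned_order_isomorphic` are assumptions, and this brute force is not the O(n + sort(m)) algorithm of the paper.

```python
from itertools import combinations
from typing import Sequence


def order_isomorphic(x: Sequence, y: Sequence) -> bool:
    """True if every pair of positions compares the same way in x and y."""
    if len(x) != len(y):
        return False
    n = len(x)
    return all(
        ((x[i] > x[j]) - (x[i] < x[j])) == ((y[i] > y[j]) - (y[i] < y[j]))
        for i in range(n)
        for j in range(i + 1, n)
    )


def partitioned_order_isomorphic(x: Sequence, y: Sequence, parts: int = 2) -> bool:
    """Assumed criterion: some choice of parts-1 shared cut positions splits
    x and y into non-empty segments that are pairwise order-isomorphic."""
    if len(x) != len(y):
        return False
    n = len(x)
    # Try every way of placing parts-1 cuts strictly inside the string.
    for cuts in combinations(range(1, n), parts - 1):
        bounds = (0, *cuts, n)
        if all(
            order_isomorphic(x[a:b], y[a:b])
            for a, b in zip(bounds, bounds[1:])
        ):
            return True
    return False


x = [1, 2, 3, 7, 8, 9]
y = [4, 5, 6, 1, 2, 3]
print(order_isomorphic(x, y))                 # False: the second half dropped
print(partitioned_order_isomorphic(x, y, 2))  # True: cut after position 3
```

In the example the two halves each keep their rising shape, but the second half of `y` has shifted below the first, which is the kind of level change an exact order-isomorphism test rejects.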

Multiple String Pattern Matching Algorithm Using Multi-Character Inverted Lists

by Khancome, Chouvalit in Algorithms , Bioinformatics , Complexity

2026

Multiple string matching is a fundamental operation in real-time analytics, cybersecurity, bioinformatics, and large-scale information retrieval. Nevertheless, existing approaches continue to face inherent trade-offs among preprocessing efficiency, verification overhead, and support for dynamic pattern updates, particularly in large and continuously evolving environments. This paper presents MMIVL, a high-performance algorithm founded on the multi-character inverted list (m-CIVL), a unified and inherently dynamic indexing framework for pattern management. By integrating positional information, termination semantics, and pattern associations within a single structure, m-CIVL enables direct matching without requiring a separate verification stage. MMIVL achieves a preprocessing complexity of O(|P|/s), a search complexity of O(|T| + nocc), and an update complexity of O(|p|/s), where s denotes the segment length. Extensive experiments on synthetic and real-world datasets demonstrate that MMIVL consistently outperforms representative baselines, with especially strong gains in large-scale scenarios, while maintaining stable performance and favorable memory efficiency. Overall, these results establish m-CIVL as an effective, scalable, and practically viable solution that unifies efficient preprocessing, high-throughput searching, and dynamic update capability for modern multiple string-matching applications.

Journal Article

Snowvision: Segmenting, Identifying, and Discovering Stamped Curve Patterns from Fragments of Pottery

by Zhang, Canyu , Zhou, Jun , McDorman, Sam T in Algorithms , Ceramics , Clustering

2022

In southeastern North America, Indigenous potters and woodworkers carved complex, primarily abstract, designs into wooden pottery paddles, which were subsequently used to thin the walls of hand-built, clay vessels. Original paddle designs carry rich historical and cultural information, but pottery paddles from ancient times have not survived. Archaeologists have studied design fragments stamped on sherds to reconstruct complete or nearly complete designs, which is extremely laborious and time-consuming. In Snowvision, we aim to develop computer vision methods to assist archaeologists to accomplish this goal more efficiently and effectively. For this purpose, we identify and study three computer vision tasks: (1) extracting curve structures stamped on pottery sherds; (2) matching sherds to known designs; (3) clustering sherds with unknown designs. Due to the noisy, highly fragmented, composite-curve patterns, each task poses unique challenges to existing methods. To solve them, we propose (1) a weakly-supervised CNN-based curve structure segmentation method that takes only curve skeleton labels to predict full curve masks; (2) a patch-based curve pattern matching method to address the problem of partial matching in terms of noisy binary images; (3) a curve pattern clustering method consisting of pairwise curve matching, graph partitioning and sherd stitching. We evaluate the proposed methods on a set of collected sherds and extensive experimental results show the effectiveness of the proposed algorithms.

Journal Article

Length-Bounded Hybrid CPU/GPU Pattern Matching Algorithm for Deep Packet Inspection

by Lin, Yi-Shan , Lee, Chun-Liang , Chen, Yaw-Chung in Algorithms , Central processing units , compute unified device architecture

2017

Since frequent communication between applications takes place in high-speed networks, deep packet inspection (DPI) plays an important role in network application awareness. The signature-based network intrusion detection system (NIDS) contains a DPI technique that examines the incoming packet payloads by employing a pattern matching algorithm that dominates the overall inspection performance. Existing studies focused on implementing efficient pattern matching algorithms by parallel programming on software platforms because of the advantages of lower cost and higher scalability. Either the central processing unit (CPU) or the graphics processing unit (GPU) was involved. Our studies focused on designing a pattern matching algorithm based on the cooperation between both CPU and GPU. In this paper, we present an enhanced design for our previous work, a length-bounded hybrid CPU/GPU pattern matching algorithm (LHPMA). In the preliminary experiment, the performance and comparison with the previous work are displayed, and the experimental results show that the LHPMA can achieve not only effective CPU/GPU cooperation but also higher throughput than the previous method.

Journal Article

NetDPO: (delta, gamma)-approximate pattern matching with gap constraints under one-off condition

by Guo, Lei , Liu, Jing , Wu, Youxi in Algorithms , Approximation , Computer science

2022

Approximate pattern matching is not only more general than exact pattern matching but also tolerates some data noise. Most approaches adopt the Hamming distance to measure similarity, which counts the positions at which two sequences differ, but it cannot reflect how close two characters are. This paper addresses approximate pattern matching with a local distance no larger than δ and a global distance no larger than γ, which is named Delta and gamma Pattern matching with gap constraints under One-off condition (DPO). First, we show that the problem is NP-hard. Therefore, we construct a heuristic algorithm named approximate Nettree for DPO (NetDPO), which transforms the problem into an approximate Nettree, a specially designed data structure, based on the δ distance. Then, NetDPO calculates the number of paths that reach the roots within γ distance. To find the maximal occurrences, we employ the rightmost parent strategy and the optimal parent strategy to select the better occurrence, which can minimize the influence after removing the occurrence. This process is repeated until no occurrences remain. Finally, we analyze the time and space complexities of NetDPO. Extensive experimental results verify the superiority of the proposed algorithm.

Journal Article
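
The contrast the abstract draws between the Hamming distance and the (δ, γ) criterion can be sketched as follows. This assumes the usual reading of (δ, γ)-matching on numeric sequences: the local distance is the largest per-position difference and the global distance is the sum of per-position differences. It leaves out the gap constraints, the one-off condition, and the Nettree structure entirely; `hamming` and `delta_gamma_match` are hypothetical names.

```python
from typing import Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def delta_gamma_match(
    a: Sequence[int], b: Sequence[int], delta: int, gamma: int
) -> bool:
    """Assumed criterion: every per-position difference is at most delta
    (local distance) and the differences sum to at most gamma (global)."""
    if len(a) != len(b):
        return False
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return max(diffs, default=0) <= delta and sum(diffs) <= gamma


a, b = [1, 5, 9], [2, 5, 7]
print(hamming(a, b))                  # 2: two positions differ
print(delta_gamma_match(a, b, 2, 3))  # True: max diff 2, total diff 3
print(delta_gamma_match(a, b, 1, 3))  # False: one position differs by 2
```

The Hamming distance only records that two positions differ, while the per-position differences of 1 and 2 show how far apart the values actually are.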

Pattern matching analysis: Overview of its rationale and application in qualitative research

by Vargas-Bianchi, Lizardo in Congruence , Literary criticism , Matching

2025

As qualitative research progresses, a precise understanding of the specific techniques that are still underdeveloped in literature is key. This discursive article provides a comprehensive overview of pattern matching analysis, a method that compares theoretical patterns derived from existing theories with empirical data to assess the agreement between theory and observed phenomena. Given the lack of consensus on the procedure for conducting this technique, this review addresses the gap in the qualitative literature. The significance of pattern matching analysis is emphasized because of its capacity to bridge theory and data, allowing refinement or qualitative testing of the theory. This article reviews and highlights the approaches and procedures used for pattern matching. It proposes meaning overlap as a guiding criterion for discerning the congruence between theoretical and empirical patterns. This study presents pattern matching as a versatile and formal method, emphasizing its potential for testing and refining theories in deductive qualitative research.

Journal Article

Pan-genome de Bruijn graph using the bidirectional FM-index

by Renders, Luca , Abeel, Thomas , Depuydt, Lore in Algorithms , Approximate pattern matching , Bioinformatics

2023

Background: Pan-genome graphs are gaining importance in the field of bioinformatics as data structures to represent and jointly analyze multiple genomes. Compacted de Bruijn graphs are inherently suited for this purpose, as their graph topology naturally reveals similarity and divergence within the pan-genome. Most state-of-the-art pan-genome graphs are represented explicitly in terms of nodes and edges. Recently, an alternative, implicit graph representation was proposed that builds directly upon the unidirectional FM-index. As such, a memory-efficient graph data structure is obtained that inherits the FM-index’ backward search functionality. However, this representation suffers from a number of shortcomings in terms of functionality and algorithmic performance. Results: We present a data structure for a pan-genome, compacted de Bruijn graph that aims to address these shortcomings. It is built on the bidirectional FM-index, extending the ability of its unidirectional counterpart to navigate and search the graph in both directions. All basic graph navigation steps can be performed in constant time. Based on these features, we implement subgraph visualization as well as lossless approximate pattern matching to the graph using search schemes. We demonstrate that we can retrieve all occurrences corresponding to a read within a certain edit distance in a very efficient manner. Through a case study, we show the potential of exploiting the information embedded in the graph’s topology through visualization and sequence alignment. Conclusions: We propose a memory-efficient representation of the pan-genome graph that supports subgraph visualization and lossless approximate pattern matching of reads against the graph using search schemes. The C++ source code of our software, called Nexus, is available at https://github.com/biointec/nexus under AGPL-3.0 license.

Journal Article
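
The backward search that the abstract attributes to the unidirectional FM-index can be illustrated with a toy sketch. This is a textbook-style, uncompressed construction for short strings, not the Nexus implementation and not a bidirectional index; the names `build_fm_index` and `backward_search` are hypothetical, and the text is assumed not to contain the sentinel character `$`.

```python
from typing import Dict, List, Tuple


def build_fm_index(text: str) -> Tuple[List[int], Dict[str, int], Dict[str, List[int]]]:
    """Build a naive suffix array, the C table, and full occurrence counts."""
    text += "$"  # sentinel, assumed smaller than every other character
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def backward_search(pattern: str, sa, C, occ) -> List[int]:
    """Return sorted start positions of pattern, reading it right to left."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


sa, C, occ = build_fm_index("banana")
print(backward_search("ana", sa, C, occ))  # [1, 3]
```

Each step narrows a suffix-array interval by prepending one character, which is why the search can only extend the pattern to the left; a bidirectional index, as described in the abstract, also allows extension to the right.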

A Logical Characterization for Approximate Matching of Pattern Graphs with Regular Expressions

by Zhang, Zuoli , Wang, Jin , Chen, Xuelei in Approximation , Data analysis , Graph matching

2025

Graph simulation and its variants are widely used in graph pattern matching. Related works have added regular expressions to graph patterns, which can discover more meaningful data while still solving the problem in polynomial time. In this research, which is based on Fan’s investigations, we first propose an approximation of graph simulation using the concept of metric and formal verification techniques, and then give the definition of approximate matching between pattern graphs with regular expressions and data graphs, which introduces a symmetric tolerance for errors, bridging exact and approximate matching. Finally, we present a logical characterization of the approximate graph simulation by extending Hennessy–Milner logic.

Journal Article

Maximum radial pattern matching for minimum star map identification

by Li, Qiang , Wei, Honggang , Fu, Jingneng in Algorithms , Binary stars , Error analysis

2024

This paper proposes an all-sky star map identification algorithm that can simultaneously achieve high identification probability, low algorithm complexity, and small databases for star sensors whose photometric and intrinsic parameters are well calibrated. The proposed algorithm includes three main steps. First, a binary radial pattern table is constructed offline. Then, the maximum value matching of the radial pattern is performed between the star spots and the guide stars, and the star pairs (i.e., the minimum star map) after radial pattern matching undergo a coarse matching through angular distance cross-validation. Finally, a reference star map is designed based on the identified star pairs, and the matching of all the star spots in the field of view is realized. Simulation and analysis results show that the database required by the proposed algorithm for 5,000 guide stars is not larger than 200 KB. Also, when false and missing star spots account for 50% of all guide stars and the star spot extraction error is 0.5 pixel (the corresponding pointing error is 26″), the average star map identification time of the proposed algorithm is less than 2 ms, and its identification probability is higher than 98%. The results demonstrate that the proposed algorithm performs better than similar algorithms.

Journal Article
