Catalogue Search | MBRL

Sketchformer++: A Hierarchical Transformer Architecture for Vector Sketch Representation

by Ruan, Banhuai , Xu, Pengfei , Zheng, Youyi in hierarchy , neural representation , sketch recognition

2026

With the rising ubiquity of digital touch devices and sketch-based interfaces, freehand sketching has become an essential mode of visual communication. Nevertheless, interpreting these often ambiguous and sparse sketches poses challenges for computers. This paper presents Sketchformer++, a hierarchical transformer architecture for the neural representation of vector sketches. It treats a vector sketch as a three-level structure comprising the sketch level, stroke level, and segment level. Three self-attention modules are adopted in the network architecture, corresponding to the sketch hierarchy. The semantics of sketches are aggregated from local to global levels, resulting in neural representations of sketches. Extensive experiments show that Sketchformer++ helps to achieve superior performance in various downstream tasks, including sketch reconstruction, sketch recognition, sketch semantic segmentation, and sketch retrieval, demonstrating its robustness and effectiveness as a means of sketch representation. Code is available at https://github.com/BHR7/SketchformerPlus.
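The local-to-global aggregation described in this abstract can be illustrated with a toy example. This is only one plausible reading of the three-level hierarchy, not the authors' architecture: the attention here has no learned weights (identity projections), pooling is a plain mean, and the names `self_attention`, `encode_level`, and `encode_sketch` are hypothetical.

```python
from math import exp, sqrt


def self_attention(vectors):
    """Single-head scaled dot-product self-attention with identity
    projections (assumption: no learned query/key/value weights)."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) / sqrt(dim)
                  for key in vectors]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, vectors)) / total
            for d in range(dim)
        ])
    return outputs


def mean_pool(vectors):
    """Collapse a list of vectors into one vector by averaging."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode_level(vectors):
    """Attend within one level, then pool to a single summary vector."""
    return mean_pool(self_attention(vectors))


def encode_sketch(sketch):
    """sketch: list of strokes; stroke: list of segments;
    segment: list of point vectors (all of equal dimension)."""
    stroke_vectors = []
    for stroke in sketch:
        # Segment level: summarise the points of each segment.
        segment_vectors = [encode_level(points) for points in stroke]
        # Stroke level: summarise the segments of each stroke.
        stroke_vectors.append(encode_level(segment_vectors))
    # Sketch level: summarise all strokes into one embedding.
    return encode_level(stroke_vectors)
```

Calling `encode_sketch` on a nested list of 2-D points returns a single 2-D vector; a real model would replace the identity projections with trained weights at each of the three levels.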

Journal Article

SKCompress: compressing sparse and nonuniform gradient in distributed machine learning

by Yang, Tong , Jiang, Jiawei , Shao, Yingxia in Algorithms , Buckets , Communication

2020

Distributed machine learning (ML) has been extensively studied to meet the explosive growth of training data. A wide range of machine learning models are trained by a family of first-order optimization algorithms, i.e., stochastic gradient descent (SGD). The core operation of SGD is the calculation of gradients. When executing SGD in a distributed environment, the workers need to exchange local gradients through the network. In order to reduce the communication cost, a category of quantification-based compression algorithms is used to transform the gradients to binary format, at the expense of a low precision loss. Although the existing approaches work fine for dense gradients, we find that these methods are ill-suited for many cases where the gradients are sparse and nonuniformly distributed. In this paper, we ask whether there is a compression framework that can efficiently handle sparse and nonuniform gradients. We propose a general compression framework, called SKCompress, to compress both gradient values and gradient keys in sparse gradients. Our first contribution is a sketch-based method that compresses the gradient values. Sketches are a class of algorithms that approximate the distribution of a data stream with a probabilistic data structure. We first use a quantile sketch to generate splits, sort gradient values into buckets, and encode them with the bucket indexes. Our second contribution is a new sketch algorithm, namely MinMaxSketch, which compresses the bucket indexes. MinMaxSketch builds a set of hash tables and solves hash collisions with a MinMax strategy. Since the bucket indexes are nonuniform, we further adopt Huffman coding to compress MinMaxSketch. To compress the keys of sparse gradients, the third contribution of this paper is a delta-binary encoding method that calculates the increments of the gradient keys and encodes them in binary format. An adaptive prefix is proposed to assign different sizes to different gradient keys, so that we can save more space. We also theoretically discuss the correctness and the error bound of our proposed methods. To the best of our knowledge, this is the first effort utilizing data sketch to compress gradients in ML. We implement a prototype system in a real cluster of our industrial partner Tencent Inc. and show that our method is up to 12× faster than the existing methods.
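Two of the steps named in this abstract, bucketing gradient values by quantile splits and delta-encoding sorted gradient keys, can be sketched in a few lines. This is a simplified illustration rather than SKCompress itself: exact quantiles stand in for a quantile sketch, MinMaxSketch, Huffman coding, and the adaptive prefix are omitted, and all function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def quantile_splits(values, num_buckets):
    """Split points that divide the values into roughly equal-sized buckets.
    Assumption: exact quantiles replace the paper's quantile sketch."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // num_buckets] for i in range(1, num_buckets)]


def bucket_indexes(values, splits):
    """Encode each value as the index of the bucket it falls into."""
    return [bisect_right(splits, value) for value in values]


def delta_encode(keys):
    """Store the first key, then the increment between consecutive keys.
    Keys are assumed to be sorted in ascending order."""
    if not keys:
        return []
    return [keys[0]] + [b - a for a, b in zip(keys, keys[1:])]


def delta_decode(deltas):
    """Recover the original keys by a running sum of the increments."""
    return list(accumulate(deltas))
```

Because the keys of a sparse gradient are sorted, the increments are usually much smaller than the keys themselves and therefore need fewer bits, which is the intuition behind encoding deltas instead of raw keys.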

Journal Article

A simplified and novel technique to retrieve color images from hand-drawn sketch by human

by Narasimha Murthy, Pavithra , Yeliyur Hanumanthaiah, Sharath Kumar in Accuracy , Cluster analysis , Clustering

2022

With the increasing adoption of human-computer interaction, there is a growing trend of retrieving stored images that match hand-drawn sketches made by humans. A review of existing systems shows the dominant use of sophisticated and complex mechanisms that focus more on accuracy than on system efficiency. Hence, the proposed system introduces a simplified extraction of the related image using an attribution clustering process and a cost-effective training scheme. The proposed method uses K-means clustering and bag-of-attributes to extract essential information from the sketch. The proposed system also introduces a unique indexing scheme that makes the retrieval process faster and returns the highest-ranked images. Implemented in MATLAB, the proposed system offers better accuracy and processing time than the existing feature extraction technique.
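As a rough illustration of how K-means clustering can feed a bag-style descriptor, the sketch below clusters local feature vectors and then summarises a sketch as a histogram over the cluster centres. It is a generic bag-of-words-style construction written under assumptions, not the authors' MATLAB system: the deterministic initialisation, the function names, and the histogram normalisation are all hypothetical choices.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm.
    Assumption: the first k points are used as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for point in points:
            groups[nearest(point, centroids)].append(point)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def bag_histogram(descriptors, centroids):
    """Normalised count of descriptors assigned to each centroid."""
    counts = [0] * len(centroids)
    for descriptor in descriptors:
        counts[nearest(descriptor, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts
```

Two sketches or images can then be compared by the distance between their histograms, which is cheaper than matching every local descriptor directly.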

Journal Article

Machine Learning Models for Artist Classification of Cultural Heritage Sketches

by Pop, Matei , Rădvan, Roxana , Mușat, Silviu in Algorithms , Artificial intelligence , Artists

2025

Modern computer vision algorithms allow researchers and art historians to extract artist-characteristic contours from sketches, thus providing accurate input for artwork analysis, for possible assignments and classifications, and for the identification of specific stylistic features. We approach this challenging task with three machine learning algorithms and evaluate their performance on a small collection of images from five distinct artists. These algorithms aim to find the most appropriate artist for a sketch (or a contour of a sketch), with promising results that have a high level of confidence (around 92%). The models start from common Faster R-CNN architectures, reinforcement learning, and vector extraction tools. The proposed approach provides a base for future improvements toward a tool that aids artwork evaluators.

Journal Article

A complete hand-drawn sketch vectorization framework

by Cesano, Simone , Prati, Andrea , Donati, Luca in Algorithms , Computer graphics , Cross correlation

2019

Vectorizing hand-drawn sketches is an important but challenging task. Many businesses rely on fashion, mechanical or structural designs which, sooner or later, need to be converted into vector form. For most, this is still a task done manually. This paper proposes a complete framework that automatically transforms noisy and complex hand-drawn sketches with different stroke types into a precise, reliable and highly simplified vectorized model. The proposed framework includes a novel line extraction algorithm based on a multi-resolution application of Pearson’s cross correlation and a new unbiased thinning algorithm that can get rid of scribbles and variable-width strokes to obtain clean 1-pixel lines. Other contributions include variants of pruning, merging and edge linking procedures to post-process the obtained paths. Finally, a modification of the original Schneider’s vectorization algorithm is designed to obtain fewer control points in the resulting Bézier splines. All the steps presented in this framework have been extensively tested and compared with state-of-the-art algorithms, and they outperform them both qualitatively and quantitatively. Moreover, they exhibit fast real-time performance, making them suitable for integration in any computer graphics toolset.
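The idea of detecting lines with Pearson's correlation can be shown in one dimension: slide a line-profile kernel across a row of pixel intensities and record the correlation coefficient at each position. This is a minimal sketch under assumptions, not the paper's multi-resolution 2-D algorithm; the kernel values and the function names `pearson` and `correlate_row` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.
    Returns 0.0 when either sequence is constant (undefined correlation)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return covariance / sqrt(var_a * var_b)


def correlate_row(row, kernel):
    """Correlation of the kernel with every window of the row."""
    width = len(kernel)
    return [pearson(row[i:i + width], kernel)
            for i in range(len(row) - width + 1)]
```

Because the coefficient is invariant to brightness offset and contrast scaling, a faint stroke and a dark stroke with the same profile both score close to 1, which is one reason a correlation-based detector can cope with variable pencil pressure.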

Journal Article

Sketch-Based Empirical Natural Gradient Methods for Deep Learning

by Chen, Mengyun , Yang, Minghan , Wen, Zaiwen in Algorithms , Approximation , Artificial neural networks

2022

In this paper, we develop an efficient sketch-based empirical natural gradient method (SENG) for large-scale deep learning problems. The empirical Fisher information matrix is usually low-rank since the sampling is only practical on a small amount of data at each iteration. Although the corresponding natural gradient direction lies in a small subspace, both the computational cost and memory requirement are still not tractable due to the high dimensionality. We design randomized techniques for different neural network structures to resolve these challenges. For layers with a reasonable dimension, sketching can be performed on a regularized least squares subproblem. Otherwise, since the gradient is a vectorization of the product between two matrices, we apply sketching on the low-rank approximations of these matrices to compute the most expensive parts. A distributed version of SENG is also developed for extremely large-scale applications. Global convergence to stationary points is established under mild assumptions and a fast linear convergence is analyzed under the neural tangent kernel (NTK) case. Extensive experiments on convolutional neural networks show the competitiveness of SENG compared with the state-of-the-art methods. On the task of training ResNet50 on ImageNet-1k, SENG achieves 75.9% Top-1 testing accuracy within 41 epochs. Experiments on distributed large-batch training of ResNet50 on ImageNet-1k show that the scaling efficiency is quite reasonable.

Journal Article

Sketch recognition using transfer learning

by Sert, Mustafa , Boyacı, Emel in Accuracy , Artificial neural networks , Automation

2019

Humans have an excellent ability to recognize freehand sketch drawings despite their abstract and sparse structures. Understanding freehand sketches with automated methods is a challenging task due to the diversity and abstract structures of these sketches. In this paper, we propose an efficient freehand sketch recognition scheme, which is based on the feature-level fusion of Convolutional Neural Networks (CNNs) in the transfer learning context. Specifically, we analyse different layer performances of distinct ImageNet pretrained CNNs and combine the best-performing layer features within the CNN-SVM pipeline for recognition. We also employ Principal Component Analysis (PCA) to reduce the fused deep feature dimensions to ensure the efficiency of the recognition application on limited-capacity devices. We perform evaluations on two real sketch benchmark datasets, namely Sketchy and TU-Berlin, to show the effectiveness of the proposed scheme. Our experimental results show that the feature-level fusion scheme with PCA achieves recognition accuracies of 97.91% and 72.5% on the Sketchy and TU-Berlin datasets, respectively. This result is promising when compared with the human recognition accuracy of 73.1% on the TU-Berlin dataset. We also develop a sketch recognition application for smart devices to demonstrate the proposed scheme.

Journal Article

A hierarchical residual network with compact triplet-center loss for sketch recognition

by Wang, Lei , Zhang, Shihui , Sang, Yu in Accuracy , Computer Communication Networks , Computer Science

2022

With the widespread use of touch-screen devices, it is more and more convenient for people to draw sketches on screen. This results in a demand for automatically understanding the sketches. Thus, the sketch recognition task becomes more significant than before. To accomplish this task, it is necessary to solve the critical issue of improving the distinction of the sketch features. To this end, we have made efforts in three aspects. First, a novel multi-scale residual block is designed. Compared with the conventional basic residual block, it can better perceive multi-scale information and reduce the number of parameters during training. Second, a hierarchical residual structure is built by stacking multi-scale residual blocks in a specific way. In contrast with the single-level residual structure, the learned features from this structure are more sufficient. Last but not least, the compact triplet-center loss is proposed specifically for the sketch recognition task. It addresses the problem that the triplet-center loss does not fully account for overly large intra-class space and overly small inter-class space in the sketch domain. By combining the above modules, a hierarchical residual network as a whole is proposed for sketch recognition and evaluated thoroughly on the TU-Berlin benchmark. The experimental results show that the proposed network outperforms most baseline methods and performs very well among current non-sequential models.
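For context, the baseline triplet-center loss that the compact variant builds on can be sketched as follows. This shows only the standard formulation (pull each feature toward its own class centre and keep it at least a margin farther from the nearest other centre); the abstract does not give the details of the compact version, so none are reproduced here. The use of squared Euclidean distance, averaging over the batch, and the function names are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_center_loss(features, labels, centers, margin):
    """Mean hinge loss over a batch.

    For each feature: distance to its own class centre, plus the margin,
    minus the distance to the nearest centre of any other class,
    clipped at zero. Requires at least two class centres.
    """
    total = 0.0
    for feature, label in zip(features, labels):
        positive = squared_distance(feature, centers[label])
        negative = min(
            squared_distance(feature, center)
            for j, center in enumerate(centers)
            if j != label
        )
        total += max(positive + margin - negative, 0.0)
    return total / len(features)
```

The loss is zero once every feature is closer to its own centre than to any other centre by at least the margin, so a larger margin asks for tighter classes and wider gaps between them.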

Journal Article

SketchFormer: transformer-based approach for sketch recognition using vector images

by Chopra, Shivang , Chopra, Suransh , Parihar, Anil Singh in Computer Communication Networks , Computer Science , Data Structures and Information Theory

2021

Sketches have been employed since the ancient era of cave paintings as simple illustrations to represent real-world entities and to communicate. The abstract nature and varied artistic styling make automatic recognition of these drawings more challenging than other areas of image classification. Moreover, representing sketches as a sequence of strokes instead of raster images captures them at the correct level of abstraction. However, processing an image as a sequence of small pieces of information is challenging. In this paper, we propose a Transformer-based network, dubbed AttentiveNet, for sketch recognition. This architecture incorporates ordinal information to perform the classification task in real time on vector images. We employ the proposed model to isolate the discriminating strokes of each doodle using the attention mechanism of Transformers and perform an in-depth qualitative analysis of the isolated strokes for classification of the sketch. Experimental evaluation validates that the proposed network performs favorably against state-of-the-art techniques.

Journal Article

Audio splicing detection and localization using multistage filterbank spectral sketches and decision fusion

by Su, Zhaopin , Lian, Chensi , Zhang, Guofu in Acoustics , Background noise , Computer Communication Networks

2024

Heterogeneous audio splicing tampering, which combines audio recordings from different scenarios or devices, has posed a significant challenge to audio authenticity. Most existing work performs poorly at detecting and localizing multiple splicing points, especially when the signal-to-noise ratios (SNRs) of the recordings involved in splicing are close. In this work, we propose an audio splicing detection and localization method on the basis of multistage filterbank spectral sketches (MFBSS) and decision fusion. More specifically, we first remove the silent segments to reduce the redundant information and estimate the background noise of the combined voice-only segments. Next, to obtain more audio details, we propose a feature fusion strategy to extract the MFBSS feature from the background noise. Then, we develop a decision fusion strategy to detect and localize all the possible splicing points. Finally, we evaluate our method against the state-of-the-art splicing detection approaches on public datasets with various noises and SNR differences. Experimental results demonstrate that the proposed approach is effective for various noise scenarios with small SNR differences and is also robust against anti-forensics attacks.

Journal Article
