Catalogue Search | MBRL

Designing for gesture and tangible interaction

by Maher, Mary Lou, author , Lee, Lina, author in Human-computer interaction , Ambient intelligence , Ubiquitous computing

Interactive technology is increasingly integrated with physical objects that do not have a traditional keyboard and mouse style of interaction, and many do not even have a display. These objects require new approaches to interaction design, referred to as post-WIMP (Windows, Icons, Menus, and Pointer) or as embodied interaction design.

Book

Convolutional neural network for gesture recognition human-computer interaction system design

by Niu, Peixin in Accuracy , Algorithms , Analysis

2025

Gesture interaction applications have garnered significant attention from researchers in the field of human-computer interaction due to their inherent convenience and intuitiveness. Addressing the challenge posed by the insufficient feature extraction capability of existing network models, which hampers gesture recognition accuracy and increases model inference time, this paper introduces a novel gesture recognition algorithm based on an enhanced MobileNet network. This innovative design incorporates a multi-scale convolutional module to extract underlying features, thereby augmenting the network’s feature extraction capabilities. Moreover, the utilization of an exponential linear unit (ELU) activation function enhances the capture of comprehensive negative feature information. Empirical findings demonstrate that our approach surpasses the accuracy achieved by most lightweight network models on publicly available datasets, all while maintaining real-time gesture interaction capabilities. The accuracy of the proposed model in this paper attains 92.55% and 88.41% on the NUS-II and Creative Senz3D datasets, respectively, and achieves an impressive 98.26% on the ASL-M dataset.
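To illustrate the abstract's point about negative feature information, the following minimal sketch compares ELU with ReLU on a few inputs. It uses the standard ELU definition; the value `alpha = 1.0` is an assumption, since the abstract does not state which setting the paper uses, and the function names are illustrative only.

```python
import math


def relu(x: float) -> float:
    """ReLU: every negative input is mapped to zero."""
    return x if x > 0.0 else 0.0


def elu(x: float, alpha: float = 1.0) -> float:
    """Standard ELU: identity for positive inputs, and a smooth curve
    that saturates towards -alpha for negative inputs.
    alpha = 1.0 is an assumed default, not a value taken from the paper."""
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 1.5):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  elu={elu(x):.4f}")
```

Unlike ReLU, ELU returns distinct non-zero values for distinct negative inputs, so information carried by negative pre-activations is not discarded outright.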

Journal Article

Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video

by Dieleman, Sander , Pigou, Lionel , Van Herreweghe, Mieke in Classification , Gesture recognition , Image processing systems

2018

Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition. For the task of capturing temporal structure in video, however, there still remain numerous open research questions. Current research suggests using a simple temporal feature pooling strategy to take into account the temporal aspect of video. We demonstrate that this method is not sufficient for gesture recognition, where temporal information is more discriminative compared to general video classification tasks. We explore deep architectures for gesture recognition in video and propose a new end-to-end trainable neural network architecture incorporating temporal convolutions and bidirectional recurrence. Our main contributions are twofold; first, we show that recurrence is crucial for this task; second, we show that adding temporal convolutions leads to significant improvements. We evaluate the different approaches on the Montalbano gesture recognition dataset, where we achieve state-of-the-art results.
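One way to see why simple temporal pooling can fall short for gestures is that averaging over time ignores frame order, whereas a temporal convolution does not. The sketch below is a toy illustration with one scalar feature per frame; the function names and the difference kernel are hypothetical and do not reproduce the architecture proposed in the paper.

```python
from typing import List


def mean_pool(frames: List[float]) -> float:
    """Temporal mean pooling: a single average over all frames."""
    return sum(frames) / len(frames)


def temporal_conv(frames: List[float], kernel: List[float]) -> List[float]:
    """Valid 1-D cross-correlation along the time axis."""
    k = len(kernel)
    return [
        sum(w * frames[t + i] for i, w in enumerate(kernel))
        for t in range(len(frames) - k + 1)
    ]


if __name__ == "__main__":
    swipe = [0.0, 1.0, 3.0]            # toy per-frame feature, e.g. hand x-position
    reverse_swipe = list(reversed(swipe))
    diff_kernel = [-1.0, 1.0]          # responds to frame-to-frame change

    print(mean_pool(swipe), mean_pool(reverse_swipe))
    print(temporal_conv(swipe, diff_kernel))
    print(temporal_conv(reverse_swipe, diff_kernel))
```

The two sequences pool to the same average, while the convolution output changes sign when the motion is reversed, which is the kind of temporal cue a gesture classifier needs.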

Journal Article

Real-Time Hand Gesture Monitoring Model Based on MediaPipe’s Registerable System

by Jiang, Haibo , Meng, Yuting , Wen, Haijun in Accuracy , Algorithms , Artificial intelligence

2024

Hand gesture recognition plays a significant role in human-to-human and human-to-machine interactions. Currently, most hand gesture detection methods rely on fixed hand gesture recognition. However, with the diversity and variability of hand gestures in daily life, this paper proposes a registerable hand gesture recognition approach based on Triple Loss. By learning the differences between different hand gestures, it can cluster them and identify newly added gestures. This paper constructs a registerable gesture dataset (RGDS) for training registerable hand gesture recognition models. Additionally, it proposes a normalization method for transforming hand gesture data and a FingerComb block for combining and extracting hand gesture data to enhance features and accelerate model convergence. It also improves ResNet and introduces FingerNet for registerable single-hand gesture recognition. The proposed model performs well on the RGDS dataset. The system is registerable, allowing users to flexibly register their own hand gestures for personalized gesture recognition.
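The idea of registering new gestures through a learned distance can be sketched as follows. This assumes that "Triple Loss" refers to the standard triplet loss with Euclidean distance; the `identify` helper, the registry dictionary, the margin, and the threshold are hypothetical illustrations and are not taken from the paper.

```python
import math
from typing import Dict, List, Optional


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: List[float], positive: List[float],
                 negative: List[float], margin: float = 1.0) -> float:
    """Standard triplet loss: pull the positive closer to the anchor than
    the negative by at least `margin` (margin value is an assumption)."""
    return max(distance(anchor, positive) - distance(anchor, negative) + margin, 0.0)


def identify(embedding: List[float], registry: Dict[str, List[float]],
             threshold: float) -> Optional[str]:
    """Return the nearest registered gesture, or None if none is within
    `threshold`. Hypothetical helper for illustration."""
    best_name, best_dist = None, float("inf")
    for name, reference in registry.items():
        d = distance(embedding, reference)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


if __name__ == "__main__":
    registry = {"fist": [0.0, 0.0], "wave": [5.0, 5.0]}
    print(identify([10.0, 10.0], registry, threshold=1.0))  # unknown gesture
    registry["peace"] = [10.0, 10.0]                        # user registers it
    print(identify([10.0, 10.0], registry, threshold=1.0))
```

Because recognition is a nearest-neighbour lookup in embedding space, adding a gesture only requires storing one more reference embedding rather than retraining a fixed classifier.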

Journal Article

Gesture recognition with Brownian reservoir computing using geometrically confined skyrmion dynamics

by Gerhards, Pascal , Beneke, Grischa , Knobloch, Klaus in 639/766/119/1001 , 639/925/927/1062 , 639/925/929/115

2024

Physical reservoir computing leverages the dynamical properties of complex physical systems to process information efficiently, significantly reducing training efforts and energy consumption. Magnetic skyrmions, topological spin textures, are promising candidates for reservoir computing systems due to their enhanced stability, non-linear interactions and low-power manipulation. Traditional spin-based reservoir computing has either been limited to quasi-static detection or has required real-world data to be rescaled to the intrinsic timescale of the reservoir. We address this challenge with time-multiplexed skyrmion reservoir computing, which allows the reservoir’s intrinsic timescales to be aligned with real-world temporal patterns. Using millisecond-scale hand gestures recorded with Range-Doppler radar, we feed voltage excitations directly into our device and detect the skyrmion trajectory evolution. This method scales down to the nanometer level and demonstrates competitive or superior performance compared to energy-intensive software-based neural networks. Our hardware approach’s key advantage is its ability to integrate sensor data in real time without temporal rescaling, enabling numerous applications. Physical reservoir computing allows real-time, low-power information processing. Here, the authors report reservoir computing with magnetic skyrmions able to detect millisecond time-scale hand gestures, matching software neural networks’ performance.

Journal Article

Deep learning in vision-based static hand gesture recognition

by Oyedotun, Oyebade K. , Khashman, Adnan in Artificial Intelligence , Artificial neural networks , Computational Biology/Bioinformatics

2017

Hand gestures have proven effective for human communication, and active research is ongoing to replicate the same success in computer vision systems. Human–computer interaction can be significantly improved by advances in systems that are capable of recognizing different hand gestures. In contrast to many earlier works, which consider only significantly differentiable hand gestures and therefore often select just a few gestures from American Sign Language (ASL) for recognition, we propose applying deep learning to the problem of hand gesture recognition for all 24 hand gestures obtained from Thomas Moeslund’s gesture recognition database. We show that more biologically inspired and deep neural networks such as convolutional neural networks and stacked denoising autoencoders are capable of learning the complex hand gesture classification task with lower error rates. The considered networks are trained and tested on data obtained from the above-mentioned public database; the results are then compared against earlier works in which only small subsets of the ASL hand gestures are considered for recognition.

Journal Article

Real-time spatial normalization for dynamic gesture classification

by Zeghoud, Sofiane , Sheng, Bin , Ali, Saba Ghazanfar in Accuracy , Algorithms , Artificial Intelligence

2022

In this paper, we provide a new spatial data generalization method, which we applied to hand gesture recognition tasks. Data gathering can be a tedious task when it comes to gesture recognition, especially for dynamic gestures. Nowadays, the standard solutions when lacking data still consist of either the expensive gathering of new data or the impractical employment of hand-crafted data augmentation algorithms. While these solutions may show improvement, they come with disadvantages. We believe that a better extrapolation of the limited data’s common pattern, through an improved generalization, should first be considered. We therefore propose a dynamic generalization method that captures and normalizes the spatial evolution of the input in real time. The latter procedure can be fully converted into a neural network processing layer, which we call the Evolution Normalization Layer. Experimental results on the SHREC2017 dataset showed that the addition of the proposed layer improved the prediction accuracy of a standard sequence-processing model while requiring 6 times fewer weights on average for a similar score. Furthermore, when trained on only 10% of the original training data, the standard model was able to reach a maximum accuracy of only 36.5% alone and 56.8% when applying a state-of-the-art processing method to the data, whereas the addition of our layer alone achieved a prediction accuracy of 81.5%.

Journal Article

Local Pyramid Vision Transformer: Millimeter-Wave Radar Gesture Recognition Based on Transformer with Integrated Local and Global Awareness

by Zhao, Shuo , Kang, Hailong , Hu, Guangxuan in Accuracy , Analysis , Artificial neural networks

2024

A millimeter-wave radar is widely accepted by the public due to its low susceptibility to interference, such as changes in light, and the protection of personal privacy. With the development of the deep learning theory, the deep learning method has been dominant in the millimeter-wave radar field, which usually uses convolutional neural networks for feature extraction. In recent years, transformer networks have also been highly valued by researchers due to their parallel processing capabilities and long-distance dependency modeling capabilities. However, traditional convolutional neural networks (CNNs) and vision transformers each have their limitations: CNNs usually overlook the global features of images and vision transformers may neglect local image continuity, and both of them may impede gesture recognition performance. In addition, whether CNN or transformer, their implementation is hindered by the scarcity of public radar gesture datasets. To address these limitations, this paper proposes a new recognition method using a local pyramid visual transformer (LPVT) based on millimeter-wave radar. LPVT can capture global and local features in dynamic gesture spectrograms, ultimately improving the recognition ability of gestures. In this paper, we mainly carried out the following two tasks: building the corresponding datasets and executing gesture recognition. First, we constructed a gesture dataset for training. In this stage, we use a 77 GHz radar to collect the echo signals of gestures and preprocess them to build a dataset. Second, we propose the LPVT network specifically designed for gesture recognition tasks. By integrating local sensing into the globally focused transformer, we improve its capacity to capture both global and local features in dynamic gesture spectrograms. The experimental results using the dataset we constructed show that the proposed LPVT network achieved a gesture recognition accuracy of 92.2%, which exceeds the performance of other networks.

Journal Article

Dyhand: dynamic hand gesture recognition using BiLSTM and soft attention methods

by Singh, Rohit Pratap , Singh, Laiphrakpam Dolendro in Accuracy , Algorithms , Artificial Intelligence

2025

Hand gesture recognition is an essential task in computer vision. It is the most intuitive and natural medium for communication when dealing with computers. Recently, with the advent of innovative technologies and high-performing computer systems, there has been a surge in gesture recognition research. Traditional approaches to modelling skeletons are typically based on hand-crafted components or traversal algorithms, leading to limited expressive capacity and generalisation challenges. In this work, we present a novel dynamic skeleton model based on BiLSTM and soft attention, named DyHand, that mitigates the challenges of intra-class and inter-class variability of gesture classes to a great extent. The comparison of our model with state-of-the-art approaches on the two benchmark data sets with various data augmentation techniques is reported. The proposed approach yields the best results, achieving 97.14% and 96.42% recognition accuracy in the 14 and 28 gesture categories, respectively, for the DHG-14/28 data set, and comparable recognition accuracy of 93.98% on 14 gesture classes and 87.86% on 28 gesture classes for the SHREC’17 data set.
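As a rough illustration of what soft attention over a sequence of recurrent states does, the sketch below turns per-time-step scores into softmax weights and forms a weighted sum of the states. This is the generic mechanism only; how DyHand computes its scores is not described in the abstract, so the scores here are supplied directly as hypothetical inputs.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(states: List[List[float]], scores: List[float]) -> List[float]:
    """Soft attention: weighted sum of per-time-step state vectors.
    `scores` would normally come from a learned scoring function;
    here they are given directly (an assumption for illustration)."""
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]


if __name__ == "__main__":
    states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # e.g. BiLSTM outputs per frame
    scores = [0.1, 2.0, 0.3]                        # hypothetical relevance scores
    print(softmax(scores))
    print(attend(states, scores))
```

Frames with higher scores contribute more to the summary vector, which lets a classifier focus on the most informative part of a gesture.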

Journal Article

A novel concatenate feature fusion RCNN architecture for sEMG-based hand gesture recognition

by Wang, Haipeng , Xu, Pufan , Li, Fei in Accuracy , Analysis , Artificial neural networks

2022

Hand gesture recognition tasks based on surface electromyography (sEMG) are vital in human-computer interaction, speech detection, robot control, and rehabilitation applications. However, existing models, whether traditional machine learning (ML) methods or other state-of-the-art approaches, are limited in the number of movements. When targeting a large number of gesture classes, data features such as temporal information should be preserved as much as possible. In the field of sEMG-based recognition, the recurrent convolutional neural network (RCNN) is an advanced method due to the sequential characteristic of sEMG signals. However, the invariance of the pooling layer damages important temporal information. In the all convolutional neural network (ACNN), because of the feature-mixing convolution operation, the same output can be received from completely different inputs. This paper proposes a concatenate feature fusion (CFF) strategy and a novel concatenate feature fusion recurrent convolutional neural network (CFF-RCNN). In CFF-RCNN, a max-pooling layer and a 2-stride convolutional layer are concatenated together to replace the conventional simple dimensionality reduction layer. The featurewise pooling operation serves as a signal amplitude detector without using any parameter. The feature-mixing convolution operation calculates the contextual information. Complete evaluations are made on both the accuracy and convergence speed of the CFF-RCNN. Experiments are conducted using three sEMG benchmark databases named DB1, DB2 and DB4 from the NinaPro database. With more than 50 gestures, the classification accuracies of the CFF-RCNN are 88.87% on DB1, 99.51% on DB2, and 99.29% on DB4. These accuracies are the highest compared with reported accuracies of machine learning and other state-of-the-art methods. To achieve accuracies of 86%, 99% and 98% with the RCNN, the training times are 2353.686 s, 816.173 s and 731.771 s, respectively. However, for the CFF-RCNN to reach the same accuracies, it needs only 1727.415 s, 542.245 s and 576.734 s, corresponding to reductions of 26.61%, 33.56% and 21.19% in training time. We conclude that the CFF-RCNN is an improved method for classifying a large number of hand gestures. The CFF strategy significantly improved model performance, with higher accuracy and faster convergence compared to the traditional RCNN.
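The concatenation of a max-pooling branch and a 2-stride convolution branch can be sketched in plain Python on a toy multi-channel signal. This is only an illustration of the idea as described in the abstract: the kernel size of 2, the pooling window of 2, the weight layout, and all function names are assumptions, not details confirmed by the paper.

```python
from typing import List

Signal = List[List[float]]  # channels x time


def max_pool(channel: List[float], size: int = 2, stride: int = 2) -> List[float]:
    """Featurewise max pooling: acts on each channel independently
    and has no parameters."""
    return [max(channel[i:i + size])
            for i in range(0, len(channel) - size + 1, stride)]


def mixing_conv(channels: Signal, weights: List[List[List[float]]],
                stride: int = 2) -> Signal:
    """Strided convolution that mixes all input channels.
    weights[o][c][k]: output channel o, input channel c, tap k."""
    length = len(channels[0])
    taps = len(weights[0][0])
    out = []
    for w_out in weights:
        row = []
        for start in range(0, length - taps + 1, stride):
            acc = 0.0
            for c, signal in enumerate(channels):
                for k in range(taps):
                    acc += w_out[c][k] * signal[start + k]
            row.append(acc)
        out.append(row)
    return out


def cff_downsample(channels: Signal, weights: List[List[List[float]]]) -> Signal:
    """Concatenate both branches along the channel axis."""
    pooled = [max_pool(c) for c in channels]
    mixed = mixing_conv(channels, weights)
    return pooled + mixed


if __name__ == "__main__":
    emg = [[1.0, 3.0, 2.0, 5.0], [0.0, -1.0, 4.0, 2.0]]  # toy 2-channel input
    w = [[[0.5, 0.5], [1.0, -1.0]]]                       # one hypothetical filter
    print(cff_downsample(emg, w))
```

Both branches halve the time axis, so their outputs have the same length and can be stacked: the pooled channels keep per-channel amplitude peaks, while the convolution channels carry cross-channel context.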

Journal Article
