Catalogue Search | MBRL
69 result(s) for "human skeleton extraction"
Constraint-Based Optimized Human Skeleton Extraction from Single-Depth Camera
2019
As a cutting-edge research topic in computer vision and graphics for decades, human skeleton extraction from a single-depth camera remains challenging due to possible occlusions of different body parts, huge appearance variations, and sensor noise. In this paper, we propose to incorporate human skeleton length conservation and symmetry priors as well as temporal constraints to enhance the consistency and continuity of the estimated skeleton of a moving human body. Given an initial estimation of the skeleton joint positions provided per frame by the Kinect SDK or Nuitrack SDK, which do not follow the aforementioned priors and can be prone to errors, our framework improves the accuracy of these pose estimates based on the length and symmetry constraints. In addition, our method is device-independent and can be integrated into skeleton extraction SDKs for refinement, allowing the detection of outliers within the initial joint location estimates and the prediction of new joint location estimates following the temporal observations. The experimental results demonstrate the effectiveness and robustness of our approach in several cases.
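The length-conservation and symmetry priors described in this abstract can be illustrated with a minimal sketch. It is not the authors' optimization framework: the function names, the median-based reference length, and the toy forearm measurements are all assumptions made for illustration.

```python
import math
from statistics import median


def enforce_bone_length(parent, child, target_length):
    """Slide the child joint along the parent->child direction so that
    the bone between them has exactly target_length."""
    length = math.dist(parent, child)
    if length == 0.0:
        return child  # direction undefined; leave the joint unchanged
    scale = target_length / length
    return tuple(p + (c - p) * scale for p, c in zip(parent, child))


def symmetric_reference_length(left_lengths, right_lengths):
    """One reference length shared by a left/right bone pair
    (assumption: mean of the two per-side medians over time)."""
    return 0.5 * (median(left_lengths) + median(right_lengths))


# Hypothetical per-frame forearm lengths in metres, including outliers.
left_forearm = [0.26, 0.25, 0.31, 0.26]
right_forearm = [0.24, 0.25, 0.24, 0.19]
ref = symmetric_reference_length(left_forearm, right_forearm)

# A frame in which the wrist estimate makes the forearm too long.
elbow = (0.0, 1.0, 2.0)
wrist = (0.0, 1.0, 2.31)
fixed_wrist = enforce_bone_length(elbow, wrist, ref)
```

Using the median makes the reference length insensitive to single-frame outliers; the temporal constraints mentioned in the abstract are not modelled here.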
Journal Article
Tracking-based 3D human skeleton extraction from stereo video camera toward an on-site safety and ergonomic analysis
2016
Purpose
As a means of data acquisition for situation awareness, computer vision-based motion capture technologies have increased the potential to observe and assess manual activities for the prevention of accidents and injuries in construction. This study thus aims to present a computationally efficient and robust method of human motion data capture for on-site motion sensing and analysis.
Design/methodology/approach
This study investigated a tracking approach to three-dimensional (3D) human skeleton extraction from stereo video streams. Instead of detecting body joints on each image, the proposed method tracks the locations of the body joints over all successive frames by learning from the initialized body posture. The body joints corresponding to the tracked ones are then identified and matched in the image sequences from the other lens and reconstructed in 3D space through triangulation to build 3D skeleton models. For validation, a lab test is conducted to evaluate the accuracy and working range of the proposed method.
Findings
Results of the test reveal that the tracking approach produces accurate outcomes at a distance, with nearly real-time computational processing, and can be potentially used for site data collection. Thus, the proposed approach has a potential for various field analyses for construction workers’ safety and ergonomics.
Originality/value
Recently, motion capture technologies have rapidly been developed and studied in construction. However, existing sensing technologies are not yet readily applicable to construction environments. This study explores two smartphones used as stereo cameras as a potentially suitable means of data collection in construction because of the fewer operational constraints (e.g. no on-body sensor required, less sensitivity to sunlight, and flexible ranges of operation).
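The triangulation step mentioned in this abstract can be sketched for the simplest case. The abstract does not describe the camera model, so this example assumes an ideal rectified stereo pair (parallel optical axes, identical intrinsics); the function name and every numeric value below are hypothetical.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (in metres, left-camera frame) from a joint
    matched in a rectified stereo pair.

    x_left, x_right: horizontal pixel coordinate of the joint in each image
    y:               vertical pixel coordinate (same row in both images)
    focal_px:        focal length in pixels
    baseline_m:      distance between the two camera centres in metres
    cx, cy:          principal point in pixels
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity; point cannot be triangulated")
    z = focal_px * baseline_m / disparity  # depth from similar triangles
    x = (x_left - cx) * z / focal_px       # back-project to metric X
    y3d = (y - cy) * z / focal_px          # back-project to metric Y
    return (x, y3d, z)


# Hypothetical joint match: 50 px disparity, 0.2 m baseline, f = 1000 px.
joint_3d = triangulate_rectified(
    x_left=700.0, x_right=650.0, y=400.0,
    focal_px=1000.0, baseline_m=0.2, cx=640.0, cy=360.0,
)
```

Depth is inversely proportional to disparity, so a fixed matching error in pixels costs more accuracy as the subject moves away from the cameras.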
Journal Article
A Review: Point Cloud-Based 3D Human Joints Estimation
2021
Joint estimation of the human body is applicable to many fields such as human–computer interaction, autonomous driving, video analysis and virtual reality. Although much depth-based research has been classified and generalized in previous review or survey papers, point cloud-based pose estimation of the human body is still difficult due to the disorder and rotation invariance of the point cloud. In this review, we summarize recent developments in point cloud-based pose estimation of the human body. The existing works are divided into three categories based on their working principles: template-based methods, feature-based methods and machine learning-based methods. The most significant works are highlighted with a detailed introduction that analyzes their characteristics and limitations. The widely used datasets in the field are summarized, and quantitative comparisons are provided for the representative methods. Moreover, this review helps further understand the pertinent applications in many frontier research directions. Finally, we summarize the challenges involved and the problems to be solved in future research.
Journal Article
Fall Detection Based on Key Points of Human-Skeleton Using OpenPose
2020
According to statistics, falls are the primary cause of injury or death for people over 65 years old, about 30% of whom fall every year. As the number of falls among the elderly increases each year, it is urgent to find a fast and effective fall detection method to help elderly people who fall. A fall occurs when the center of gravity of the human body is unstable or its symmetry is broken, so that the body cannot keep its balance. To address this problem, we propose an approach for the recognition of accidental falls based on the symmetry principle. We extract the skeleton information of the human body with OpenPose and identify a fall through three critical parameters: the speed of descent at the center of the hip joint, the angle between the human body centerline and the ground, and the width-to-height ratio of the rectangle bounding the human body. Unlike previous studies that have investigated only falling behavior, we also consider people standing up after a fall. The method recognizes fall-down behavior with a 97% success rate.
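The three parameters named in this abstract can be computed from 2D keypoints as in the sketch below. It assumes image coordinates with y increasing downward and approximates the body centerline by the neck-to-hip segment; the function names are hypothetical, and the decision thresholds are not given in the abstract, so none are supplied here.

```python
import math


def hip_descent_speed(hip_y_prev, hip_y_curr, dt):
    """Downward speed of the hip centre in pixels per second
    (positive when the hip moves down the image)."""
    return (hip_y_curr - hip_y_prev) / dt


def centerline_angle_deg(neck, hip):
    """Angle between the neck-to-hip line and the horizontal ground,
    in degrees: about 90 when upright, about 0 when lying down."""
    dx = abs(neck[0] - hip[0])
    dy = abs(neck[1] - hip[1])
    return math.degrees(math.atan2(dy, dx))


def width_height_ratio(keypoints):
    """Width-to-height ratio of the axis-aligned rectangle that bounds
    all detected keypoints."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return width / height if height > 0 else math.inf


# Hypothetical keypoints (x, y) in pixels for a standing and a lying pose.
standing = [(100, 50), (100, 150), (80, 250), (120, 250)]
lying = [(50, 200), (150, 200), (250, 190), (250, 210)]

standing_ratio = width_height_ratio(standing)
lying_ratio = width_height_ratio(lying)
standing_angle = centerline_angle_deg(standing[0], standing[1])
lying_angle = centerline_angle_deg(lying[0], lying[1])
speed = hip_descent_speed(hip_y_prev=150.0, hip_y_curr=180.0, dt=0.1)
```

A detector would compare these features against tuned thresholds over several consecutive frames; checking whether the person stands up again afterward, as the abstract describes, would need additional temporal logic.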
Journal Article
Training confounder-free deep learning models for medical applications
2020
The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis). Improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from Magnetic Resonance Images (MRIs), identifying morphological sex differences in adolescence from those of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and determining the bone age from X-ray images of children. The results show that our method can accurately predict while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
The presence of confounding effects is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Here, the authors introduce an end-to-end approach for deriving features invariant to confounding factors as inputs to prediction models.
Journal Article
A Human Activity Recognition System Using Skeleton Data from RGBD Sensors
by Gambi, Ennio; Spinsante, Susanna; Gasparrini, Samuele
in Accelerometry - methods, Aged, Algorithms
2016
The aim of Active and Assisted Living is to develop tools to promote the ageing in place of elderly people, and human activity recognition algorithms can help to monitor aged people in home environments. Different types of sensors can be used to address this task and the RGBD sensors, especially the ones used for gaming, are cost-effective and provide much information about the environment. This work aims to propose an activity recognition algorithm exploiting skeleton data extracted by RGBD sensors. The system is based on the extraction of key poses to compose a feature vector, and a multiclass Support Vector Machine to perform classification. Computation and association of key poses are carried out using a clustering algorithm, without the need of a learning algorithm. The proposed approach is evaluated on five publicly available datasets for activity recognition, showing promising results especially when applied for the recognition of AAL related actions. Finally, the current applicability of this solution in AAL scenarios and the future improvements needed are discussed.
Journal Article
Towards a deep human activity recognition approach based on video to image transformation with skeleton data
2021
One of the most challenging recent tasks in computer vision is Human Activity Recognition (HAR), which aims to analyze and detect human actions for the benefit of many fields such as video surveillance, behavior analysis and healthcare. Several works in the literature are based on the extraction and analysis of human skeletons for action recognition. This paper introduces a new HAR approach based on the extraction of human skeletons from videos. Three feature extraction techniques are proposed in this work. They use the skeletons extracted from the video frames to construct a single image that summarizes the activity in that video. The first technique, called dynamic skeleton, is founded on the concept of dynamic images introduced in the literature, while the second one, called skeleton superposition, is based on the superposition of the extracted human skeletons in the same image. The third technique, called body articulations, uses only the body joints instead of the whole skeleton to recognize the ongoing activity. The images obtained from these three techniques are analyzed and classified using a classification system based on the transfer learning principle, by fine-tuning three well-known pre-trained CNNs (MobileNet, ResNet-50, VGG16). The designed system is validated and tested on two well-known datasets for human activity recognition, RGBD-HuDact and KTH. The results show that the implemented system outperforms the state-of-the-art approaches.
Journal Article
Application of machine-learning methods in age-at-death estimation from 3D surface scans of the adult acetabulum
by Velemínská, Jana; Techataweewan, Nawaporn; Bejdová, Šárka
in Acetabulum, Acetabulum - diagnostic imaging, Adult
2024
Age-at-death estimation is usually done manually by experts. As such, manual estimation is subjective and greatly depends on the past experience and proficiency of the expert. This becomes even more critical if experts need to evaluate individuals with unknown population affinity or with affinity that they are not familiar with. The purpose of this study is to design a novel age-at-death estimation method allowing for automatic evaluation on computers, thus eliminating the human factor.
We used a traditional machine-learning approach with explicit feature extraction. First, we identified and described the features that are relevant for age-at-death estimation. Then, we created a multi-linear regression model combining these features. Finally, we analysed the model performance in terms of Mean Absolute Error (MAE), Mean Bias Error (MBE), Slope of Residuals (SoR) and Root Mean Squared Error (RMSE).
The main result of this study is a population-independent method of estimating an individual's age-at-death using the acetabulum of the pelvis. Apart from data acquisition, the whole procedure of pre-processing, feature extraction and age estimation is fully automated and implemented as a computer program. This program is a part of a freely available web-based software tool called CoxAGE3D, which is available at https://coxage3d.fit.cvut.cz/. Based on our dataset, the MAE of the presented method is about 10.7 years. In addition, five population-specific models for Thai, Lithuanian, Portuguese, Greek and Swiss populations are also given. The MAEs for these populations are 9.6, 9.8, 10.8, 10.5 and 9.2 years, respectively. Our age-at-death estimation method is suitable for individuals with unknown population affinity and provides acceptable accuracy. The age estimation error cannot be completely eliminated, because it is a consequence of the variability of the ageing process of different individuals not only across different populations but also within a certain population.
• New automatic age estimation method based on pelvis acetabulum 3D scans.
• Multi-population model ensures robustness across different population affinities.
• MAE of 10.7 years, comparable to state-of-the-art population-specific methods.
• Automated feature extraction eliminates need for manual skeletal evaluation.
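Three of the error metrics named in this abstract (MAE, MBE and RMSE) follow standard definitions and can be sketched as below. The sign convention for MBE (predicted minus actual) is an assumption, the Slope of Residuals is omitted because the abstract does not define it, and the ages are made-up toy values rather than data from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """MAE: average magnitude of the errors."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_bias_error(actual, predicted):
    """MBE: average signed error (assumed convention: predicted - actual),
    so a negative value means systematic under-estimation."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """RMSE: square root of the average squared error."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Toy ages in years (hypothetical, for illustration only).
actual_age = [30.0, 50.0, 70.0]
estimated_age = [35.0, 50.0, 60.0]

mae = mean_absolute_error(actual_age, estimated_age)
mbe = mean_bias_error(actual_age, estimated_age)
rmse = root_mean_squared_error(actual_age, estimated_age)
```

RMSE is never smaller than MAE and grows faster with large individual errors, so reporting both gives a rough indication of how heavy-tailed the estimation errors are.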
Journal Article
Enhancing human behavior recognition with spatiotemporal graph convolutional neural networks and skeleton sequences
by Wang, Qinghui; Zeng, Wei; Zou, Ruirui
in Algorithms, Artificial neural networks, Coordinate transformations
2024
Objectives
This study aims to enhance supervised human activity recognition based on spatiotemporal graph convolutional neural networks by addressing two key challenges: (1) extracting local spatial feature information from implicit joint connections that is unobtainable through standard graph convolutions on natural joint connections alone, and (2) capturing long-range temporal dependencies that extend beyond the limited temporal receptive fields of conventional temporal convolutions.
Methods
To achieve these objectives, we propose three novel modules integrated into the spatiotemporal graph convolutional framework: (1) a connectivity feature extraction module that employs attention to model implicit joint connections and extract their local spatial features, (2) a long-range frame difference feature extraction module that captures extensive temporal context by considering larger frame intervals, and (3) a coordinate transformation module that enhances spatial representation by fusing Cartesian and spherical coordinate systems.
Findings
Evaluation across multiple datasets demonstrates that the proposed method achieves significant improvements over baseline networks, with the highest accuracy gains of 2.76% on the NTU-RGB+D 60 dataset (Cross-subject), 4.1% on NTU-RGB+D 120 (Cross-subject), and 4.3% on Kinetics (Top-1), outperforming current state-of-the-art algorithms. This paper delves into the realm of behavior recognition technology, a cornerstone of autonomous systems, and presents a novel approach that enhances the accuracy and precision of human activity recognition.
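The coordinate transformation idea in this abstract, representing each joint in both Cartesian and spherical coordinates, can be sketched as follows. The abstract does not say how the two representations are fused, so the simple concatenation below is an assumption, as are the function names and the choice of the physics convention for the angles.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Convert (x, y, z) to (r, theta, phi), where theta is the polar
    angle measured from the +z axis and phi is the azimuth in the
    x-y plane."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)  # angles are undefined at the origin
    # Clamp guards against rounding pushing the ratio outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    return (r, theta, phi)


def fuse_coordinates(joint):
    """Assumed fusion: concatenate Cartesian and spherical coordinates
    into one 6-value feature vector per joint."""
    return tuple(joint) + cartesian_to_spherical(*joint)


# Hypothetical joint positions relative to a root joint.
on_z_axis = fuse_coordinates((0.0, 0.0, 2.0))
on_x_axis = fuse_coordinates((1.0, 0.0, 0.0))
```

In a skeleton-based network the joints would typically be expressed relative to a root joint before conversion, so that the radius and angles describe body pose rather than position in the room.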
Journal Article
3DMesh-GAR: 3D Human Body Mesh-Based Method for Group Activity Recognition
by Saqlain, Muhammad; Lee, Seongyeong; Lee, Changhwa
in 3D human activity recognition, Computational linguistics, Datasets
2022
Group activity recognition is a prime research topic in video understanding and has many practical applications, such as crowd behavior monitoring, video surveillance, etc. To understand the multi-person/group action, the model should not only identify the individual person’s action in the context but also describe their collective activity. A lot of previous works adopt skeleton-based approaches with graph convolutional networks for group activity recognition. However, these approaches are subject to limitation in scalability, robustness, and interoperability. In this paper, we propose 3DMesh-GAR, a novel approach to 3D human body Mesh-based Group Activity Recognition, which relies on a body center heatmap, camera map, and mesh parameter map instead of the complex and noisy 3D skeleton of each person of the input frames. We adopt a 3D mesh creation method, which is conceptually simple, single-stage, and bounding box free, and is able to handle highly occluded and multi-person scenes without any additional computational cost. We implement 3DMesh-GAR on a standard group activity dataset: the Collective Activity Dataset, and achieve state-of-the-art performance for group activity recognition.
Journal Article