Catalogue Search | MBRL
1,018 result(s) for "human pose estimation"
A systematic survey on human pose estimation: upstream and downstream tasks, approaches, lightweight models, and prospects
2025
In recent years, human pose estimation has been widely studied as a branch of computer vision. It plays an important role in medicine, fitness, virtual reality, and other fields. Early human pose estimation technology used traditional hand-crafted modeling methods; recently, the field has developed rapidly with deep learning. This study not only reviews the foundational research on human pose estimation but also summarizes the latest cutting-edge techniques. Beyond a systematic summary of the technology itself, this article also covers the upstream and downstream tasks of human pose estimation, which situates the technology more intuitively. In particular, considering the computational-resource constraints and model-performance challenges faced by human pose estimation, lightweight and transformer-based human pose estimation models are summarized in this paper. In general, this article classifies human pose estimation technology by type of method, 2D or 3D output representation, number of people, views, and temporal information. Classic and specialized datasets are covered, along with the metrics applied to them. Finally, we summarize the current challenges and possible future developments of human pose estimation technology.
Journal Article
Performance benchmark of deep learning human pose estimation for UAVs
by Chatzis, Vassilios; Kalampokas, Theofanis; Krinidis, Stelios
in Algorithms; Benchmarks; Communications Engineering
2023
Human pose estimation (HPE) is a computer vision task that estimates human body joints from images. It gives machines the capability to better understand the interaction between humans and their environment. To this end, many HPE methods have been deployed in robots, vehicles, and unmanned aerial vehicles (UAVs). This raises the challenge of balancing algorithm performance against efficiency, especially in UAVs, where computational resources are limited to save battery power. Despite considerable progress on the HPE problem, very few methods have been proposed to address this challenge. To highlight the severity of this gap, this paper presents a brief review and an HPE benchmark from the perspective of algorithm performance and efficiency under UAV operation. More specifically, the contributions of HPE methods over the last 22 years are covered, along with the variety of methods that exist. The benchmark comprises 36 pose estimation models evaluated on 3 well-known datasets with metrics that match this focus. From the results, MobileNet-based models achieved competitive performance at the lowest computational cost compared with ResNet-based models. Finally, the benchmark results are projected onto edge-device hardware specifications to analyze the suitability of these algorithms for UAV deployment.
Journal Article
Human pose estimation based on lightweight basicblock
by Liu, Ruyi; Li, Yanping; Wang, Xiangyang
in Communications Engineering; Computer Science; Datasets
2023
Human pose estimation based on deep learning has attracted increasing attention in the past few years and has shown superior performance on various datasets. Many researchers have increased the number of network layers to improve model accuracy. However, as networks deepen, the parameters and computation of the model also increase, which prevents deployment on edge devices and mobile terminals with limited computing power and constrains many intelligent terminals in volume, power consumption, and storage. Inspired by lightweight methods, we propose a human pose estimation model based on a lightweight network to solve these problems. It designs a lightweight basic block module using depthwise separable convolution and an inverted bottleneck layer to accelerate network computation and reduce the parameters of the overall network model. Experiments on the COCO and MPII datasets prove that this lightweight basicblock module can effectively reduce the parameters and computation of the human pose estimation model.
Journal Article
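The parameter savings behind the lightweight basic block described above come largely from depthwise separable convolution. A back-of-the-envelope count illustrates the effect; the channel and kernel sizes below are illustrative choices, not figures from the paper:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv that mixes channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Example: a 3x3 layer with 256 input and 256 output channels.
standard = conv_params(256, 256, 3)                   # 589,824 params
separable = depthwise_separable_params(256, 256, 3)   # 67,840 params
print(standard, separable, round(standard / separable, 1))
```

Factoring the 3×3 convolution into a per-channel depthwise pass plus a 1×1 pointwise pass cuts this layer's parameters by roughly 8.7×, which is the kind of reduction the abstract reports in parameters and computation.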
Position Puzzle Network and Augmentation: localizing human keypoints beyond the bounding box
2023
When estimating human pose with a partial image of a person, we, humans, do not confine the spatial range of our estimation to the given image and can readily localize keypoints outside of the image by referring to visual clues such as the body size. However, computational methods for human pose estimation do not consider those keypoints outside and focus only on the bounded area of a given image. In this paper, we propose a neural network and a data augmentation method to extend the range of human pose estimation beyond the bounding box. While our Position Puzzle Network expands the spatial range of keypoint localization by refining the position and the size of the target’s bounding box, Position Puzzle Augmentation enables the keypoint detector to estimate keypoints not only within, but also beyond the input image. We show that the proposed method enhances the baseline keypoint detectors by 39.5% and 30.5% on average in mAP and mAR, respectively, by enabling the localization of keypoints out of the bounding box using a cropped image dataset prepared for proper evaluation. Additionally, we verify that the proposed method does not degrade the performance under the original benchmarks and instead, improves the performance by alleviating false-positive errors.
Journal Article
Unsupervised universal hierarchical multi-person 3D pose estimation for natural scenes
by Wang, Gaoang; Hwang, Jenq-Neng; McQuade, Kevin
in Cameras; Computer Communication Networks; Computer Science
2022
Multi-person 3D pose estimation using a monocular, freely moving camera in real-world scenarios remains a challenge. There is a lack of data with 3D ground truth, and real-world scenes usually contain self-occlusions and inter-person occlusions. To address these challenges, an unsupervised Universal Hierarchical 3D Human Pose Estimation (UH3DHPE) method that optimizes torso and limb poses within a hierarchical framework is proposed. To handle occluded or inaccurate 2D torso keypoints, which play an important role in 3D pose initialization and subsequent inference, an effective method is proposed to estimate limb poses directly, without building upon the estimated torso pose; the torso pose can then be further refined to form the hierarchy in a bottom-up fashion. An adaptive merging strategy is proposed to determine the best hierarchy. To verify the effectiveness of the proposed scheme, a video dataset of multi-person interactions is collected with a moving camera in a Motion Capture (MoCap) ground-truth acquisition environment for performance evaluation. Experimental results show the proposed method outperforms state-of-the-art methods in multi-person moving-camera scenarios.
Journal Article
Wide-baseline multi-camera calibration from a room filled with people
by Abedrabbo, G.; Domken, C.; Bey-Temsamani, A.
in Calibration; Cameras; Communications Engineering
2023
When a precise 3D reconstruction of an object or person is attempted, one typically starts from a multi-view setup with cameras spread around the investigation area. A triangulation of the matching joints is then performed to retrieve the 3D coordinates. However, calibrating such a setup typically requires dedicated equipment and elaborate test procedures. In this paper, we demonstrate a calibration method based only on the detection of one or more people walking through the field of view. This, in effect, allows the calibration to happen simultaneously with the measurements being taken, which is practical when dealing with uncontrolled environments. We also show that this calibration procedure is more accurate than a typical incremental calibration procedure using a chessboard. Conceptually, the novelty we propose lies in using semantic information (e.g. the position of the left shoulder) rather than appearance-based information to drive the calibration, as this type of information is less viewpoint-dependent. Note that we use human pose keypoints here, but for larger outdoor scenes, car keypoints could be used as well.
Journal Article
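The triangulation step the abstract mentions can be sketched for the simplest case, a rectified two-camera pair with the principal point at the origin; the function name and all numbers are illustrative, and a real wide-baseline setup would instead triangulate via the cameras' full projection matrices:

```python
def triangulate_rectified(xl, xr, y, f, baseline):
    """Triangulate one keypoint from a rectified stereo pair.
    xl, xr: horizontal pixel coordinates of the same keypoint in the
    left and right images (principal point at 0); y: shared vertical
    coordinate; f: focal length in pixels; baseline: camera spacing in m."""
    disparity = xl - xr          # horizontal shift between views
    Z = f * baseline / disparity # depth from similar triangles
    X = xl * Z / f               # back-project through the left camera
    Y = y * Z / f
    return X, Y, Z

# A left-shoulder keypoint detected in both views (illustrative numbers).
X, Y, Z = triangulate_rectified(xl=120.0, xr=100.0, y=40.0, f=800.0, baseline=0.5)
print(round(X, 3), round(Y, 3), round(Z, 3))  # 3.0 1.0 20.0
```

The calibration described in the abstract works in the opposite direction: given matched semantic keypoints across views, it recovers the camera geometry that makes such triangulations consistent.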
Dense depth alignment for human pose and shape estimation
2024
Estimating 3D human pose and shape (HPS) from a monocular image has many applications. However, collecting ground-truth data for this problem is costly and constrained to limited lab environments. Researchers have used priors based on body structure or kinematics, or cues obtained from other vision tasks, to mitigate the scarcity of supervision. Despite its apparent potential in this context, monocular depth estimation has yet to be explored. In this paper, we propose the Dense Depth Alignment (DDA) method, where we use an estimated dense depth map to create an auxiliary supervision signal for 3D HPS estimation. Specifically, we define a dense mapping between points on the surface of the human mesh and points reconstructed from depth estimation. We further introduce the idea of Camera Pretraining, a novel learning strategy where, instead of estimating all parameters simultaneously, learning of camera parameters is prioritized (before pose and shape parameters) to avoid unwanted local minima. Our experiments on the Human3.6M and 3DPW datasets show that our DDA loss and Camera Pretraining significantly improve HPS estimation performance over using only 2D keypoint supervision or 2D and 3D supervision. Code will be provided for research purposes at the following URL: https://terteros.github.io/hmr-depth/.
Journal Article
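The auxiliary supervision the abstract describes amounts to penalizing the distance between mesh-surface points and their depth-reconstructed counterparts under a dense mapping. A minimal, purely illustrative sketch follows; the point values and the mapping are invented here, whereas the actual method defines the correspondence densely over the body mesh surface:

```python
def dda_loss(mesh_points, depth_points, mapping):
    """Mean squared 3D distance between mesh-surface points and their
    corresponding depth-reconstructed points.
    mapping: list of (mesh_index, depth_index) correspondences."""
    total = 0.0
    for i, j in mapping:
        total += sum((a - b) ** 2
                     for a, b in zip(mesh_points[i], depth_points[j]))
    return total / len(mapping)

# Two corresponding point pairs (illustrative coordinates in meters).
mesh = [(0.0, 0.0, 2.0), (0.1, 0.5, 2.1)]
depth = [(0.0, 0.0, 2.2), (0.1, 0.5, 2.0)]
print(dda_loss(mesh, depth, [(0, 0), (1, 1)]))
```

Minimizing such a term pulls the estimated mesh toward the geometry implied by the monocular depth map, which is the sense in which depth acts as an auxiliary signal.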
A comprehensive survey on human pose estimation approaches
2023
Human pose estimation is a significant problem that has received attention in the computer vision community for recent decades. It is a vital step toward understanding people in videos and still images. In simple terms, a human pose estimation model takes in an image or video and estimates the position of a person's skeletal joints in either 2D or 3D space. Several studies on human pose estimation can be found in the literature; however, each centers on a specific class, for instance, model-based methodologies or human movement analysis. Later, various Deep Learning (DL) algorithms came into existence to overcome the difficulties present in earlier approaches. In this study, an exhaustive review of human pose estimation (HPE), including milestone work and recent advancements, is carried out. This survey discusses the different two-dimensional (2D) and three-dimensional (3D) human pose estimation techniques, along with their classical and deep learning approaches, which provide solutions to various computer vision problems. Moreover, the paper also considers the different deep learning models used in pose estimation, and an analysis of 2D and 3D datasets is carried out. Some of the evaluation metrics used for estimating human poses are also discussed. By recovering the orientation of individuals, HPE opens a road to several real-life applications, some of which are discussed in this study.
Journal Article
AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild
2021
Occlusion is probably the biggest challenge for human pose estimation in the wild. Typical solutions often rely on intrusive sensors such as IMUs to detect occluded joints. To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views. The core of AdaFuse is to determine the point-point correspondence between two views which we solve effectively by exploring the sparsity of the heatmap representation. We also learn an adaptive fusion weight for each camera view to reflect its feature quality in order to reduce the chance that good features are undesirably corrupted by “bad” views. The fusion model is trained end-to-end with the pose estimation network, and can be directly applied to new camera configurations without additional adaptation. We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic. It outperforms the state-of-the-arts on all of them. We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints, as it provides occlusion labels for every joint in the images. The dataset and code are released at https://github.com/zhezh/adafuse-3d-human-pose.
Journal Article
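The adaptive fusion weights described in the AdaFuse abstract can be illustrated with a toy weighted sum of per-view joint heatmaps; this simplification omits the paper's point-point correspondence mechanism, and the heatmaps and weights below are invented:

```python
def fuse_heatmaps(heatmaps, weights):
    """Weighted sum of per-view joint heatmaps; a low weight suppresses
    the contribution of an occluded ('bad') view."""
    h, w = len(heatmaps[0]), len(heatmaps[0][0])
    fused = [[sum(wt * hm[r][c] for hm, wt in zip(heatmaps, weights))
              for c in range(w)] for r in range(h)]
    # Joint location = argmax of the fused heatmap.
    peak = max(((r, c) for r in range(h) for c in range(w)),
               key=lambda rc: fused[rc[0]][rc[1]])
    return fused, peak

# View 1 sees the joint clearly; view 2 is occluded and fires elsewhere.
view1 = [[0.0, 0.9, 0.0],
         [0.0, 0.1, 0.0],
         [0.0, 0.0, 0.0]]
view2 = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.8, 0.0, 0.0]]
_, peak = fuse_heatmaps([view1, view2], weights=[0.9, 0.2])
print(peak)  # (0, 1): the reliable view dominates
```

Down-weighting the occluded view keeps its spurious response from corrupting the fused estimate, which is the intuition behind learning a per-view quality weight.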
Human Pose Estimation Using MediaPipe Pose and Optimization Method Based on a Humanoid Model
2023
Seniors who live alone at home are at risk of falling and injuring themselves and thus may need a mobile robot that monitors and recognizes their poses automatically. Even though deep learning methods are actively evolving in this area, they have limitations in estimating poses that are absent or rare in training datasets. For a lightweight approach, an off-the-shelf 2D pose estimation method, a more sophisticated humanoid model, and a fast optimization method are combined to estimate joint angles for 3D pose estimation. As a novel idea, the depth-ambiguity problem of 3D pose estimation is solved by adding a loss function penalizing deviation of the center of mass from the center of the supporting feet, together with penalty functions enforcing appropriate joint-angle rotation ranges. To verify the proposed pose estimation method, six daily poses were estimated with a mean joint coordinate difference of 0.097 m and an average angle difference per joint of 10.017 degrees. In addition, to confirm practicality, videos of exercise activities and a scene of a person falling were filmed, and the joint angle trajectories were produced as the 3D estimation results. The optimized execution time per frame was measured at 0.033 s on a single-board computer (SBC) without a GPU, showing the feasibility of the proposed method as a real-time system.
Journal Article
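The balance idea in the abstract above, penalizing center-of-mass deviation from the supporting feet plus out-of-range joint angles, can be sketched as a simple penalty term; the joint limits and coordinates below are illustrative stand-ins, not values from the paper:

```python
def balance_penalty(com_xy, support_center_xy, joint_angles, angle_limits):
    """Illustrative penalty combining two terms:
    1) squared horizontal deviation of the center of mass (CoM) from
       the center of the supporting feet, and
    2) a quadratic penalty for joint angles outside their allowed
       range (angle_limits: name -> (lo, hi) in degrees)."""
    dx = com_xy[0] - support_center_xy[0]
    dy = com_xy[1] - support_center_xy[1]
    com_term = dx * dx + dy * dy
    range_term = 0.0
    for name, angle in joint_angles.items():
        lo, hi = angle_limits[name]
        if angle < lo:
            range_term += (lo - angle) ** 2
        elif angle > hi:
            range_term += (angle - hi) ** 2
    return com_term + range_term

limits = {"knee": (0.0, 150.0), "elbow": (0.0, 145.0)}
p = balance_penalty((0.03, 0.01), (0.0, 0.0),
                    {"knee": 160.0, "elbow": 90.0}, limits)
print(round(p, 4))  # 0.001 from the CoM term + 100.0 from knee over-extension
```

Adding such terms to the optimization objective discourages physically implausible 3D poses, which is how the abstract resolves depth ambiguity without extra sensors.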