Catalogue Search | MBRL
21 result(s) for "Cazzato, Dario"
When I Look into Your Eyes: A Survey on Computer Vision Contributions for Human Gaze Estimation and Tracking
2020
The automatic detection of eye positions, their temporal consistency, and their mapping into a line of sight in the real world (to find where a person is looking) is reported in the scientific literature as gaze tracking. This has become a very hot topic in the field of computer vision during the last decades, with a surprising and continuously growing number of application fields. A very long journey has been made from the first pioneering works, and this continuous search for more accurate solutions has been further boosted in the last decade, when deep neural networks revolutionized the whole machine learning area, and gaze tracking as well. In this arena, it is increasingly useful to find guidance through survey/review articles that collect the most relevant works and lay out the pros and cons of existing techniques, also by introducing a precise taxonomy. This kind of manuscript allows researchers and technicians to choose the best way to move towards their application or scientific goals. In the literature, there exist holistic and specifically technological survey documents (even if not updated), but, unfortunately, there is no overview discussing how the great advancements in computer vision have impacted gaze tracking. Thus, this work represents an attempt to fill this gap, also introducing a wider point of view that leads to a new taxonomy (extending the consolidated ones) by considering gaze tracking as a more exhaustive task that aims at estimating gaze target from different perspectives: from the eye of the beholder (first-person view), from an external camera framing the beholder’s, from a third-person view looking at the scene where the beholder is placed, and from an external view independent from the beholder.
Journal Article
An Application-Driven Survey on Event-Based Neuromorphic Computer Vision
2024
Traditional frame-based cameras, despite their effectiveness and usage in computer vision, exhibit limitations such as high latency, low dynamic range, high power consumption, and motion blur. For two decades, researchers have explored neuromorphic cameras, which operate differently from traditional frame-based types, mimicking biological vision systems for enhanced data acquisition and spatio-temporal resolution. Each pixel asynchronously captures intensity changes in the scene above certain user-defined thresholds, and streams of events are captured. However, the distinct characteristics of these sensors mean that traditional computer vision methods are not directly applicable, necessitating the investigation of new approaches before being applied in real applications. This work aims to fill existing gaps in the literature by providing a survey and a discussion centered on the different application domains, differentiating between computer vision problems and whether solutions are better suited for or have been applied to a specific field. Moreover, an extensive discussion highlights the major achievements and challenges, in addition to the unique characteristics, of each application field.
Journal Article
A Survey of Computer Vision Methods for 2D Object Detection from Unmanned Aerial Vehicles
by Cazzato, Dario; Sanchez-Lopez, Jose Luis; Cimarelli, Claudio
in 2d object detection, Aircraft, Cameras
2020
The spread of Unmanned Aerial Vehicles (UAVs) in the last decade revolutionized many application fields. The most investigated research topics focus on increasing autonomy during operational campaigns, environmental monitoring, surveillance, maps, and labeling. To achieve such complex goals, a high-level module is exploited to build semantic knowledge leveraging the outputs of the low-level module that takes data acquired from multiple sensors and extracts information concerning what is sensed. All in all, the detection of the objects is undoubtedly the most important low-level task, and the most employed sensors to accomplish it are by far RGB cameras due to costs, dimensions, and the wide literature on RGB-based object detection. This survey presents recent advancements in 2D object detection for the case of UAVs, focusing on the differences, strategies, and trade-offs between the generic problem of object detection and the adaptation of such solutions for operations of the UAV. Moreover, a new taxonomy is proposed that considers different height intervals and is driven by the methodological approaches introduced by the works in the state of the art instead of hardware, physical and/or technological constraints.
Journal Article
Unsupervised Eye Pupil Localization through Differential Geometry and Local Self-Similarity Matching
by Cazzato, Dario; De Marco, Tommaso; Leo, Marco
in Algorithms, Artificial Intelligence, Biology and Life Sciences
2014
The automatic detection and tracking of human eyes and, in particular, the precise localization of their centers (pupils), is a widely debated topic in the international scientific community. In fact, the extracted information can be effectively used in a large number of applications ranging from advanced interfaces to biometrics and including also the estimation of the gaze direction, the control of human attention and the early screening of neurological pathologies. Independently of the application domain, the detection and tracking of the eye centers are currently performed mainly using invasive devices. Cheaper and more versatile systems have been only recently introduced: they make use of image processing techniques working on periocular patches which can be specifically acquired or preliminarily cropped from facial images. In the latter cases the involved algorithms must work even under non-ideal acquisition conditions (e.g., in the presence of noise, low spatial resolution, non-uniform lighting conditions, etc.) and without the user's awareness (thus with possible variations of the eye in scale, rotation and/or translation). Getting satisfying results in pupil localization in such challenging operating conditions is still an open scientific topic in Computer Vision. Currently, the best-performing solutions in the literature are, unfortunately, based on supervised machine learning algorithms which require initial sessions to set the working parameters and to train the embedded learning models of the eye: this way, experienced operators have to work on the system each time it is moved from one operational context to another. It follows that the use of unsupervised approaches is more and more desirable but, unfortunately, their performance is still not satisfactory and more investigations are required.
To this end, this paper proposes a new unsupervised approach to automatically detect the center of the eye: its algorithmic core is a representation of the eye's shape that is obtained through a differential analysis of image intensities and the subsequent combination with the local variability of the appearance represented by self-similarity coefficients. The experimental evidence of the effectiveness of the method was demonstrated on challenging databases containing facial images. Moreover, its capabilities to accurately detect the centers of the eyes were also favourably compared with those of the leading state-of-the-art methods.
Journal Article
Analysis of Facial Information for Healthcare Applications: A Survey on Computer Vision-Based Approaches
by Cazzato, Dario; Carcagnì, Pierluigi; Spagnolo, Paolo
in computer vision, eye gaze tracking, face analysis
2020
This paper gives an overview of the cutting-edge approaches that perform facial cue analysis in the healthcare area. The document is not limited to global face analysis but also concentrates on methods related to local cues (e.g., the eyes). A research taxonomy is introduced by dividing the face into its main features: eyes, mouth, muscles, skin, and shape. For each facial feature, the computer vision-based tasks aiming at analyzing it and the related healthcare goals that could be pursued are detailed.
Journal Article
A Systematic Parametric Campaign to Benchmark Event Cameras in Computer Vision Tasks
2025
The dynamic vision sensor (DVS), or event camera, is emerging as a successful sensing solution for many application fields. While state-of-the-art datasets for event-based vision are well-structured and suitable for the designed goals, they often rely on simulated data or are recorded in loosely controlled conditions, thereby making it challenging to understand the sensor response to varying camera parameters and illumination conditions. To address this knowledge gap, this work introduces the JRC INVISIONS Neuromorphic Sensors Parametric Tests dataset, an extensive collection of event-based data specifically acquired in controlled scenarios that systematically vary bias settings and environmental factors, enabling rigorous evaluation of sensor performance, robustness, and artifacts under realistic conditions that existing datasets lack. The dataset is composed of 2156 scenes recorded with two different off-the-shelf event cameras, eventually paired with a frame camera across three different controlled scenarios: moving targets, mechanical vibrations, and rotation speed estimation; the inclusion of ground truth enables the evaluation of standard computer vision tasks. The proposed manuscript is complemented by an experimental analysis of sensor performance under varying speeds and illumination, event statistics, and acquisition artifacts such as event loss and motion-induced distortions due to line-based readout. The dataset is publicly available and, to the best of our knowledge, represents the first dataset of its kind in the literature, providing a valuable resource for the research community to advance the development of event-based vision systems and applications.
Journal Article
An Investigation on the Feasibility of Uncalibrated and Unconstrained Gaze Tracking for Human Assistive Applications by Using Head Pose Estimation
by Cazzato, Dario; Leo, Marco; Distante, Cosimo
in Calibration, Cameras, Colorimetry - instrumentation
2014
This paper investigates the possibility of accurately detecting and tracking human gaze by using an unconstrained and noninvasive approach based on the head pose information extracted by an RGB-D device. The main advantages of the proposed solution are that it can operate in a totally unconstrained environment, it does not require any initial calibration and it can work in real-time. These features make it suitable for assisting humans in everyday life (e.g., remote device control) or in specific actions (e.g., rehabilitation), and in general in all those applications where it is not possible to ask for user cooperation (e.g., when users with neurological impairments are involved). To evaluate gaze estimation accuracy, the proposed approach has been extensively tested and the results are compared with the leading methods in the state of the art, which, in general, make use of strong constraints on people's movements, invasive/additional hardware and supervised pattern recognition modules. Experimental tests demonstrated that, in most cases, the errors in gaze estimation are comparable to those of the state-of-the-art methods, although the approach works without additional constraints, calibration and supervised learning.
Journal Article
Ocular Biometrics Recognition by Analyzing Human Exploration during Video Observations
2020
Soft biometrics provide information about the individual but without the distinctiveness and permanence able to discriminate between any two individuals. Since the gaze represents one of the most investigated human traits, works evaluating the feasibility of considering it as a possible additional soft biometric trait have recently appeared in the literature. Unfortunately, there is a lack of systematic studies on clinically approved stimuli to provide evidence of the correlation between exploratory paths and individual identities in “natural” scenarios (without calibration, imposed constraints, wearable tools). To overcome these drawbacks, this paper analyzes gaze patterns by using a computer vision based pipeline in order to prove the correlation between visual exploration and user identity. This correlation is robustly computed in a free exploration scenario, neither biased by wearable devices nor constrained to a prior personalized calibration. The provided stimuli have been designed by clinical experts and thus allow better analysis of human exploration behaviors. In addition, the paper introduces a novel public dataset that provides, for the first time, images framing the faces of the involved subjects instead of only their gaze tracks.
Journal Article
An Ecological Visual Exploration Tool to Support the Analysis of Visual Processing Pathways in Children with Autism Spectrum Disorders
by Cazzato, Dario; Bernava, Giuseppe; Leo, Marco
in activity recognition, affective computing, assistive computer vision
2018
Recent improvements in the field of assistive technologies have led to innovative solutions aiming at increasing the capabilities of people with disabilities, helping them in daily activities with applications that span from cognitive impairments to developmental disabilities. In particular, in the case of Autism Spectrum Disorder (ASD), the need to obtain active feedback in order to subsequently extract meaningful data becomes of fundamental importance. In this work, a study about the possibility of understanding the visual exploration in children with ASD is presented. In order to obtain an automatic evaluation, an algorithm for free (i.e., without constraints, nor using additional hardware, infrared (IR) light sources or other intrusive methods) gaze estimation is employed. Furthermore, no initial calibration is required. It allows the user to freely rotate the head in the field of view of the sensor, and it is insensitive to the presence of eyeglasses, hats or particular hairstyles. These relaxations of the constraints make this technique particularly suitable to be used in the critical context of autism, where the child is certainly not inclined to employ invasive devices, nor to collaborate during calibration procedures. The evaluation of children’s gaze trajectories through the proposed solution is presented for the purpose of an Early Start Denver Model (ESDM) program built on the child’s spontaneous interests and game choice delivered in a natural setting.
Journal Article
A study on different experimental configurations for age, race, and gender estimation problems
2015
This paper presents a detailed study of different algorithmic configurations for estimating soft biometric traits. In particular, a recently introduced common framework is the starting point of the study: it includes an initial facial detection, the subsequent facial traits description, the data reduction step, and the final classification step. The algorithmic configurations are characterized by different descriptors and different strategies to build the training dataset and to scale the data in input to the classifier. Experiments have been carried out on both publicly available datasets and image sequences specifically acquired in order to evaluate the performance even under real-world conditions, i.e., in the presence of scaling and rotation.
Journal Article