87 result(s) for "vision-based sensing"
Review on vision-based tracking in surgical navigation
Computer vision is an important cornerstone for the foundation of many modern technologies. The development of modern computer-aided-surgery, especially in the context of surgical navigation for minimally invasive surgery, is one example. Surgical navigation provides the necessary spatial information in computer-aided-surgery. Amongst the various forms of perception, vision-based sensing has been proposed as a promising candidate for tracking and localisation application largely due to its ability to provide timely intra-operative feedback and contactless sensing. The motivation for vision-based sensing in surgical navigation stems from many factors, including the challenges faced by other forms of navigation systems. A common surgical navigation system performs tracking of surgical tools with external tracking systems, which may suffer from both technical and usability issues. Vision-based tracking offers a relatively streamlined framework compared to those approaches implemented with external tracking systems. This review study aims to discuss contemporary research and development in vision-based sensing for surgical navigation. The selected review materials are expected to provide a comprehensive appreciation of state-of-the-art technology and technical issues enabling holistic discussions of the challenges and knowledge gaps in contemporary development. Original views on the significance and development prospect of vision-based sensing in surgical navigation are presented.
Smartphone Prospects in Bridge Structural Health Monitoring, a Literature Review
Bridges are critical components of transportation networks, and their conditions have effects on societal well-being, the economy, and the environment. Automation needs in inspections and maintenance have made structural health monitoring (SHM) systems a key research pillar to assess bridge safety/health. The last decade brought a boom in innovative bridge SHM applications with the rise in next-generation smart and mobile technologies. A key advancement within this direction is smartphones with their sensory usage as SHM devices. This focused review reports recent advances in bridge SHM backed by smartphone sensor technologies and provides case studies on bridge SHM applications. The review includes model-based and data-driven SHM prospects utilizing smartphones as the sensing and acquisition portal and conveys three distinct messages in terms of the technological domain and level of mobility: (i) vibration-based dynamic identification and damage-detection approaches; (ii) deformation and condition monitoring empowered by computer vision-based measurement capabilities; (iii) drive-by or pedestrianized bridge monitoring approaches, and miscellaneous SHM applications with unconventional/emerging technological features and new research domains. The review is intended to bring together bridge engineering, SHM, and sensor technology audiences with decade-long multidisciplinary experience observed within the smartphone-based SHM theme and presents exemplary cases referring to a variety of levels of mobility.
A Vision-Based Deep Learning Framework for Monitoring and Recognition of Chemical Laboratory Operations
Standardized operating procedures are essential for ensuring safety and reproducibility in chemical laboratory experiments. However, real-time monitoring of manual laboratory operations, such as pipetting, remains challenging due to complex human–tool interactions, temporal dependencies between procedural steps, and operator variability. In this study, we propose a vision-based deep learning framework that leverages spatiotemporal features for automated monitoring of pipetting operations using non-contact visual sensing. Briefly, human poses and pipette interactions are extracted from video recordings using a YOLO-based perception model, while temporal execution patterns are captured through bidirectional long short-term memory networks. Experimental results demonstrate that the proposed approach can reliably distinguish between standard and non-standard pipetting behaviors across multiple predefined error categories and shows improved robustness compared with static or frame-level analysis. Overall, this work demonstrates the feasibility of vision-based AI systems for objective and scalable monitoring of laboratory pipetting operations, with potential applicability to other manual laboratory procedures.
Visual Attention and Emotion Analysis Based on Qualitative Assessment and Eyetracking Metrics—The Perception of a Video Game Trailer
Video game trailers are very useful tools for attracting potential players. This research focuses on analyzing the emotions that arise while viewing video game trailers and the link between these emotions and storytelling and visual attention. The methodology consisted of a three-step task test with potential users: the first step was to identify the perception of indie games; the second step was to use the eyetracking device (gaze plot, heat map, and fixation points) and link them to fixation points (attention), viewing patterns, and non-visible areas; the third step was to interview users to understand impressions and questionnaires of emotions related to the trailer’s storytelling and expectations. The results show an effective assessment of visual attention together with visualization patterns, non-visible areas that may affect game expectations, fixation points linked to very specific emotions, and perceived narratives based on the gaze plot. The innovation in the mixed methodological approach has made it possible to obtain relevant data regarding the link between the emotions perceived by the user and the areas of attention collected with the device. The proposed methodology enables developers to understand the strengths and weaknesses of the information being conveyed so that they can tailor the trailer to the expectations of potential players.
Optimal Deployment of Charging Stations for Aerial Surveillance by UAVs with the Assistance of Public Transportation Vehicles
To overcome the limitation in flight time and enable unmanned aerial vehicles (UAVs) to survey remote sites of interest, this paper investigates an approach involving the collaboration with public transportation vehicles (PTVs) and the deployment of charging stations. In particular, the focus of this paper is on the deployment of charging stations. In this approach, a UAV first travels with some PTVs, and then flies through some charging stations to reach remote sites. While the travel time with PTVs can be estimated by the Monte Carlo method to accommodate various uncertainties, we propose a new coverage model to compute the travel time taken for UAVs to reach the sites. With this model, we formulate the optimal deployment problem with the goal of minimising the average travel time of UAVs from the depot to the sites, which can be regarded as a reflection of the quality of surveillance (QoS) (the shorter the better). We then propose an iterative algorithm to place the charging stations. We show that this algorithm ensures that any movement of a charging station leads to a decrease in the average travel time of UAVs. To demonstrate the effectiveness of the proposed method, we make a comparison with a baseline method. The results show that the proposed model can more accurately estimate the travel time than the most commonly used model, and the proposed algorithm can relocate the charging stations to achieve a lower flight distance than the baseline method.
TactiGraph: An Asynchronous Graph Neural Network for Contact Angle Prediction Using Neuromorphic Vision-Based Tactile Sensing
Vision-based tactile sensors (VBTSs) have become the de facto method for giving robots the ability to obtain tactile feedback from their environment. Unlike other solutions to tactile sensing, VBTSs offer high spatial resolution feedback without compromising on instrumentation costs or incurring additional maintenance expenses. However, conventional cameras used in VBTS have a fixed update rate and output redundant data, leading to computational overhead. In this work, we present a neuromorphic vision-based tactile sensor (N-VBTS) that employs observations from an event-based camera for contact angle prediction. In particular, we design and develop a novel graph neural network, dubbed TactiGraph, that asynchronously operates on graphs constructed from raw N-VBTS streams exploiting their spatiotemporal correlations to perform predictions. Although conventional VBTSs use an internal illumination source, TactiGraph is reported to perform efficiently in both scenarios (with and without an internal illumination source) thus further reducing instrumentation costs. Rigorous experimental results revealed that TactiGraph achieved a mean absolute error of 0.62° in predicting the contact angle and was faster and more efficient than both conventional VBTS and other N-VBTS, with lower instrumentation costs. Specifically, N-VBTS requires only 5.5% of the computing time needed by VBTS when both are tested on the same scenario.
Enhancing Autonomous Orchard Navigation: A Real-Time Convolutional Neural Network-Based Obstacle Classification System for Distinguishing ‘Real’ and ‘Fake’ Obstacles in Agricultural Robotics
Autonomous navigation in agricultural environments requires precise obstacle classification to ensure collision-free movement. This study proposes a convolutional neural network (CNN)-based model designed to enhance obstacle classification for agricultural robots, particularly in orchards. Building upon a previously developed YOLOv8n-based real-time detection system, the model incorporates Ghost Modules and Squeeze-and-Excitation (SE) blocks to enhance feature extraction while maintaining computational efficiency. Obstacles are categorized as “Real”—those that physically impact navigation, such as tree trunks and persons—and “Fake”—those that do not, such as tall weeds and tree branches—allowing for precise navigation decisions. The model was trained on separate orchard and campus datasets and fine-tuned using Hyperband optimization and evaluated on an external test set to assess generalization to unseen obstacles. The model’s robustness was tested under varied lighting conditions, including low-light scenarios, to ensure real-world applicability. Computational efficiency was analyzed based on inference speed, memory consumption, and hardware requirements. Comparative analysis against state-of-the-art classification models (VGG16, ResNet50, MobileNetV3, DenseNet121, EfficientNetB0, and InceptionV3) confirmed the proposed model’s superior precision (p), recall (r), and F1-score, particularly in complex orchard scenarios. The model maintained strong generalization across diverse environmental conditions, including varying illumination and previously unseen obstacles. Furthermore, computational analysis revealed that the orchard-combined model achieved the highest inference speed at 2.31 FPS while maintaining a strong balance between accuracy and efficiency. When deployed in real-time, the model achieved 95.0% classification accuracy in orchards and 92.0% in campus environments. 
The real-time system demonstrated a false positive rate of 8.0% in the campus environment and 2.0% in the orchard, with a consistent false negative rate of 8.0% across both environments. These results validate the model’s effectiveness for real-time obstacle differentiation in agricultural settings. Its strong generalization, robustness to unseen obstacles, and computational efficiency make it well-suited for deployment in precision agriculture. Future work will focus on enhancing inference speed, improving performance under occlusion, and expanding dataset diversity to further strengthen real-world applicability.
Robust and versatile vision-based dynamic displacement monitoring of natural feature targets in large-scale structures
Dynamic displacement response is an essential indicator for assessing structural state and performance. Vision-based structural displacement monitoring is considered a promising approach. However, the current vision-based methods usually only focus on certain application scenarios. This study introduces a Sparse Bayesian Learning-based (SBL) algorithm to enhance robustness, accuracy, and computational efficiency in target tracking. Furthermore, a robust and versatile Vision-based Dynamic Displacement Monitoring System (VDDMS) was developed, capable of monitoring displacements in varying application scenarios. The robustness of the proposed algorithm under changing illumination conditions is validated through a specially designed indoor experiment. The feasibility of field application of VDDMS is confirmed through an outdoor shear wall shaking table test. Furthermore, a large-scale bridge shaking table test is conducted to evaluate the reliability and versatility of VDDMS in monitoring natural feature targets on large structures subjected to different seismic excitations. The root mean square error, when compared to laser displacement sensors, ranges from 0.2% to 2.9% of the peak-to-peak displacement. Additionally, VDDMS accurately identifies multi-order frequencies in bridge structures. The study investigates the influence of initial template selection on accuracy, highlighting the significance of distinctive texture features. Moreover, two error evaluation schemes are proposed to quickly assess the reliability of vision-based displacement sensing technologies in various application scenarios.
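The accuracy figure quoted in the abstract above, root mean square error expressed as a percentage of peak-to-peak displacement, can be illustrated with a minimal sketch. The function name, the sample values, and the choice to take the peak-to-peak range from the reference (laser) signal are assumptions made here for illustration; the abstract does not state the exact normalisation used.

```python
import math


def rmse_percent_of_peak_to_peak(measured, reference):
    """Return the RMSE between two equal-length displacement series,
    expressed as a percentage of the reference peak-to-peak range.

    Hypothetical helper: normalising by the reference range is an assumption.
    """
    if len(measured) != len(reference) or len(reference) == 0:
        raise ValueError("series must be non-empty and of equal length")

    # Root mean square of the sample-by-sample differences.
    squared_errors = [(m - r) ** 2 for m, r in zip(measured, reference)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))

    # Peak-to-peak displacement of the reference signal.
    peak_to_peak = max(reference) - min(reference)
    if peak_to_peak == 0:
        raise ValueError("reference signal has zero peak-to-peak range")

    return 100.0 * rmse / peak_to_peak


# Illustrative values only (millimetres), not data from the study.
laser = [0.0, 10.0, 0.0, -10.0]
vision = [0.5, 10.5, 0.5, -9.5]
error_percent = rmse_percent_of_peak_to_peak(vision, laser)
```

With these made-up samples every reading is off by 0.5 mm against a 20 mm peak-to-peak range, so the helper reports an error of 2.5%.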
Application Framework and Optimal Features for UAV-Based Earthquake-Induced Structural Displacement Monitoring
Unmanned aerial vehicle (UAV) vision-based sensing has become an emerging technology for structural health monitoring (SHM) and post-disaster damage assessment of civil infrastructure. This article proposes a framework for monitoring structural displacement under earthquakes by reprojecting image points obtained from UAV-captured videos to the 3-D world space based on the world-to-image point correspondences. To identify optimal features in the UAV imagery, geo-reference targets with various patterns were installed on a test building specimen, which was then subjected to earthquake shaking. A feature point tracking-based algorithm for square checkerboard patterns and a Hough Transform-based algorithm for concentric circular patterns are developed to ensure reliable detection and tracking of image features. Photogrammetry techniques are applied to reconstruct the 3-D world points and extract structural displacements. The proposed methodology is validated by monitoring the displacements of a full-scale 6-story mass timber building during a series of shake table tests. Reasonable accuracy is achieved in that the overall root-mean-square errors of the tracking results are at the millimeter level compared to ground truth measurements from analog sensors. Insights on optimal features for monitoring structural dynamic response are discussed based on statistical analysis of the error characteristics for the various reference target patterns used to track the structural displacements.
Lightweight CNN–Mamba Hybrid Network for Multi-Scale Concrete Crack Segmentation Using Vision Sensors
Surface cracking is a key visible indicator of deterioration in concrete infrastructure and is routinely captured by vision sensors during field inspections. To translate inspection imagery into actionable maintenance information, crack delineation must be accurate at the pixel level and robust to challenging conditions where cracks are slender, discontinuous, low-contrast, and easily confused with joints, stains, texture patterns, and illumination artifacts. This study proposes a lightweight CNN–Mamba hybrid segmentation framework built upon Vm-unet for reliable crack mapping under heterogeneous inspection scenarios and resource-constrained deployment. The framework couples boundary-sensitive convolutional features with long-range state-space representations via a spatially modulated convolution design, refines skip-connection features using reciprocal co-modulation attention to suppress background interference, and enhances cross-scale interactions through a decoder interaction fusion scheme to preserve fine-crack continuity and sharp boundaries. Experiments on a multi-source composite dataset and public benchmarks show consistent improvements over representative CNN-, Transformer-, and Mamba-based baselines. The proposed method achieves 80.11% mIoU and 82.05% Dice on the composite dataset, while maintaining an efficient accuracy–cost trade-off (36.049 GFLOPs, 25.991 M parameters). The resulting crack masks provide a dependable basis for inspection-driven quantitative assessment and maintenance decision support.
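The two segmentation scores named in the abstract above, IoU and Dice, can be sketched for a single binary crack mask as follows. This is a minimal illustration under stated assumptions: the function name and the tiny masks are hypothetical, and the abstract does not say how the per-class IoU values are averaged into mIoU, so only the single-class computation is shown.

```python
def iou_and_dice(pred_mask, true_mask):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1.

    Hypothetical helper; treating two empty masks as a perfect match is a
    convention assumed here, not something taken from the study.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # crack pixel predicted correctly
            elif p and not t:
                fp += 1  # background predicted as crack
            elif t and not p:
                fn += 1  # crack pixel missed

    if tp + fp + fn == 0:
        return 1.0, 1.0

    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


# Illustrative 2x3 masks only, not data from the study.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
iou, dice = iou_and_dice(predicted, ground_truth)
```

For these masks there are two true positives, one false positive and one false negative, giving an IoU of 0.5 and a Dice score of about 0.667. Dice is never lower than IoU on the same mask, which is consistent with the abstract reporting the higher value for Dice.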