26 results for "Petković, Tomislav"
Linear Regression vs. Deep Learning: A Simple Yet Effective Baseline for Human Body Measurement
We propose a linear regression model for the estimation of human body measurements. The input to the model consists only of information that a person can self-estimate, such as height and weight. We evaluate our model against state-of-the-art approaches for body measurement from point clouds and images, demonstrate comparable performance with the best methods, and even outperform several deep learning models on public datasets. The simplicity of the proposed regression model makes it well suited as a baseline, in addition to being convenient for applications such as virtual try-on. To improve the repeatability of the results of our baseline and the competing methods, we provide guidelines toward standardized body measurement estimation.
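A minimal sketch of such a baseline, assuming the inputs are just self-reported height and weight and the targets are a few circumference measurements; the column meanings and the synthetic numbers below are hypothetical, not the paper's dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: self-reported height [cm] and weight [kg]
# paired with measured chest, waist and hip circumferences [cm].
X_train = np.array([[170, 65], [182, 90], [158, 52], [175, 78], [190, 101]])
y_train = np.array([[92, 78, 96], [104, 94, 106], [84, 66, 88],
                    [98, 86, 100], [110, 100, 112]])

model = LinearRegression().fit(X_train, y_train)

# Predict measurements for a new person who only knows height and weight.
print(model.predict([[168, 70]]))  # -> [[chest, waist, hip]] estimate
```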
Automated Trading Framework Using LLM-Driven Features and Deep Reinforcement Learning
Stock trading faces significant challenges due to market volatility and the complexity of integrating diverse data sources, such as financial texts and numerical market data. This paper proposes an innovative automated trading system that integrates advanced natural language processing (NLP) and deep reinforcement learning (DRL) to address these challenges. The system combines two novel components: PrimoGPT, a Transformer-based NLP model fine-tuned on financial texts using instruction-based datasets to generate actionable features like sentiment and trend direction, and PrimoRL, a DRL model that expands its state space with these NLP-derived features for enhanced decision-making precision compared to traditional DRL models like FinRL. An experimental evaluation over seven months of leading technology stocks reveals cumulative returns of up to 58.47% for individual stocks and 27.14% for a diversified portfolio, with a Sharpe ratio of 1.70, outperforming traditional and advanced benchmarks. This work advances AI-driven quantitative finance by offering a scalable framework that bridges qualitative analysis and strategic action, thereby fostering smarter and more equitable participation in financial markets.
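The coupling of the two components can be illustrated with a small sketch: numerical market features are concatenated with LLM-derived signals (sentiment, trend direction) to form the agent's state. The feature names and values here are hypothetical; the actual PrimoGPT/PrimoRL interfaces are not described in the abstract.

```python
import numpy as np

def build_state(market_features, llm_features):
    """Concatenate numerical market data with LLM-derived signals
    (e.g. sentiment in [-1, 1], trend direction in {-1, 0, 1})
    into a single state vector for the trading agent."""
    return np.concatenate([market_features, llm_features]).astype(np.float32)

# Hypothetical daily observation: [close, volume_z, rsi] + [sentiment, trend]
state = build_state(np.array([182.3, 0.41, 55.2]), np.array([0.6, 1.0]))
print(state.shape)  # (5,) -- the expanded state fed to the DRL policy
```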
Calibration of a Structured Light Imaging System in Two-Layer Flat Refractive Geometry for Underwater Imaging
The development of a robust 3D imaging system for underwater applications is challenging because of the physical properties of the underwater environment. Calibration is an essential step in the application of such imaging systems and is performed to acquire the parameters of the image formation model and to enable 3D reconstruction. We present a novel calibration method for an underwater 3D imaging system comprising a pair of cameras, a projector, and a single glass interface that is shared between the cameras and the projector. The image formation model is based on the axial camera model. The proposed calibration uses a numerical optimization of a 3D cost function to determine all system parameters, thus avoiding the minimization of re-projection errors, which requires numerically solving a 12th-order polynomial equation multiple times for each observed point. We also propose a novel, stable approach to estimating the axis of the axial camera model. The proposed calibration was experimentally evaluated on four different glass interfaces, and several quantitative results are reported, including the re-projection error. The achieved mean angular error of the system's axis was under 6°, and the mean absolute errors for the reconstruction of a flat surface were 1.38 mm for normal glass interfaces and 2.82 mm for the laminated glass interface, which is more than sufficient for the application.
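The central point, fitting system parameters by minimizing a 3D cost rather than a re-projection error, can be illustrated with a much simpler stand-in problem; the refractive axial-camera model itself is not reproduced here, and the toy estimates only a rigid transform with a generic least-squares solver.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

# Toy stand-in for a 3D cost: residuals are distances in 3D space between
# transformed model points and their measured counterparts, not
# re-projection errors in the image plane.
rng = np.random.default_rng(0)
model_pts = rng.uniform(-1, 1, (50, 3))
true_R = Rotation.from_euler("xyz", [5, -3, 10], degrees=True)
true_t = np.array([0.2, -0.1, 0.05])
measured = true_R.apply(model_pts) + true_t + rng.normal(0, 1e-3, (50, 3))

def residuals(params):
    R = Rotation.from_rotvec(params[:3])   # rotation as a rotation vector
    t = params[3:]
    return (R.apply(model_pts) + t - measured).ravel()

sol = least_squares(residuals, x0=np.zeros(6))
print(sol.x[3:])  # recovered translation, close to true_t
```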
Addressing the generalization of 3D registration methods with a featureless baseline and an unbiased benchmark
Recent 3D registration methods are mostly learning-based: they either find and match correspondences in feature space, or directly estimate the registration transformation from the given point cloud features. Consequently, these feature-based methods have difficulty generalizing to point clouds that differ substantially from their training data. This issue is not readily apparent because of problematic benchmark definitions that cannot provide any in-depth analysis and are biased toward similar data. We therefore propose a methodology for creating a 3D registration benchmark from a given point cloud dataset that provides a more informative evaluation of a method than other benchmarks. Using this methodology, we create a novel FAUST-partial (FP) benchmark, based on the FAUST dataset, with several difficulty levels. The FP benchmark addresses the limitations of current benchmarks, namely the lack of data and parameter-range variability, and allows the strengths and weaknesses of a 3D registration method to be evaluated with respect to a single registration parameter. Using the new FP benchmark, we provide a thorough analysis of the current state-of-the-art methods and observe that they still struggle to generalize to severely different out-of-sample data. We therefore propose a simple, featureless, traditional 3D registration baseline based on the weighted cross-correlation between two given point clouds. Our method achieves strong results on current benchmarking datasets, outperforming most deep learning methods. Our source code is available at github.com/DavidBoja/exhaustive-grid-search.
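A rough sketch of the featureless idea, exhaustively scoring translations via cross-correlation of voxelized point clouds, is given below. Only the translation search for a fixed rotation is shown; the weighting scheme and the rotation grid from the paper are omitted, and the clouds are synthetic.

```python
import numpy as np
from scipy.signal import correlate

def voxelize(points, voxel_size, grid_min, dims):
    """Binary occupancy grid of a point cloud."""
    idx = np.floor((points - grid_min) / voxel_size).astype(int)
    idx = idx[np.all((idx >= 0) & (idx < dims), axis=1)]
    grid = np.zeros(dims, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

# Synthetic source/target clouds; the best integer voxel shift is the argmax
# of the 3D cross-correlation between the two occupancy grids.
rng = np.random.default_rng(1)
target = rng.uniform(0, 1, (500, 3))
source = target + np.array([0.2, 0.0, -0.1])          # known translation
dims, vs, gmin = np.array([64, 64, 64]), 1 / 32, np.array([-1.0, -1.0, -1.0])
corr = correlate(voxelize(source, vs, gmin, dims),
                 voxelize(target, vs, gmin, dims), mode="full", method="fft")
shift = np.array(np.unravel_index(np.argmax(corr), corr.shape)) - (dims - 1)
print(shift * vs)  # close to the known translation [0.2, 0.0, -0.1]
```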
An evaluation of real-time RGB-D visual odometry algorithms on mobile devices
We present an evaluation and a comparison of different visual odometry algorithms selected to be tested on a mobile device equipped with an RGB-D camera. The most promising algorithms from the literature are tested on different mobile devices, some equipped with the Structure Sensor. We evaluate the accuracy of each selected algorithm as well as its memory and CPU consumption, and we show that, even without specific optimization, some of them can provide a reliable measure of the device motion in real time.
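As an illustration of the kind of accuracy evaluation described above, the sketch below computes a simple absolute trajectory error (RMSE of translational differences) between an estimated and a ground-truth trajectory. The alignment step used by full benchmarks is omitted, and the trajectories are synthetic.

```python
import numpy as np

def trajectory_rmse(estimated, ground_truth):
    """Root-mean-square error between corresponding 3D positions of an
    estimated and a ground-truth camera trajectory."""
    diff = estimated - ground_truth
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

# Hypothetical trajectories: N x 3 camera positions at matching timestamps.
gt = np.cumsum(np.full((100, 3), 0.01), axis=0)                  # straight-line motion
est = gt + np.random.default_rng(2).normal(0, 0.005, gt.shape)   # noisy estimate
print(f"ATE RMSE: {trajectory_rmse(est, gt):.4f} m")
```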
3D sensing of back symmetry curve suited for dynamic analysis of spinal deformities
This paper focuses on meeting the demand for better detection and evaluation of spinal deformities, beyond basic static 3D analysis. It is expected that observing the spine while it moves enables improved 3D deformity analysis and measurements. In this paper, we present a novel approach for automatic back symmetry curve sensing. Our approach can be used for back shape analysis in many different positions, i.e. segments of a dynamic motion, and is not limited to the upright standing position. The proposed method is based on 3D surface reconstruction, surface curvature analysis and a graph-theoretic approach for the (semi-)automatic detection of the symmetry curve. In addition, we introduce a 3D scanning system that was used in our experiment to successfully generate 3D back surface reconstructions for each frame of the captured forward-bending motion. We also tested the proposed method on data collected using a commercial 3D spine analysis system, and the results were comparable. An additional experiment focusing on dynamic analysis demonstrated that the proposed method can enable further advances in automatic 3D back surface analysis by tracking the spine position throughout the performed movements.
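A simplified sketch of the graph-theoretic idea, extracting a single top-to-bottom curve of minimal accumulated cost from a per-pixel cost map derived from surface curvature, is given below using dynamic programming. The 3D reconstruction and curvature computation are assumed to have produced the cost map; the values here are synthetic.

```python
import numpy as np

def minimal_cost_curve(cost):
    """For a 2D cost map, return the column index per row of a top-to-bottom
    path with minimal accumulated cost, where each step may move at most
    one column left or right."""
    h, w = cost.shape
    acc = cost.copy()
    back = np.zeros((h, w), dtype=int)
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(0, c - 1), min(w, c + 2)
            best = lo + int(np.argmin(acc[r - 1, lo:hi]))
            acc[r, c] += acc[r - 1, best]
            back[r, c] = best
    path = [int(np.argmin(acc[-1]))]
    for r in range(h - 1, 0, -1):
        path.append(back[r, path[-1]])
    return path[::-1]

# Synthetic cost map: a low-cost valley along a gently curving column.
rows, cols = np.arange(60), 40 + (5 * np.sin(np.arange(60) / 10)).astype(int)
cost = np.ones((60, 80))
cost[rows, cols] = 0.0
print(minimal_cost_curve(cost)[:5])  # follows the low-cost valley
```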
Generalizable Human Pose Triangulation
We address the problem of generalizability for multi-view 3D human pose estimation. The standard approach is to first detect 2D keypoints in images and then apply triangulation from multiple views. Even though existing methods achieve remarkably accurate 3D pose estimation on public benchmarks, most of them are limited to a single spatial camera arrangement and a fixed number of cameras. Several methods address this limitation but demonstrate significantly degraded performance on novel views. We propose a stochastic framework for human pose triangulation and demonstrate superior generalization across different camera arrangements on two public datasets. In addition, we apply the same approach to the fundamental matrix estimation problem, showing that the proposed method can be successfully applied to other computer vision problems. The stochastic framework achieves more than an 8.8% improvement on the 3D pose estimation task compared to the state of the art, and more than a 30% improvement for fundamental matrix estimation compared to a standard algorithm.
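The triangulation step on which the work builds can be sketched with the standard direct linear transform (DLT): given projection matrices and 2D detections from several views, the 3D point is the least-squares solution of a homogeneous system. The stochastic framework itself is not reproduced; the camera matrices below are hypothetical.

```python
import numpy as np

def triangulate(proj_mats, points_2d):
    """DLT triangulation of one 3D point from N views.
    proj_mats: list of 3x4 projection matrices,
    points_2d: N corresponding image points (u, v)."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                       # null-space direction of the system
    return X[:3] / X[3]

# Hypothetical two-camera setup observing a point at (0.1, -0.2, 4.0).
K = np.diag([800, 800, 1.0]); K[:2, 2] = [320, 240]
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
X_true = np.array([0.1, -0.2, 4.0, 1.0])
pts = [(P @ X_true)[:2] / (P @ X_true)[2] for P in (P1, P2)]
print(triangulate([P1, P2], pts))  # ~ [0.1, -0.2, 4.0]
```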
A Note on Geometric Calibration of Multiple Cameras and Projectors
Geometric calibration of cameras and projectors is an essential step that must be performed before any imaging system can be used. There are many well-known geometric calibration methods for systems composed of multiple cameras, but the simultaneous geometric calibration of multiple projectors and cameras has received less attention. This leaves unresolved several practical issues that must be considered to achieve the simplicity of use required for real-world applications. In this work we discuss several important components of a real-world geometric calibration procedure used in our laboratory to calibrate surface imaging systems comprised of many projectors and cameras. We specifically discuss the design of the calibration object and the image processing pipeline used to analyze it in the acquired images. We also provide quantitative calibration results in the form of reprojection errors and compare them to classic approaches such as Zhang's calibration method.
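A minimal sketch of the reprojection-error metric reported above, using a plain pinhole model in NumPy; the calibration target, intrinsics and pose below are hypothetical, and lens distortion is ignored.

```python
import numpy as np

def reprojection_rmse(K, R, t, points_3d, observed_2d):
    """RMS distance in pixels between observed image points and the
    projections of the corresponding 3D calibration points."""
    cam = (R @ points_3d.T + t.reshape(3, 1)).T          # camera coordinates
    proj = (K @ cam.T).T
    proj = proj[:, :2] / proj[:, 2:3]                    # perspective division
    return float(np.sqrt(np.mean(np.sum((proj - observed_2d) ** 2, axis=1))))

# Hypothetical planar calibration target (3 cm grid) and camera pose.
K = np.array([[900.0, 0, 640], [0, 900.0, 360], [0, 0, 1]])
R, t = np.eye(3), np.array([0.0, 0.0, 1.5])
grid = np.stack(np.meshgrid(np.arange(5), np.arange(4)), -1).reshape(-1, 2) * 0.03
pts3d = np.hstack([grid, np.zeros((grid.shape[0], 1))])
obs = (K @ (R @ pts3d.T + t.reshape(3, 1))).T
obs = obs[:, :2] / obs[:, 2:3] + np.random.default_rng(3).normal(0, 0.3, (20, 2))
print(f"reprojection RMSE: {reprojection_rmse(K, R, t, pts3d, obs):.3f} px")
```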
Human Intention Estimation based on Hidden Markov Model Motion Validation for Safe Flexible Robotized Warehouses
With the substantial growth of logistics businesses, the need for larger warehouses and their automation arises; thus, using robots as assistants to human workers is becoming a priority. In order to operate efficiently and safely, robot assistants or the supervising system should recognize human intentions in real time. Theory of mind (ToM) is an intuitive human conception of other humans' mental states, i.e., beliefs and desires, and how they cause behavior. In this paper we propose a ToM-based human intention estimation algorithm for flexible robotized warehouses. We observe the human's, i.e., the worker's, motion and validate it with respect to the goal locations using generalized Voronoi diagram based path planning. These observations are then processed by the proposed hidden Markov model framework, which estimates worker intentions in an online manner and is capable of handling changing environments. To test the proposed intention estimation, we ran experiments in a real-world laboratory warehouse with a worker wearing Microsoft HoloLens augmented reality glasses. Furthermore, in order to demonstrate the scalability of the approach to larger warehouses, we propose to use virtual reality digital warehouse twins to realistically simulate worker behavior. We conducted intention estimation experiments in the larger warehouse digital twin with up to 24 running robots. We demonstrate that the proposed framework estimates warehouse worker intentions precisely, and we conclude by discussing the experimental results.
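The online estimation step can be illustrated with the standard HMM filtering (forward) recursion: at each time step the belief over hidden intentions (goal locations) is propagated through the transition model and reweighted by the likelihood of the validated motion observation. The matrices below are hypothetical, not the paper's learned models.

```python
import numpy as np

def filter_step(belief, A, B, obs):
    """One online HMM update: predict with transition matrix A,
    then weight by the observation likelihood column B[:, obs]."""
    predicted = A.T @ belief
    updated = predicted * B[:, obs]
    return updated / updated.sum()

# Hypothetical model: 3 goal locations (intentions); the observation is the
# goal toward which the validated motion currently points.
A = np.array([[0.90, 0.05, 0.05],      # intentions tend to persist
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])
B = np.array([[0.70, 0.15, 0.15],      # B[i, j] = P(motion toward goal j | intention i)
              [0.15, 0.70, 0.15],
              [0.15, 0.15, 0.70]])
belief = np.full(3, 1 / 3)
for obs in [0, 0, 0, 1]:               # stream of validated observations
    belief = filter_step(belief, A, B, obs)
print(belief)  # probability of each goal being the worker's intention
```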
On the Comparison of Classic and Deep Keypoint Detector and Descriptor Methods
The purpose of this study is to give a performance comparison between several classic hand-crafted and deep keypoint detector and descriptor methods. In particular, we consider the following classical algorithms: SIFT, SURF, ORB, FAST, BRISK, MSER, HARRIS, KAZE, AKAZE, AGAST, GFTT, FREAK, BRIEF and RootSIFT, where a subset of all combinations is paired into detector-descriptor pipelines. Additionally, we analyze the performance of two recent and promising deep detector-descriptor models, LF-Net and SuperPoint. Our benchmark relies on the HPSequences dataset, which provides real and diverse images under various geometric and illumination changes. We analyze the performance on three evaluation tasks: keypoint verification, image matching and keypoint retrieval. The results show that certain classic and deep approaches are still comparable, with some classic detector-descriptor combinations outperforming pretrained deep models. In terms of the execution times of the tested implementations, the SuperPoint model is the fastest, followed by ORB. The source code is published at https://github.com/kristijanbartol/keypoint-algorithms-benchmark.
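A small OpenCV sketch of the kind of detector-descriptor pipeline being compared: ORB keypoints with Hamming-distance brute-force matching on a synthetic image pair. The test image is generated on the fly, not taken from HPSequences.

```python
import cv2
import numpy as np

# Synthetic textured test image and a shifted copy simulating a small camera motion.
rng = np.random.default_rng(4)
img1 = rng.integers(0, 256, (240, 320), dtype=np.uint8)
cv2.rectangle(img1, (60, 60), (140, 140), 255, -1)
img2 = np.roll(img1, (10, 15), axis=(0, 1))

orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance (ORB descriptors are binary).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(kp1)} / {len(kp2)} keypoints, {len(matches)} cross-checked matches")
```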