Search Results

35 result(s) for "Pratikakis, Ioannis"
Deep Learning for Breast Cancer Diagnosis from Mammograms—A Comparative Study
Deep convolutional neural networks (CNNs) are investigated in the context of computer-aided diagnosis (CADx) of breast cancer. State-of-the-art CNNs are trained and evaluated on two mammographic datasets consisting of ROIs that depict benign or malignant mass lesions. Each examined network is evaluated in two training scenarios: in the first, the network is initialized with pre-trained weights, while in the second it is initialized randomly. Extensive experimental results show the superior performance achieved when fine-tuning a pre-trained network compared to training from scratch.
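The pre-trained-versus-random initialization comparison can be illustrated with a deliberately tiny numerical sketch: plain gradient descent on a toy quadratic loss, not the paper's actual CNNs or mammographic data, and all values below are made up. The point it illustrates is only that a warm start near a good solution reaches a lower loss than a distant random start under the same training budget.

```python
import numpy as np

def train(w_init, target, lr=0.1, steps=10):
    """Plain gradient descent on the toy quadratic loss ||w - target||^2."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * (w - target)   # gradient of the quadratic loss
        w -= lr * grad
    return float(np.sum((w - target) ** 2))

target = np.array([1.0, -2.0, 0.5])       # hypothetical "task" optimum
pretrained = target + 0.1                 # warm start: already near the optimum
random_init = np.array([5.0, 5.0, 5.0])   # random initialization, far away

loss_finetuned = train(pretrained, target)
loss_scratch = train(random_init, target)
# with an identical step budget, the warm-started run ends at a much lower loss
```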
Real-Time Semantic Image Segmentation with Deep Learning for Autonomous Driving: A Survey
Semantic image segmentation for autonomous driving is a challenging task due to its requirement for both effectiveness and efficiency. Recent developments in deep learning have demonstrated significant gains in accuracy. In this paper, we present a comprehensive overview of state-of-the-art semantic image segmentation methods using deep-learning techniques that aim to operate in real time, so that they can efficiently support an autonomous driving scenario. To this end, the overview places particular emphasis on approaches that reduce inference time, analyses the existing methods in terms of their end-to-end functionality, and provides a comparative study that relies upon a consistent evaluation framework. Finally, a discussion provides key insights into current trends and future research directions in real-time semantic image segmentation with deep learning for autonomous driving.
PANORAMA: A 3D Shape Descriptor Based on Panoramic Views for Unsupervised 3D Object Retrieval
We present a novel 3D shape descriptor that uses a set of panoramic views of a 3D object which describe the position and orientation of the object’s surface in 3D space. We obtain a panoramic view of a 3D object by projecting it to the lateral surface of a cylinder parallel to one of its three principal axes and centered at the centroid of the object. The object is projected to three perpendicular cylinders, each one aligned with one of its principal axes in order to capture the global shape of the object. For each projection we compute the corresponding 2D Discrete Fourier Transform as well as 2D Discrete Wavelet Transform. We further increase the retrieval performance by employing a local (unsupervised) relevance feedback technique that shifts the descriptor of an object closer to its cluster centroid in feature space. The effectiveness of the proposed 3D object retrieval methodology is demonstrated via an extensive consistent evaluation in standard benchmarks that clearly shows better performance against state-of-the-art 3D object retrieval methods.
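As a rough illustration of the projection-plus-spectrum idea in the abstract above, here is a minimal sketch under heavy simplifications: a single cylinder aligned with the z-axis only, a random point cloud standing in for sampled mesh-surface points, and just the 2D DFT part (the wavelet component and the relevance feedback step are omitted). All parameter choices are hypothetical.

```python
import numpy as np

def panoramic_view(points, n_angle=32, n_height=32):
    """Toy cylindrical projection: each (height, angle) cell keeps the
    largest radial distance of any point falling into that bin."""
    points = points - points.mean(axis=0)                 # centre at the centroid
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    angle = np.arctan2(y, x)                              # angular position on the cylinder
    radius = np.hypot(x, y)                               # distance from the cylinder axis
    a = np.clip(((angle + np.pi) / (2 * np.pi) * n_angle).astype(int), 0, n_angle - 1)
    h = np.clip(((z - z.min()) / (np.ptp(z) + 1e-9) * n_height).astype(int), 0, n_height - 1)
    view = np.zeros((n_height, n_angle))
    np.maximum.at(view, (h, a), radius)                   # unbuffered per-bin maximum
    return view

def descriptor(points):
    """2D DFT magnitudes of the panoramic view as a simple shape signature."""
    return np.abs(np.fft.fft2(panoramic_view(points))).ravel()

rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3))   # stand-in for points sampled on a mesh surface
desc = descriptor(cloud)
```

In the full method, the same construction is repeated for all three principal axes and the spectra are concatenated; the sketch keeps only the single-axis case.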
MResTNet: A Multi-Resolution Transformer Framework with CNN Extensions for Semantic Segmentation
A fundamental task in computer vision is differentiating and identifying the objects or entities in a visual scene using semantic segmentation methods. The advancement of transformer networks has surpassed traditional convolutional neural network (CNN) architectures in terms of segmentation performance. The continuous pursuit of optimal performance on the popular evaluation metrics has led to very large architectures that require significant computational power, making them prohibitive for real-time applications, including autonomous driving. In this paper, we propose a model that leverages a visual transformer encoder with a parallel twin decoder, consisting of a visual transformer decoder and a CNN decoder with multi-resolution connections working in parallel. The two decoders are merged with the aid of two trainable CNN blocks: the fuser, which combines the information from the two decoders, and the scaler, which scales the contribution of each decoder. The proposed model achieves state-of-the-art performance on the Cityscapes and ADE20K datasets while maintaining a low-complexity network that can be used in real-time applications.
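The scaler/fuser merging step could be sketched, very loosely, as follows. This is a hypothetical NumPy stand-in for the paper's trainable CNN blocks: the scaler is reduced to a softmax pair of per-decoder weights and the fuser to a per-pixel linear channel mixing; every shape and weight here is invented for illustration.

```python
import numpy as np

def fuse_decoders(dec_cnn, dec_vit, scaler_w, fuser_w):
    """Weigh each decoder's (C, H, W) feature map with softmax weights
    ('scaler'), then mix the weighted pair per pixel ('fuser')."""
    alpha = np.exp(scaler_w) / np.exp(scaler_w).sum()             # softmax over the two decoders
    stacked = np.stack([alpha[0] * dec_cnn, alpha[1] * dec_vit])  # (2, C, H, W)
    # fuser_w has shape (C_out, 2, C_in): a 1x1-convolution-like mixing
    return np.einsum('dchw,kdc->khw', stacked, fuser_w)

C, H, W = 4, 8, 8
rng = np.random.default_rng(1)
dec_cnn, dec_vit = rng.normal(size=(C, H, W)), rng.normal(size=(C, H, W))
fused = fuse_decoders(dec_cnn, dec_vit, scaler_w=np.zeros(2),
                      fuser_w=rng.normal(size=(C, 2, C)))
```

In the actual model both blocks are learned end to end; the sketch only shows the dataflow of scaling then fusing.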
HTR for Greek Historical Handwritten Documents
Offline handwritten text recognition (HTR) for historical documents aims for effective transcription by addressing challenges that originate from the low quality of the manuscripts under study as well as from several particularities related to the historical period of writing. In this paper, the HTR challenge is focused on transcribing Greek historical manuscripts that contain several such particularities. To this end, a convolutional recurrent neural network architecture is proposed that comprises octave convolutions and recurrent units with effective gated mechanisms. The proposed architecture has been evaluated on three newly created collections of Greek historical handwritten documents, which will be made publicly available for research purposes, as well as on standard datasets such as IAM and RIMES. A concise evaluation study shows that, compared to state-of-the-art architectures, the proposed one deals effectively with the challenging Greek historical manuscripts.
Word Spotting as a Service: An Unsupervised and Segmentation-Free Framework for Handwritten Documents
Word spotting strategies employed in historical handwritten documents face many challenges due to variation in writing style and intense degradation. In this paper, a new method for efficient and effective word spotting in handwritten documents is presented that relies upon document-oriented local features capturing information around representative keypoints, and a matching process that incorporates spatial context in a local proximity search, without using any training data. The method builds on document-oriented keypoint and feature extraction, along with a fast feature matching step. This enables the corresponding methodological pipeline to be deployed both effectively and efficiently in the cloud, so that word spotting can be realised as a service for modern mobile devices. A consistent evaluation on several historical handwritten datasets demonstrates the effectiveness and efficiency of the proposed method in terms of matching accuracy and retrieval time, respectively.
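A crude sketch of training-free matching with a spatial-consistency check follows: nearest-neighbour descriptor matching plus a median-offset filter. This is an illustrative simplification, not the paper's actual local-proximity algorithm, and the descriptors and positions below are synthetic.

```python
import numpy as np

def match_keypoints(q_desc, p_desc, q_pos, p_pos, max_offset_dev=5.0):
    """Nearest-neighbour descriptor matching followed by a spatial filter:
    keep only matches whose positional offset agrees with the median offset."""
    d = ((q_desc[:, None, :] - p_desc[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    nn = d.argmin(axis=1)                       # best page keypoint per query keypoint
    offsets = p_pos[nn] - q_pos                 # spatial offset each match implies
    median = np.median(offsets, axis=0)
    keep = np.linalg.norm(offsets - median, axis=1) < max_offset_dev
    return nn, keep

rng = np.random.default_rng(0)
q_desc = rng.normal(size=(6, 8))                # query-word keypoint descriptors
p_desc = q_desc.copy()                          # the page contains the same word
q_pos = rng.uniform(0, 100, size=(6, 2))
p_pos = q_pos + np.array([10.0, 3.0])           # word appears shifted on the page
p_pos[0] += 50.0                                # one spatially inconsistent match
nn, keep = match_keypoints(q_desc, p_desc, q_pos, p_pos)
# the inconsistent first match is filtered out; the other five survive
```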
An overview of partial 3D object retrieval methodologies
This work offers an overview of the state of the art in the emerging area of 3D object retrieval based on partial queries. This research area is associated with several application domains, including face recognition and digital libraries of cultural heritage objects. The existing partial 3D object retrieval methods can be classified mainly as: i) view-based, ii) part-based, iii) bag-of-visual-words (BoVW)-based, and iv) hybrid methods that combine these three main paradigms, or methods which cannot be straightforwardly classified. Several methodological aspects are identified, including the use of interest points and the exploitation of 2.5D projections, and the available evaluation datasets and campaigns are addressed. A thorough discussion follows, identifying advantages and limitations.
Partial matching of 3D cultural heritage objects using panoramic views
In this paper, we present a method for partial matching and retrieval of 3D objects based on range image queries. The proposed methodology addresses the retrieval of complete 3D objects using range image queries that represent partial views. The core methodology relies upon Bag-of-Visual-Words modelling and enhanced Dense SIFT descriptor computed on panoramic views and range image queries. Performance evaluation builds upon standard measures and a challenging 3D pottery dataset originating from the Hampson Archaeological Museum collection.
Action unit detection in 3D facial videos with application in facial expression retrieval and recognition
This work introduces a new scheme for action unit (AU) detection in 3D facial videos. Sets of features that define action unit activation in a robust manner are proposed. These features are computed from eight detected facial landmarks on each facial mesh and involve angles, areas and distances. Support vector machine classifiers are then trained on these features to perform action unit detection. The proposed AU detection scheme is used in a dynamic 3D facial expression retrieval and recognition pipeline, highlighting the AUs that are most informative of facial expression while, at the same time, achieving better performance than state-of-the-art methodologies.
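The angle/area/distance features can be sketched with elementary geometry. The landmark coordinates below are made-up placeholders (the paper detects eight landmarks on each facial mesh), and the SVM stage is only indicated in a comment.

```python
import numpy as np

def pairwise_distances(L):
    """All pairwise Euclidean distances between landmarks (n x 3 array)."""
    d = np.linalg.norm(L[:, None, :] - L[None, :, :], axis=-1)
    iu = np.triu_indices(len(L), k=1)   # upper triangle: each pair once
    return d[iu]

def angle(a, b, c):
    """Angle at vertex b (radians) formed by landmarks a-b-c."""
    u, v = a - b, c - b
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

def triangle_area(a, b, c):
    """Area of the 3D triangle spanned by landmarks a, b, c."""
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

# hypothetical positions of eight landmarks on one facial mesh
L = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 0.],
              [0., 0., 1.], [1., 0., 1.], [0., 1., 1.], [1., 1., 1.]])
feats = np.concatenate([pairwise_distances(L),
                        [angle(L[1], L[0], L[2]),            # sample angle feature
                         triangle_area(L[0], L[1], L[2])]])  # sample area feature
# per-frame feature vectors like `feats` would then feed one SVM per action unit
```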
Unsupervised human action retrieval using salient points in 3D mesh sequences
The problem of human action retrieval based on the representation of the human body as a 3D mesh is addressed. The proposed 3D mesh sequence descriptor is based on a set of trajectories of salient points of the human body: its centroid and its five protrusion ends. The extracted descriptor of the corresponding trajectories incorporates a set of significant features of human motion, such as velocity, total displacement from the initial position and direction. As distance measure, a variation of the Dynamic Time Warping (DTW) algorithm, combined with a k − means based method for multiple distance matrix fusion, is applied. The proposed method is fully unsupervised. Experimental evaluation has been performed on two artificial datasets, one of which is being made publicly available by the authors. The experimentation on these datasets shows that the proposed scheme achieves retrieval performance beyond the state of the art.