Catalogue Search | MBRL
44 result(s) for "Labarta, Jesus"
On the Behavior of Convolutional Nets for Feature Extraction
by Garcia-Gasulla, Dario; Cortés, Ulises; Vilalta, Armand
in Artificial intelligence; Artificial neural networks; Deep learning networks
2018
Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data) and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previously learnt by the CNN after processing millions of images, without requiring an expensive training phase. Contributions to this field (commonly known as feature representation transfer or transfer learning) have been purely empirical so far, extracting all CNN features from a single layer close to the output and testing their performance by feeding them to a classifier. This approach has provided consistent results, although its relevance is limited to classification tasks. Taking a completely different approach, in this paper we statistically measure the discriminative power of every single feature found within a deep CNN, when used for characterizing every class of 11 datasets. We seek to provide new insights into the behavior of CNN features, particularly the ones from convolutional layers, as this can be relevant for their application to knowledge representation and reasoning. Our results confirm that low- and middle-level features may behave differently from high-level features, but only under certain conditions. We find that all CNN features can be used for knowledge representation purposes both by their presence and by their absence, doubling the information a single CNN feature may provide. We also study how much noise these features may include, and propose a thresholding approach to discard most of it. All these insights have a direct application to the generation of CNN embedding spaces.
Journal Article
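The feature-extraction and thresholding ideas in the abstract above lend themselves to a short illustration. Below is a minimal sketch, assuming torchvision's pretrained VGG16 as the CNN and an arbitrary cutoff of 0.1; the paper does not prescribe a specific model or threshold, so treat both as placeholders.

```python
# A minimal sketch, assuming torchvision's pretrained VGG16 (the paper does
# not prescribe a model) and an arbitrary cutoff; both are placeholders.
import torch
from torchvision import models

cnn = models.vgg16(weights="IMAGENET1K_V1").features.eval()

activations = []
def hook(_module, _inputs, output):
    # Spatially average each feature map to a single value per filter.
    activations.append(output.mean(dim=(2, 3)))

for layer in cnn:
    if isinstance(layer, torch.nn.Conv2d):
        layer.register_forward_hook(hook)

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    cnn(image)

features = torch.cat(activations, dim=1)  # descriptors from every conv layer
threshold = 0.1                           # illustrative noise cutoff
present = features > threshold            # a feature can inform by presence...
absent = ~present                         # ...or by absence
```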
The MAMe dataset: on the relevance of high resolution and variable shape image properties
by
Garcia-Gasulla, Dario
,
Arias-Duart, Anna
,
Viladrich, Nina
in
Classification
,
Datasets
,
Deformation effects
2022
The most common approach in image classification tasks is to resize all images in the dataset to a unique shape, while reducing their resolution to a size that makes experimentation at scale easier. This practice has benefits from a computational perspective, but it entails negative side-effects on performance due to loss of information and image deformation. In this work we introduce the MAMe dataset, an image classification dataset with the remarkable properties of high resolution and variable shape. The goal of MAMe is to provide a tool for studying the impact of such properties on image classification, while motivating research on the topic. The MAMe dataset contains thousands of artworks from three different museums, and proposes a classification task consisting of differentiating between 29 mediums (i.e., materials and techniques) supervised by art experts. After analyzing the novelty of MAMe in the context of current image classification tasks, a thorough description of the task is provided, along with statistics of the dataset. Experiments are conducted to evaluate the impact of using high-resolution images, variable-shape inputs, and both properties at the same time. Results illustrate the positive impact on performance when using high-resolution images, while highlighting the lack of solutions to exploit variable shapes. An additional experiment exposes the distinctiveness between the MAMe dataset and the prototypical ImageNet dataset, showing that performance improves due to information gain and resolution gain. Finally, the baselines are inspected using explainability methods and expert knowledge, in order to gain insights about the challenges that remain ahead.
Journal Article
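To make the trade-off above concrete, here is a minimal sketch contrasting the standard fixed-shape pipeline the abstract criticizes with a variable-shape alternative; the paths, sizes, and function names are illustrative and not part of the MAMe tooling.

```python
# A minimal sketch contrasting the usual fixed-shape pipeline with a
# variable-shape one; paths, sizes, and names are illustrative.
from PIL import Image

def load_fixed(path, size=(224, 224)):
    # Common practice: cheap to batch, but deforms the image and
    # discards most of its resolution.
    return Image.open(path).resize(size)

def load_variable(path, max_side=1024):
    # Preserve aspect ratio and most of the resolution; the price is
    # that images no longer stack into fixed-shape batches.
    img = Image.open(path)
    scale = min(1.0, max_side / max(img.size))
    return img.resize((round(img.width * scale), round(img.height * scale)))
```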
Simulating the behavior of the Human Brain on GPUs
2018
The simulation of the behavior of the Human Brain is one of the most important challenges in computing today. The main problem consists of finding efficient ways to manipulate and compute the huge volume of data that this kind of simulation needs, using current technology. In this sense, this work is focused on one of the main steps of such a simulation, which consists of computing the Voltage on the neurons' morphology. This is carried out using the Hines Algorithm and, although this algorithm is the optimum method in terms of number of operations, it needs non-trivial modifications to be efficiently parallelized on GPUs. We propose several optimizations to accelerate this algorithm on GPU-based architectures, exploring the limitations of both method and architecture in order to efficiently solve a high number of Hines systems (neurons). Each of the optimizations is analyzed and described in depth. Two different approaches are studied, one for mono-morphology simulations (batches of neurons with the same shape) and one for multi-morphology simulations (batches of neurons where every neuron has a different shape). In mono-morphology simulations we obtain good performance using just a single kernel to compute all the neurons. However, this turns out to be inefficient for multi-morphology simulations. Unlike the previous scenario, multi-morphology simulations require a much more complex implementation to obtain good performance: we must execute more than one GPU kernel, and in every execution (kernel call) one specific part of the batch of neurons is solved. These parts can be seen as multiple independent tridiagonal systems. Although the present paper is focused on the simulation of the behavior of the Human Brain, some of these techniques, in particular those related to solving tridiagonal systems, can also be used for oil and gas simulations. Our studies have proven that the optimizations proposed in the present work can achieve high performance on computations with a high number of neurons, with our GPU implementations being about 4× and 8× faster than the OpenMP multicore implementation (16 cores) using one and two NVIDIA K80 GPUs, respectively. It is also important to highlight that these optimizations continue to scale even when dealing with a very high number of neurons.
Journal Article
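The multi-morphology approach described above reduces to solving many independent tridiagonal systems. As a rough illustration, the sketch below implements the batched Thomas algorithm in NumPy rather than CUDA; note that Hines matrices are tridiagonal only for unbranched morphologies, so this is a simplification of the real kernels.

```python
# A minimal NumPy sketch of solving a batch of independent tridiagonal
# systems, the structure the multi-morphology GPU kernels exploit.
import numpy as np

def solve_tridiagonal_batch(lower, diag, upper, rhs):
    """Thomas algorithm over a batch: each argument has shape (batch, n)."""
    n = diag.shape[1]
    c = np.zeros_like(diag)
    d = np.zeros_like(diag)
    c[:, 0] = upper[:, 0] / diag[:, 0]
    d[:, 0] = rhs[:, 0] / diag[:, 0]
    for i in range(1, n):                  # forward elimination
        denom = diag[:, i] - lower[:, i] * c[:, i - 1]
        c[:, i] = upper[:, i] / denom
        d[:, i] = (rhs[:, i] - lower[:, i] * d[:, i - 1]) / denom
    x = np.empty_like(diag)
    x[:, -1] = d[:, -1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[:, i] = d[:, i] - c[:, i] * x[:, i + 1]
    return x
```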
CellSs: Making it easier to program the Cell Broadband Engine processor
2007
With the appearance of new multicore processor architectures, there is a need for new programming paradigms, especially for heterogeneous devices such as the Cell Broadband Engine (Cell/B.E.) processor. CellSs is a programming model that addresses the automatic exploitation of functional parallelism from a sequential application with annotations. The focus is on the flexibility and simplicity of the programming model. Although the concept and programming model are general enough to be extended to other devices, the current implementation has been tailored to the Cell/B.E. device. This paper presents an overview of CellSs and a newly implemented scheduling algorithm. An analysis of the results, covering both performance measures and a detailed study with performance analysis tools, was performed and is presented here.
Journal Article
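For a rough feel for annotation-driven tasking of the kind CellSs provides, here is a minimal Python sketch in which a decorator stands in for the annotations and a thread pool stands in for the runtime; the task decorator and its semantics are illustrative, not the actual CellSs API, and nothing here off-loads to the Cell/B.E.'s SPEs.

```python
# A minimal sketch of annotation-driven tasking; illustrative only.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor()

def task(fn):
    # The "annotation": calls return futures instead of blocking.
    def spawn(*args):
        # Wait for any argument that is itself a pending task result.
        resolved = [a.result() if hasattr(a, "result") else a for a in args]
        return pool.submit(fn, *resolved)
    return spawn

@task
def scale(block, factor):
    return [x * factor for x in block]

@task
def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Looks sequential; the two scale() tasks run concurrently.
c = add(scale([1, 2, 3], 2), scale([4, 5, 6], 3))
print(c.result())  # [14, 19, 24]
```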
Artificial Intelligence to Identify Retinal Fundus Images, Quality Validation, Laterality Evaluation, Macular Degeneration, and Suspected Glaucoma
by Garcia-Gasulla, Dario; Zapata, Miguel Angel; Cortes, Ulises
in Accuracy; Algorithms; Artificial intelligence
2020
To assess the performance of deep learning algorithms for different tasks in retinal fundus images: (1) detection of retinal fundus images versus optical coherence tomography (OCT) or other images, (2) evaluation of good-quality retinal fundus images, (3) distinction between right eye (OD) and left eye (OS) retinal fundus images, (4) detection of age-related macular degeneration (AMD), and (5) detection of referable glaucomatous optic neuropathy (GON).
Five algorithms were designed in a retrospective study from a database of 306,302 images, Optretina's tagged dataset. Three different ophthalmologists, all retinal specialists, classified all images. The dataset was split per patient into training (80%) and testing (20%) sets. Three different CNN architectures were employed, two of which were custom designed to minimize the number of parameters with minimal impact on accuracy. The main outcome measure was area under the curve (AUC), together with accuracy, sensitivity, and specificity.
Determination of retinal fundus images had an AUC of 0.979 with an accuracy of 96% (sensitivity 97.7%, specificity 92.4%). Determination of good-quality retinal fundus images had an AUC of 0.947 and an accuracy of 91.8% (sensitivity 96.9%, specificity 81.8%). The algorithm for OD/OS had an AUC of 0.989 and an accuracy of 97.4%. AMD had an AUC of 0.936 and an accuracy of 86.3% (sensitivity 90.2%, specificity 82.5%); GON had an AUC of 0.863 and an accuracy of 80.2% (sensitivity 76.8%, specificity 83.8%).
Deep learning algorithms can differentiate a retinal fundus image from other images. Algorithms can evaluate the quality of an image, discriminate between right and left eyes, and detect the presence of AMD and GON with a high level of accuracy, sensitivity, and specificity.
Journal Article
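For readers who want to reproduce this style of evaluation, the sketch below computes AUC, sensitivity, specificity, and accuracy with scikit-learn; the labels, scores, and 0.5 decision threshold are synthetic stand-ins, not Optretina's data or operating point.

```python
# A minimal sketch of the reported metrics; data here is synthetic.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])   # e.g. AMD present / absent
y_score = np.array([0.9, 0.8, 0.3, 0.6, 0.4, 0.2, 0.7, 0.55])

auc = roc_auc_score(y_true, y_score)
y_pred = (y_score >= 0.5).astype(int)          # illustrative threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"AUC={auc:.3f} acc={accuracy:.0%} sens={sensitivity:.0%} spec={specificity:.0%}")
```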
Proposal to Extend the OpenMP Tasking Model with Dependent Tasks
by
Duran, Alejandro
,
Labarta, Jesus
,
Badia, Rosa M
in
Analysis
,
Arquitectura de computadors
,
Arquitectures paral·leles
2009
Tasking in OpenMP 3.0 has been conceived to handle the dynamic generation of unstructured parallelism. New directives have been added allowing the user to identify units of independent work (tasks) and to define points at which to wait for the completion of tasks (task barriers). In this document we propose extensions to allow the runtime detection of dependencies between generated tasks, broadening the range of applications that can benefit from tasking and improving performance when load balancing or locality are critical issues. The proposed extensions are evaluated on an SGI Altix multiprocessor architecture using a couple of small applications and a prototype runtime system implementation.
Journal Article
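The core proposal, a runtime that orders tasks by their declared inputs and outputs instead of explicit barriers, can be mimicked in a few lines. Below is a minimal sketch; the ins/outs decorator arguments are illustrative Python stand-ins for the proposed OpenMP clauses (a mechanism that later entered the standard as OpenMP 4.0's depend clauses), and only read-after-write dependencies are tracked.

```python
# A minimal sketch: the runtime orders tasks by declared inputs/outputs.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor()
last_writer = {}  # data name -> future of the task that last produced it

def task(ins=(), outs=()):
    def wrap(fn):
        def spawn(*args):
            deps = [last_writer[n] for n in ins if n in last_writer]
            def run():
                for d in deps:        # wait only on detected dependencies
                    d.result()
                return fn(*args)
            fut = pool.submit(run)
            for n in outs:
                last_writer[n] = fut
            return fut
        return spawn
    return wrap

@task(outs=("a",))
def produce():
    return "a is ready"

@task(ins=("a",))
def consume():
    return "ran after its dependency"

produce()
print(consume().result())
```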
Extending OpenMP to Survive the Heterogeneous Multi-Core Era
by
Duran, Alejandro
,
Mayo, Rafael
,
Igual, Francisco
in
Accelerators
,
Computer Science
,
Enginyeria de la telecomunicació
2010
This paper advances the state of the art in programming models for exploiting task-level parallelism on heterogeneous many-core systems, presenting a number of extensions to the OpenMP language inspired by the StarSs programming model. The proposed extensions allow the programmer to easily write portable code for a number of different platforms, relieving them of developing the specific code to off-load tasks to the accelerators and to synchronize those tasks. Our results, obtained from the StarSs instantiations for SMPs, the Cell, and GPUs, report reasonable parallel performance. However, the real impact of our approach is in the productivity gains it yields for the programmer.
Journal Article
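A toy version of the key idea, letting the programmer name a target device while the runtime handles dispatch, is sketched below; the target decorator and executor registry are invented for illustration, not the paper's proposed syntax, and the "cuda" entry merely stands in for real off-loading.

```python
# A toy dispatcher: the programmer names a device, the runtime launches there.
executors = {
    "smp": lambda fn, *args: fn(*args),    # run on the host cores
    "cuda": lambda fn, *args: fn(*args),   # would copy data and launch on a GPU
}

def target(device):
    def wrap(fn):
        def dispatch(*args):
            # The runtime, not the programmer, picks the launch path.
            return executors[device](fn, *args)
        return dispatch
    return wrap

@target("cuda")
def saxpy(alpha, x, y):
    return [alpha * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1, 2], [3, 4]))  # [5.0, 8.0]
```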
A Proposal for Error Handling in OpenMP
by
Duran, Alejandro
,
Ayguadé, Eduard
,
Ferrer, Roger
in
C plus plus
,
Computer programming
,
Errors
2007
OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, such as Web servers, we find an important gap. In these applications performance is important, but reliability is even more important, and OpenMP does not have any recovery mechanism. In this paper we present a novel proposal to address this gap. In order to add error handling to OpenMP we propose some extensions to the current OpenMP specification: a directive and a clause, defining a scope for the error handling (where the error can occur) and specifying a behaviour for handling the specific errors. Some examples of use are presented, and we also present an evaluation showing the impact of this proposal on OpenMP applications. We show that this impact is low enough to consider the proposal worthwhile for OpenMP.
Journal Article
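The shape of the proposal, a scope that declares which errors it covers and what recovery behaviour applies, translates naturally into a context manager. The sketch below is a loose Python analogue; onerror and log_and_continue are invented names, not the proposed OpenMP directive or clause.

```python
# A loose Python analogue of an error-handling scope with a declared policy.
from contextlib import contextmanager

@contextmanager
def onerror(handled, recover):
    # 'handled' bounds what is caught; 'recover' is the declared behaviour.
    try:
        yield
    except handled as exc:
        recover(exc)

def log_and_continue(exc):
    print(f"worker failed, continuing: {exc}")

with onerror(ValueError, log_and_continue):   # the error-handling scope
    raise ValueError("simulated failure inside a parallel region")
print("the server keeps running")
```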
On the trade-off of mixing scientific applications on capacity high-performance computing systems
by
Carlos Sancho, Jose
,
Beivide, Ramon
,
Labarta, Jesus
in
Allocations
,
application classification
,
application performance variability
2013
Network contention is seen as a major hurdle to achieving higher throughput in today's large-scale high-performance computing systems, even more so with the current trend of employing blocking networks, driven by the need to reduce cost. Additionally, the effect is aggravated by current system schedulers that allocate jobs as soon as nodes become available, thus producing job fragmentation; that is, the tasks of one job might be spread throughout the system instead of being allocated contiguously. This fragmentation increases the probability of sharing network resources with other applications, which produces higher inter-application network contention. In this study, the authors perform a broad analysis of diverse applications' performance variability caused by topology connectivity and fragmentation, and a classification of applications based on their sensitivity to these two factors. Having characterized the inherent behaviour of the applications, the authors analysed their performance in a shared environment, that is, when mixed with other applications. They show that inter-application contention can be a significant factor of degradation even in networks with high connectivity. Their results suggest several strategies for task allocation policies: grouping sensitive and insensitive applications, reducing the number of applications sharing the first-level switch, or isolating sensitive applications.
Journal Article
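One of the suggested policies, avoiding fragmentation by preferring contiguous node ranges, can be sketched in a few lines; the free-node list and numbering below are illustrative, not taken from the paper.

```python
# A minimal sketch of a contiguity-first allocation policy.
def allocate(free_nodes, job_size):
    free = sorted(free_nodes)
    if len(free) < job_size:
        return None
    run = [free[0]]
    for node in free[1:]:
        run = run + [node] if node == run[-1] + 1 else [node]
        if len(run) == job_size:
            return run            # contiguous range found: no fragmentation
    # Fall back to a fragmented allocation only when unavoidable.
    return free[:job_size]

print(allocate([0, 1, 4, 5, 6, 9], 3))  # [4, 5, 6] rather than [0, 1, 4]
```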