Catalogue Search | MBRL

A guide to convolutional neural networks for computer vision

by Khan, Salman (Salman Hameed), author , Rahmani, Hossein, author , Shah, Syed Afaq Ali, author in Computer vision -- Mathematical models , Neural networks (Computer science) , Convolutions (Mathematics)

Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these state-of-the-art visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision. This self-contained guide will benefit those who seek both to understand the theory behind CNNs and to gain hands-on experience with the application of CNNs in computer vision. It provides a comprehensive introduction to CNNs, starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies that are related to the application of CNNs in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation. This book is ideal for undergraduate and graduate students, as no prior background knowledge in the field is required to follow the material, as well as for new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.

Book

A computational perspective on visual attention

by Tsotsos, John K. in Attention , Attention -- Mathematical models , Computer vision

2011

The author offers a comprehensive, up-to-date overview of attention theories and models and a full description of the selective tuning model, confining the formal elements to two chapters and two appendixes.

eBook

Model-Based Visual Tracking

by Panin, Giorgio in Automatic tracking , Computer vision , Mathematical models

2011

This book has two main goals: to provide a unified and structured overview of this growing field, as well as to propose a corresponding software framework, the OpenTL library, developed by the author and his working group at TUM-Informatik. The main objective of this work is to show how most real-world application scenarios can be naturally cast into a common description vocabulary, and therefore implemented and tested in a fully modular and scalable way, through the definition of a layered, object-oriented software architecture. The resulting architecture covers in a seamless way all processing levels, from raw data acquisition up to model-based object detection and sequential localization, and defines, at the application level, what we call the tracking pipeline. Within this framework, extensive use of graphics hardware (GPU computing) as well as distributed processing allows real-time performance for complex models and sensory systems.

eBook

Tracking with particle filter for high-dimensional observation and state spaces

by Dubuisson, Séverine in Computer vision , Particle methods (Numerical analysis) , Pattern recognition systems

2015

This title concerns the use of a particle filter framework to track objects defined in high-dimensional state-spaces using high-dimensional observation spaces. Current tracking applications require us to consider complex models for objects (articulated objects, multiple objects, multiple fragments, etc.) as well as multiple kinds of information (multiple cameras, multiple modalities, etc.). This book presents some recent research that considers the main bottleneck of particle filtering frameworks (high dimensional state spaces) for tracking in such difficult conditions.

eBook

A computational perspective on visual attention

by Tsotsos, John K. in Attention -- Mathematical models , Computer vision -- Mathematical models , Vision

2011

Book

Visual Attention and Consciousness

by Friedenberg, Jay in Attention , Consciousness , Consciousness & Cognition

2013,2012

Consciousness is perhaps one of the greatest mysteries in the universe. This ambitious book begins with a philosophical approach to consciousness, examining some key questions such as what is meant by the term "conscious," and how this applies to vision. The book then explores major visual phenomena related to attention and conscious experience, including filling-in processes, aftereffects, multi-stability, forms of divided attention, models of visual attention, priming effects, types of attentional blindness and various visual disorders. For each phenomenon, the biological and cognitive level research is reviewed. Themes touched upon throughout are the relation between consciousness and attention, automatic vs. willful processes, singularity vs. multiplicity, and looking without seeing. The book concludes with an evolutionary approach, describing possible functions that visual consciousness may serve and how those may affect the way we see. The systematic review of key topics and the multitude of perspectives make this book an ideal primary or ancillary text for graduate courses in perception, vision, consciousness, or philosophy of mind.

eBook

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

by Hata, Kenji , Li, Li-Jia , Zhu, Yuke in Analysis , Annotations , Artificial Intelligence

2017

Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that involve not just recognizing, but reasoning about our visual world. However, models used to tackle the rich content in images for cognitive tasks are still being trained using the same datasets designed for perceptual tasks. To achieve success at cognitive tasks, models need to understand the interactions and relationships between objects in an image. When asked “What vehicle is the person riding?”, computers will need to identify the objects in an image as well as the relationships riding(man, carriage) and pulling(horse, carriage) to answer correctly that “the person is riding a horse-drawn carriage.” In this paper, we present the Visual Genome dataset to enable the modeling of such relationships. We collect dense annotations of objects, attributes, and relationships within each image to learn these models. Specifically, our dataset contains over 108K images where each image has an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects. We canonicalize the objects, attributes, relationships, and noun phrases in region descriptions and question-answer pairs to WordNet synsets. Together, these annotations represent the densest and largest dataset of image descriptions, objects, attributes, relationships, and question-answer pairs.

Journal Article

A Comparative Study of Modern Inference Techniques for Structured Discrete Energy Minimization Problems

by Kröger, Thorben , Andres, Bjoern , Kappes, Jörg H. in Algorithms , Analysis , Artificial Intelligence

2015

Szeliski et al. published an influential study in 2006 on energy minimization methods for Markov random fields. This study provided valuable insights in choosing the best optimization technique for certain classes of problems. While these insights remain generally useful today, the phenomenal success of random field models means that the kinds of inference problems that have to be solved have changed significantly. Specifically, the models today often include higher order interactions, flexible connectivity structures, large label-spaces of different cardinalities, or learned energy tables. To reflect these changes, we provide a modernized and enlarged study. We present an empirical comparison of more than 27 state-of-the-art optimization techniques on a corpus of 2453 energy minimization instances from diverse applications in computer vision. To ensure reproducibility, we evaluate all methods in the OpenGM 2 framework and report extensive results regarding runtime and solution quality. Key insights from our study agree with the results of Szeliski et al. for the types of models they studied. However, on new and challenging types of models our findings disagree and suggest that polyhedral methods and integer programming solvers are competitive in terms of runtime and solution quality over a large range of model types.

Journal Article

Weighted Nuclear Norm Minimization and Its Applications to Low Level Vision

by Meng, Deyu , Zuo, Wangmeng , Feng, Xiangchu in Artificial Intelligence , Computer Imaging , Computer Science

2017

As a convex relaxation of the rank minimization model, the nuclear norm minimization (NNM) problem has been attracting significant research interest in recent years. The standard NNM regularizes each singular value equally, composing an easily calculated convex norm. However, this restricts its capability and flexibility in dealing with many practical problems, where the singular values have clear physical meanings and should be treated differently. In this paper we study the weighted nuclear norm minimization (WNNM) problem, which adaptively assigns weights to different singular values. As the key step of solving general WNNM models, the theoretical properties of the weighted nuclear norm proximal (WNNP) operator are investigated. Although WNNP is nonconvex, we prove that it is equivalent to a standard quadratic programming problem with linear constraints, which facilitates solving the original problem with off-the-shelf convex optimization solvers. In particular, when the weights are sorted in a non-descending order, its optimal solution can be easily obtained in closed-form. With WNNP, the solving strategies for multiple extensions of WNNM, including robust PCA and matrix completion, can be readily constructed under the alternating direction method of multipliers paradigm. Furthermore, inspired by the reweighted sparse coding scheme, we present an automatic weight setting method, which greatly facilitates the practical implementation of WNNM. The proposed WNNM methods achieve state-of-the-art performance in typical low level vision tasks, including image denoising, background subtraction and image inpainting.

Journal Article

SEEDS: Superpixels Extracted Via Energy-Driven Sampling

by Van den Bergh, Michael , Van Gool, Luc , Boix, Xavier in Algorithms , Analysis , Artificial Intelligence

2015

Superpixel algorithms aim to over-segment the image by grouping pixels that belong to the same object. Many state-of-the-art superpixel algorithms rely on minimizing objective functions to enforce color homogeneity. The optimization is accomplished by sophisticated methods that progressively build the superpixels, typically by adding cuts or growing superpixels. As a result, they are computationally too expensive for real-time applications. We introduce a new approach based on a simple hill-climbing optimization. Starting from an initial superpixel partitioning, it continuously refines the superpixels by modifying the boundaries. We define a robust and fast-to-evaluate energy function, based on enforcing color similarity between the boundaries and the superpixel color histogram. In a series of experiments, we show that we achieve an excellent compromise between accuracy and efficiency. We are able to achieve a performance comparable to the state-of-the-art, but in real-time on a single Intel i7 CPU at 2.8 GHz.

Journal Article
