Search Results

2 results for "OpenCL/CUDA"
A study of graphics hardware accelerated particle swarm optimization with digital pheromones
Programmable Graphics Processing Units (GPUs) have lately become a promising means to perform scientific computations. Modern GPUs have proven to outperform traditional Central Processing Units (CPUs) in floating point throughput thanks to their inherently data-parallel architecture and higher memory bandwidth. They allow scientific computations to be performed, without noticeable degradation in accuracy, in a fraction of the time required by traditional CPUs and at substantially reduced cost, making them viable alternatives to expensive computer clusters or workstations. GPU programmability, however, has fostered the development of a variety of programming languages, making it challenging to select a computing language and use it consistently without the risk of it becoming obsolete. Some GPU languages are hardware specific and are designed to extract the best performance from their host GPUs (e.g., NVIDIA CUDA). Others are operating system specific (e.g., Microsoft HLSL). A few are platform agnostic and can be used on a workstation with any CPU and GPU (e.g., GLSL, OpenCL). Of the companies and organizations that build formal optimization into their processes, only a few utilize GPUs, either because the others are heavily invested in CPU-based computing or because they are not fully aware of the benefits of implementing population-based optimization routines on GPUs. The literature shows a large number of research publications on optimization with GPUs; however, most are limited to specific GPU hardware or address specific problems. The diversity of current GPU hardware and software APIs presents an overwhelming number of choices, making it challenging to decide where and how to begin transitioning to GPU-based computing and impeding a promising, comparatively cost-effective computing avenue.
In this paper, the authors address some of these issues by broadly classifying GPU APIs into three categories: 1) hardware-vendor-dependent GPU APIs, 2) graphical-in-context APIs, and 3) platform-agnostic APIs. Prior work by the authors demonstrated the capability of digital pheromones within Particle Swarm Optimization (PSO) for searching n-dimensional design spaces with improved accuracy, efficiency, and reliability in serial and parallel CPU computing environments. To study the impact of GPUs, the authors took this digital-pheromone variant of PSO and implemented it on three GPU APIs, one from each category above, in a deliberately simple way: delegating unconstrained, explicit objective function evaluations to the GPU. While this approach itself cannot be considered novel, the takeaways from implementing it on different GPU APIs provide a wealth of information that, the authors believe, can help optimization companies and organizations make informed decisions about adopting GPUs in their processes.
Hybrid/Heterogeneous Programming with OMPSS and Its Software/Hardware Implications
This chapter describes how OmpSs extends the OpenMP 3.0 node programming model and how it leverages the Message Passing Interface (MPI) and OpenCL/CUDA, mastering the efficient programming of the clustered, heterogeneous multi-/many-core systems found in current and future computing platforms. It describes the language extensions and the implementation of OmpSs, focusing on the intelligence that must be embedded in the runtime system to effectively lower the programmability wall, and on the opportunities to implement new mechanisms and policies. The chapter reasons about the overheads related to task management in OmpSs (detecting inter-task data dependencies, identifying task-level parallelism, and executing tasks out of order), examining how far a software implementation can go in coping with fine-grain parallelism and opening the door to novel hardware mechanisms for emerging multicore architectures. The chapter also provides a brief description of the OmpSs execution model needed to understand the programming-model extensions.