Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Discipline
      Discipline
      Clear All
      Discipline
  • Is Peer Reviewed
      Is Peer Reviewed
      Clear All
      Is Peer Reviewed
  • Series Title
      Series Title
      Clear All
      Series Title
  • Reading Level
      Reading Level
      Clear All
      Reading Level
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
      More Filters
      Clear All
      More Filters
      Content Type
    • Item Type
    • Is Full-Text Available
    • Subject
    • Publisher
    • Source
    • Donor
    • Language
    • Place of Publication
    • Contributors
    • Location
4,227 result(s) for "Parallel programming"
Sort by:
Assessing Application Efficiency and Performance Portability in Single-Source Programming for Heterogeneous Parallel Systems
We analyze the performance portability of the skeleton-based, single-source multi-backend high-level programming framework SkePU across multiple different CPU–GPU heterogeneous systems. Thereby, we provide a systematic application efficiency characterization of SkePU-generated code in comparison to equivalent hand-written code in more low-level parallel programming models such as OpenMP and CUDA. For this purpose, we contribute ports of the STREAM benchmark suite and of a part of the NAS Parallel Benchmark suite to SkePU. We show that for STREAM and the EP benchmark, SkePU regularly scores efficiency values above 80% and in particular for CPU systems, SkePU can outperform hand-written code.
Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform
The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator’s dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach system speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the obvious superiority of our method. The proposed algorithm in this study demonstrates both better edge detection performance and improved time performance.
Pro TBB : C++ parallel programming with threading building blocks
This open access book is a modern guide for all C++ programmers to learn Threading Building Blocks (TBB). Written by TBB and parallel programming experts, this book reflects their collective decades of experience in developing and teaching parallel programming with TBB, offering their insights in an approachable manner. Throughout the book the authors present numerous examples and best practices to help you become an effective TBB programmer and leverage the power of parallel systems.
Correct program parallelisations
A commonly used approach to develop deterministic parallel programs is to augment a sequential program with compiler directives that indicate which program blocks may potentially be executed in parallel. This paper develops a verification technique to reason about such compiler directives, in particular to show that they do not change the behaviour of the program. Moreover, the verification technique is tool-supported and can be combined with proving functional correctness of the program. To develop our verification technique, we propose a simple intermediate representation (syntax and semantics) that captures the main forms of deterministic parallel programs. This language distinguishes three kinds of basic blocks: parallel, vectorised and sequential blocks, which can be composed using three different composition operators: sequential, parallel and fusion composition. We show how a widely used subset of OpenMP can be encoded into this intermediate representation. Our verification technique builds on the notion of iteration contract to specify the behaviour of basic blocks; we show that if iteration contracts are manually specified for single blocks, then that is sufficient to automatically reason about data race freedom of the composed program. Moreover, we also show that it is sufficient to establish functional correctness on a linearised version of the original program to conclude functional correctness of the parallel program. Finally, we exemplify our approach on an example OpenMP program, and we discuss how tool support is provided.
Programming big data analysis: principles and solutions
In the age of the Internet of Things and social media platforms, huge amounts of digital data are generated by and collected from many sources, including sensors, mobile devices, wearable trackers and security cameras. This data, commonly referred to as Big Data, is challenging current storage, processing, and analysis capabilities. New models, languages, systems and algorithms continue to be developed to effectively collect, store, analyze and learn from Big Data. Most of the recent surveys provide a global analysis of the tools that are used in the main phases of Big Data management (generation, acquisition, storage, querying and visualization of data). Differently, this work analyzes and reviews parallel and distributed paradigms, languages and systems used today to analyze and learn from Big Data on scalable computers. In particular, we provide an in-depth analysis of the properties of the main parallel programming paradigms (MapReduce, workflow, BSP, message passing, and SQL-like) and, through programming examples, we describe the most used systems for Big Data analysis (e.g., Hadoop, Spark, and Storm). Furthermore, we discuss and compare the different systems by highlighting the main features of each of them, their diffusion (community of developers and users) and the main advantages and disadvantages of using them to implement Big Data analysis applications. The final goal of this work is to help designers and developers in identifying and selecting the best/appropriate programming solution based on their skills, hardware availability, application domains and purposes, and also considering the support provided by the developer community.
SkePU 3: Portable High-Level Programming of Heterogeneous Systems and HPC Clusters
We present the third generation of the C++-based open-source skeleton programming framework SkePU. Its main new features include new skeletons, new data container types, support for returning multiple objects from skeleton instances and user functions, support for specifying alternative platform-specific user functions to exploit e.g. custom SIMD instructions, generalized scheduling variants for the multicore CPU backends, and a new cluster-backend targeting the custom MPI interface provided by the StarPU task-based runtime system. We have also revised the smart data containers’ memory consistency model for automatic data sharing between main and device memory. The new features are the result of a two-year co-design effort collecting feedback from HPC application partners in the EU H2020 project EXA2PRO, and target especially the HPC application domain and HPC platforms. We evaluate the performance effects of the new features on high-end multicore CPU and GPU systems and on HPC clusters.