Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
803 result(s) for "Parallel computing language"
Language Support for Multi-Paradigm and Multi-Grain Parallelism on SMP-Cluster
2007
Large-scale parallel applications are inherently multi-paradigm and multi-grain parallel. The key to improving the performance of a parallel application system is to choose parallel paradigms and grain sizes suited to the practical problem. It is therefore necessary to provide a multi-paradigm, multi-grain parallel programming interface for developing large-scale parallel application systems. This paper proposes a multi-paradigm, multi-grain parallel execution model that integrates coarse-grain parallelism (across macro tasks), mid-grain parallelism (across basic program blocks), and fine-grain parallelism (within repetition blocks). The model also supports task-parallel, data-parallel, and sequential execution. We then discuss the programming mechanism of this model through an extended OpenMP specification. The extensions include computing-resource partitioning, definition of task groups of different grain sizes, mapping from task groups to their respective processor groups, out-of-core computing, asynchronous parallel I/O, and definition of sequential relationships between tasks. We compare the performance of different implementations of a benchmark that use the same numerical algorithm but different programming approaches, including MPI, MPI+OpenMP, and our extended OpenMP. We also discuss a case study based on an SMP cluster and a network storage architecture.
Journal Article
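The abstract above describes mixing coarse-grain macro tasks with fine-grain loop parallelism through OpenMP extensions. The paper's actual directives are not given in the abstract, so the sketch below uses only standard OpenMP to show the underlying idea: one task per macro block, with a nested parallel loop inside each task.

```cpp
#include <omp.h>
#include <cstdio>
#include <vector>

// One "macro task": an independent coarse-grain unit of work whose loop body
// is additionally parallelized at fine grain.
static void macro_task(std::vector<double>& block) {
    #pragma omp parallel for   // fine grain: iterations of the repetition block
    for (long long i = 0; i < (long long)block.size(); ++i)
        block[i] = block[i] * 2.0 + 1.0;
}

int main() {
    omp_set_max_active_levels(2);   // allow the nested coarse + fine levels
    std::vector<std::vector<double>> blocks(4, std::vector<double>(1 << 20, 1.0));

    // Coarse grain: each macro block becomes one OpenMP task.
    #pragma omp parallel num_threads(4)
    #pragma omp single
    for (std::size_t t = 0; t < blocks.size(); ++t) {
        #pragma omp task firstprivate(t) shared(blocks)
        macro_task(blocks[t]);
    }   // tasks complete at the barrier ending the single/parallel region

    std::printf("blocks[0][0] = %.1f\n", blocks[0][0]);
    return 0;
}
```

The paper's extensions would additionally let the programmer partition computing resources and map task groups onto processor groups, which standard OpenMP does not express directly.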
Parallel programming of an ionic floating-gate memory array for scalable neuromorphic computing
by James, Conrad D.; Fuller, Elliot J.; Keene, Scott T.
in Analog circuits; Arrays; Artificial neural networks
2019
Neuromorphic computers could overcome efficiency bottlenecks inherent to conventional computing through parallel programming and readout of artificial neural network weights in a crossbar memory array. However, selective and linear weight updates and < 10-nanoampere read currents are required for learning that surpasses conventional computing efficiency. We introduce an ionic floating-gate memory array based on a polymer redox transistor connected to a conductive-bridge memory (CBM). Selective and linear programming of a redox transistor array is executed in parallel by overcoming the bridging threshold voltage of the CBMs. Synaptic weight readout with currents < 10 nanoamperes is achieved by diluting the conductive polymer with an insulator to decrease the conductance. The redox transistors endure >1 billion write-read operations and support 1-megahertz write-read frequencies.
Journal Article
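As a purely conceptual illustration of the parallel-programming idea (a software analogy, not the device physics or parameters from the paper), the sketch below updates every crossbar cell from its row and column pulses, with a threshold playing the role of the CBM bridging voltage so that only fully selected cells change weight.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int rows = 3, cols = 4;
    std::vector<std::vector<double>> W(rows, std::vector<double>(cols, 0.0));
    std::vector<double> vrow = {0.6, 0.0, 0.6};         // row programming pulses (illustrative)
    std::vector<double> vcol = {-0.6, 0.0, -0.6, 0.0};  // column programming pulses (illustrative)
    const double v_th = 1.0;                            // stand-in for the CBM bridging threshold
    const double eta  = 0.01;                           // per-pulse weight increment

    // In the array every cell (i, j) sees only vrow[i] - vcol[j]; all cells that
    // exceed the threshold update at once, which is the "parallel programming" idea.
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) {
            double v = vrow[i] - vcol[j];
            if (std::fabs(v) > v_th)                    // selectivity via the threshold
                W[i][j] += eta * (v > 0 ? 1.0 : -1.0);  // linear, sign-selective update
        }

    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) std::printf("%6.3f ", W[i][j]);
        std::printf("\n");
    }
    return 0;
}
```

Half-selected cells (only a row pulse or only a column pulse) stay below the threshold and are left unchanged, which is how the array achieves selective updates without per-cell addressing.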
Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey
by Bobák, Martin; Tran, Viet; Dlugolinsky, Stefan
in Algorithms; Artificial intelligence; Big Data
2019
The combined impact of new computing resources and techniques with an increasing avalanche of large datasets is transforming many research areas and may lead to technological breakthroughs that can be used by billions of people. In recent years, Machine Learning, and especially its subfield Deep Learning, has seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge numbers of real-world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, their implementations in frameworks and libraries are extensive and growing too. Software development in this field is fast-paced, with a large amount of open-source software coming from academia, industry, start-ups, and wider open-source communities. This survey presents a comprehensive, up-to-date overview with comparisons, as well as trends in the development and usage of cutting-edge Artificial Intelligence software. It also provides an overview of massive-parallelism support capable of scaling computation effectively and efficiently in the era of Big Data.
Journal Article
A universal parallel simulation framework for energy pipeline networks on high-performance computers
2024
Energy distribution networks are crucial infrastructure for modern society, and simulation tools are widely used by energy suppliers to manage these intricate networks. However, simulation calculations involve a large number of fluid control equations, and the computational overhead limits the performance of simulation software. This paper proposes a universal parallel simulation framework for energy pipeline networks that takes advantage of data parallelism and computational independence between network elements. A non-pipe model of an energy supply network is optimized, and the input and output of the network model in the proposed framework are modified, which reduces the development burden of the numerical computations for the pipeline network and weakens the computational coupling between different simulated components. In addition, independent computations can be performed concurrently through periodic data exchanges between component instances, improving the parallelism and efficiency of the simulation. Further, a parallel water pipeline network simulation paradigm based on a heterogeneous computer hardware architecture is used to evaluate the proposed framework's performance. A series of tests verifies the accuracy of the proposed framework, with simulation errors of less than 5%. Multi-threaded simulation experiments demonstrate the feasibility of the proposed framework as a parallel computing approach. Moreover, an Advanced Micro Devices (AMD) Deep Computing Unit (DCU) parallel program is implemented in a water supply network simulation system, and its computational efficiency is compared with that of its serial counterpart. The experimental results show that the proposed framework is well suited to high-performance computer architectures, and the 18x speed-up demonstrates that the parallel program based on the proposed universal framework outperforms the serial program. This provides a basis for applying pipe network simulation on high-performance computers.
Journal Article
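A minimal sketch of the execution pattern the abstract describes, with invented names and a toy update rule: each component instance advances independently within a time step, and periodic data exchanges (here, barriers plus a boundary-value hand-off) synchronize neighbouring components between steps.

```cpp
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
    const int n_components = 4, n_steps = 100;
    std::vector<double> state(n_components, 1.0);     // one state value per component (toy model)
    std::vector<double> boundary(n_components, 0.0);  // values exchanged between neighbours

    #pragma omp parallel num_threads(n_components)
    {
        int id = omp_get_thread_num();                 // one thread per component instance
        for (int step = 0; step < n_steps; ++step) {
            // Independent computation: only this component's own state is touched.
            state[id] += 0.01 * (boundary[id] - state[id]);

            #pragma omp barrier                        // end of the independent phase
            // Periodic data exchange: publish the value the neighbour will read next step.
            boundary[(id + 1) % n_components] = state[id];
            #pragma omp barrier                        // exchange complete before the next step
        }
    }

    for (int i = 0; i < n_components; ++i)
        std::printf("component %d: %.4f\n", i, state[i]);
    return 0;
}
```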
A vehicle to vehicle relay-based task offloading scheme in Vehicular Communication Networks
by Ahmad, Shahbaz; Ayzed Mirza, Muhammad; Asif, Muhammad
in Algorithms; Cloud computing; Communication
2021
Vehicular edge computing (VEC) is a promising field in which computational tasks are distributed between VEC servers and local vehicular terminals, thereby improving vehicular services. Vehicles' intelligence and capabilities are rapidly improving, which will likely support many new and exciting applications. Network resources are well utilized by exploiting neighboring vehicles' available resources while mitigating the VEC server's heavy burden. However, due to vehicle mobility, the network topology and the available computing resources change rapidly and are difficult to predict. To tackle this problem, we investigate task offloading schemes that utilize vehicle-to-vehicle and vehicle-to-infrastructure communication modes, exploit vehicles' under-utilized computation and communication resources, and take cost and time consumption into account. We present a relay task-offloading scheme in vehicular edge computing (RVEC), in which tasks are offloaded via vehicle-to-vehicle relays for computation while being transmitted to VEC servers. Numerical results illustrate that the RVEC scheme substantially reduces the network's overall offloading cost.
Journal Article
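As a rough, hypothetical illustration of offloading decisions that weigh time consumption (much simpler than the RVEC scheme itself, and with made-up link rates and CPU speeds), the sketch below compares local execution, direct offloading to a VEC server, and offloading through a V2V relay.

```cpp
#include <algorithm>
#include <cstdio>

struct Task { double cycles; double bits; };  // compute demand and data size

int main() {
    Task t{2.0e9, 4.0e6};                                            // hypothetical task
    const double f_local = 1.0e9, f_relay = 2.0e9, f_vec = 8.0e9;    // CPU cycles per second
    const double r_v2i = 5.0e6, r_v2v = 20.0e6;                      // link rates, bits per second

    double t_local = t.cycles / f_local;                  // no transmission needed
    double t_vec   = t.bits / r_v2i + t.cycles / f_vec;   // upload to server, then remote compute
    double t_relay = t.bits / r_v2v + t.cycles / f_relay; // short V2V hop, neighbour computes

    double best = std::min({t_local, t_vec, t_relay});
    std::printf("local %.2fs  vec %.2fs  v2v-relay %.2fs  -> choose %s\n",
                t_local, t_vec, t_relay,
                best == t_local ? "local" : best == t_vec ? "VEC server" : "V2V relay");
    return 0;
}
```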
Parallel numerical simulation of the 2D acoustic wave equation
2024
Mathematical simulation has significantly broadened with the advancement of parallel computing, particularly in its capacity to comprehend physical phenomena across extensive temporal and spatial dimensions. High-performance parallel computing finds extensive application across diverse domains of technology and science, including the realm of acoustics. This research investigates the numerical modeling and parallel processing of the two-dimensional acoustic wave equation in both uniform and non-uniform media. Our approach employs implicit difference schemes, with the cyclic reduction algorithm used to obtain an approximate solution. We then adapt the sequential algorithm for parallel execution on a graphics processing unit (GPU). Ultimately, our findings demonstrate the effectiveness of the parallel approach in yielding favorable results.
Journal Article
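For reference, the governing equation and one representative implicit discretization (the paper's exact scheme is not given in the abstract; the notation below is assumed):

```latex
% 2D acoustic wave equation for pressure u(x, y, t) with wave speed c:
\[
  \frac{\partial^{2} u}{\partial t^{2}}
  = c^{2}\left(\frac{\partial^{2} u}{\partial x^{2}}
             + \frac{\partial^{2} u}{\partial y^{2}}\right).
\]
% A representative implicit scheme: central differences in time, spatial operator
% averaged over the new and old time levels, with u_{i,j}^{n} \approx u(i\,\Delta x, j\,\Delta y, n\,\Delta t):
\[
  \frac{u_{i,j}^{n+1} - 2u_{i,j}^{n} + u_{i,j}^{n-1}}{\Delta t^{2}}
  = \frac{c^{2}}{2}\left(\delta_{x}^{2} + \delta_{y}^{2}\right)
    \left(u_{i,j}^{n+1} + u_{i,j}^{n-1}\right),
  \qquad
  \delta_{x}^{2} u_{i,j} = \frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{\Delta x^{2}}.
\]
```

An implicit scheme of this kind requires solving banded linear systems at every time step; cyclic reduction is a classic parallel direct solver for the resulting tridiagonal systems, which is what makes the GPU adaptation natural.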
High performance computers: from parallel computing to quantum computers and biocomputers
2021
Various programming methods are considered, with particular attention to parallel programming, quantum computers, and biocomputers. This attention is due to the intensive development of high-performance computing in recent years. One of the main ideas for increasing the speed of information processing is to carry out calculations in parallel. For classical programming methods this became possible with the advent of multiprocessor computers, which allow computational tasks to be parallelized by introducing parallelization constructs into classical programming languages. Another approach to speeding up computation is based on the idea of a quantum computer: the use of qubits means that all possible states of the system are processed simultaneously. A further approach to increasing computing performance is the development of biocomputers, based on the idea of using DNA chains consisting of sequences of four nitrogenous bases (adenine, guanine, thymine, and cytosine). Information is stored and processed as a sequence of these bases, and calculations are accelerated because biochemical reactions can take place simultaneously on different parts of the DNA chains.
Journal Article
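As a concrete example of the "parallelization constructs" added to classical programming languages, a single OpenMP directive is enough to spread a loop over the cores of a multiprocessor computer:

```cpp
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> v(10'000'000, 0.5);
    double sum = 0.0;

    // One directive distributes the iterations over all available CPU cores
    // and combines the per-thread partial sums at the end.
    #pragma omp parallel for reduction(+ : sum)
    for (long long i = 0; i < (long long)v.size(); ++i)
        sum += v[i];

    std::printf("sum = %.1f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}
```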
Deep learning model for deep fake face recognition and detection
by ST, Suganthi; Bacanin, Nebojsa; Pavel, Trojovský
in Algorithms; Algorithms and Analysis of Algorithms; Analysis
2022
Deep learning is an effective technique used in various fields, including natural language processing, computer vision, image processing, and machine vision. Deep fakes use deep learning techniques to synthesize and manipulate images of a person such that humans cannot distinguish them from real ones. Deep fakes are generated with generative adversarial networks (GANs) and may threaten the public, so detecting deep fake image content plays a vital role. Much research has been done on the detection of deep fakes in image manipulation, but the main issues with existing techniques are inaccuracy and high processing time. In this work, we implement deep fake face image detection using a Fisherface with Local Binary Pattern Histogram (FF-LBPH) deep learning technique. The Fisherface algorithm is used to recognize the face by reducing the dimensionality of the face space with LBPH features. A deep belief network (DBN) with restricted Boltzmann machines (RBMs) is then applied as the deep fake detection classifier. The public datasets used in this work are FFHQ, 100K-Faces, DFFD, and CASIA-WebFace.
Journal Article
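To make the LBPH ingredient concrete (a generic sketch of the standard Local Binary Pattern step only; the Fisherface projection and the DBN/RBM classifier from the paper are not shown), the code below computes 8-neighbour LBP codes and accumulates them into a histogram:

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <vector>

// 8-neighbour LBP code of pixel (r, c): each neighbour >= centre contributes one bit.
static std::uint8_t lbp_code(const std::vector<std::vector<std::uint8_t>>& img, int r, int c) {
    static const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    static const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    std::uint8_t code = 0;
    const std::uint8_t centre = img[r][c];
    for (int k = 0; k < 8; ++k)
        code |= (img[r + dr[k]][c + dc[k]] >= centre) << k;
    return code;
}

int main() {
    // Tiny 4x4 "image"; a real pipeline would histogram the codes per cell
    // and feed the concatenated histograms to the downstream classifier.
    std::vector<std::vector<std::uint8_t>> img = {
        {10, 20, 30, 40}, {50, 60, 70, 80}, {90, 100, 110, 120}, {130, 140, 150, 160}};

    std::array<int, 256> hist{};               // LBP histogram over interior pixels
    for (int r = 1; r < 3; ++r)
        for (int c = 1; c < 3; ++c)
            ++hist[lbp_code(img, r, c)];

    for (int b = 0; b < 256; ++b)
        if (hist[b]) std::printf("pattern %3d -> %d pixel(s)\n", b, hist[b]);
    return 0;
}
```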
PyGeNN: A Python Library for GPU-Enhanced Neural Networks
by Knight, James C.; Nowotny, Thomas; Komissarov, Anton
in benchmarking; C plus plus; Computational neuroscience
2021
More than half of the Top 10 supercomputing sites worldwide use GPU accelerators and they are becoming ubiquitous in workstations and edge computing devices. GeNN is a C++ library for generating efficient spiking neural network simulation code for GPUs. However, until now, the full flexibility of GeNN could only be harnessed by writing model descriptions and simulation code in C++. Here we present PyGeNN, a Python package which exposes all of GeNN's functionality to Python with minimal overhead. This provides an alternative, arguably more user-friendly, way of using GeNN and allows modelers to use GeNN within the growing Python-based machine learning and computational neuroscience ecosystems. In addition, we demonstrate that, in both Python and C++ GeNN simulations, the overheads of recording spiking data can strongly affect runtimes and show how a new spike recording system can reduce these overheads by up to 10×. Using the new recording system, we demonstrate that by using PyGeNN on a modern GPU, we can simulate a full-scale model of a cortical column faster even than real-time neuromorphic systems. Finally, we show that long simulations of a smaller model with complex stimuli and a custom three-factor learning rule defined in PyGeNN can be simulated almost two orders of magnitude faster than real-time.
Journal Article
Advancements and Challenges in Handwritten Text Recognition: A Comprehensive Survey
by Heyberger, Laurent; Gechter, Franck; Guyeux, Christophe
in 19th century; Algorithms; Analysis
2024
Handwritten Text Recognition (HTR) is essential for digitizing historical documents in different kinds of archives. In this study, we introduce a hybrid-form archive written in French: the Belfort civil registers of births. The digitization of these historical documents is challenging due to their unique characteristics, such as variations in writing style, overlapping characters and words, and marginal annotations. The objective of this survey paper is to summarize research on handwritten text documents and provide research directions toward effectively transcribing this French dataset. To achieve this goal, we present a brief survey of several modern and historical offline HTR systems for different international languages, and of the top state-of-the-art contributions reported for the French language specifically. The survey classifies the HTR systems based on the techniques employed, the datasets used, the publication years, and the level of recognition. Furthermore, an analysis of the systems' accuracies is presented, highlighting the best-performing approach. We also showcase the performance of some commercial HTR systems. In addition, this paper summarizes the HTR datasets that are publicly available, especially those identified as benchmark datasets in the International Conference on Document Analysis and Recognition (ICDAR) and the International Conference on Frontiers in Handwriting Recognition (ICFHR) competitions. This paper therefore presents updated state-of-the-art research in HTR and highlights new directions in the research field.
Journal Article