Improving reliability and performance of high performance computing applications
Dissertation

2015
Overview
Because of the growing popularity of parallel programming on multi-core, multiprocessor, and multithreaded hardware, more and more applications are implemented in well-established concurrent programming models such as MPI, OpenMP, and hybrid MPI/OpenMP. However, developing concurrent programs is extremely difficult. Concurrency introduces errors that do not occur in traditional sequential programs, such as data races, deadlocks, and thread-safety violations. The performance of concurrent programs is a further research area. This dissertation presents an integrated static and dynamic program analysis framework to address these concurrency issues in multithreaded OpenMP applications and in the hybrid MPI/OpenMP programming model. It also introduces an approach to reallocating computing resources to improve the performance of MPI parallel programs in a container-based virtual cloud. First, we present the OpenMP Analysis Toolkit (OAT), which uses symbolic analysis based on a Satisfiability Modulo Theories (SMT) solver to detect data races and deadlocks in OpenMP applications. Our approach approximately simulates the real execution schedule of an OpenMP program through schedule permutation, with partial order reduction to improve analysis efficiency. We evaluated OAT on real-world OpenMP benchmarks against two commercial dynamic analysis tools, Intel Thread Checker and Sun Thread Analyzer, and one commercial static analysis tool, Viva64 PVS Studio. The experiments show that our symbolic analysis approach is more accurate than static analysis and more efficient and scalable than the dynamic analysis tools, with fewer false positives and false negatives. The second part of the dissertation proposes an approach that integrates static and dynamic program analyses to check thread-safety violations in hybrid MPI/OpenMP programs.
We use an innovative method to transform thread-safety violation problems into race conditions. In our approach, the static analysis identifies a list of MPI calls related to thread-safety violations and replaces them with our own MPI wrappers, which access specific shared variables. The static analysis avoids instrumenting unrelated code, which significantly reduces runtime overhead. In the dynamic analysis, both happens-before and lockset-based race detection algorithms are used to check for races on these shared variables. By checking for races, we can identify thread-safety violations according to their specifications. Our experimental evaluation on real-world applications shows that our approach is both accurate and efficient. Finally, the dissertation describes an approach that uses adaptive resource management, enabled by container-based virtualization techniques, to automatically tune the performance of MPI programs in the cloud. Specifically, the containers running on physical hosts can dynamically allocate CPU resources to MPI processes according to the current program execution state and system resource status. High Performance Computing (HPC) in the cloud has great potential as an effective and convenient option for users to launch HPC applications. However, many open problems must still be solved for the cloud to be more amenable to HPC applications. To tune the performance of MPI applications at runtime, many traditional techniques try to balance workloads by distributing datasets approximately equally across all computing nodes. However, computing resource imbalance may still arise from data skew, and such imbalances are nontrivial to foresee. The resource allocation among MPI processes is adjusted in two ways: at the intra-host level, which dynamically adjusts resources within a host, and at the inter-host level, which migrates containers together with their MPI processes from one host to another.
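To make the intra-host adjustment concrete, here is a minimal sketch of one possible policy: dividing a host's CPU shares among MPI processes in proportion to their estimated remaining work. This is an illustration under stated assumptions, not the dissertation's algorithm. The function name `reallocate_cpu_shares`, the per-process work estimates, and the `total_shares` and `min_share` parameters are all hypothetical.

```python
def reallocate_cpu_shares(remaining_work, total_shares=1024, min_share=64):
    """Split total_shares across processes in proportion to remaining work.

    remaining_work: dict mapping a process id to a non-negative work estimate
    (a hypothetical input; how it is measured is outside this sketch).
    Every process keeps at least min_share so that no rank is starved.
    """
    if not remaining_work:
        return {}
    n = len(remaining_work)
    if min_share * n > total_shares:
        raise ValueError("min_share too large for the number of processes")
    # Shares left over after every process has received its guaranteed floor.
    spare = total_shares - min_share * n
    total_work = sum(remaining_work.values())
    shares = {}
    for pid, work in remaining_work.items():
        # With no work left anywhere, fall back to an even split.
        fraction = work / total_work if total_work > 0 else 1.0 / n
        shares[pid] = min_share + int(spare * fraction)
    return shares
```

A process that lags because of data skew reports more remaining work and therefore receives a larger share of the host's CPU on the next adjustment. Applying the computed shares to running containers is left out, since it depends on the container runtime in use.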
We have implemented and evaluated our approach on the Amazon EC2 platform using real-world scientific benchmarks and applications. The evaluation demonstrates that performance can be improved by up to 31.1% (15.6% on average) compared with the baseline application runtime.
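The lockset-based race detection mentioned in the overview can be illustrated with a simplified sketch in the style of the classic lockset approach. For each shared variable it tracks the set of locks held on every access so far. If that set becomes empty once more than one thread has touched the variable, a potential race is reported. This is a sketch under assumptions, not the dissertation's tool: the event-trace format and the name `lockset_races` are hypothetical, reads and writes are not distinguished, and the happens-before check that the overview pairs with lockset analysis is omitted.

```python
def lockset_races(events):
    """Return the variables whose candidate lockset becomes empty.

    events: iterable of (thread, op, name) tuples, where op is
    "acquire" or "release" (name is a lock) or "access" (name is a variable).
    """
    held = {}        # thread -> set of locks currently held
    candidates = {}  # variable -> locks held on every access so far
    accessors = {}   # variable -> threads that have accessed it
    racy = set()
    for thread, op, name in events:
        locks = held.setdefault(thread, set())
        if op == "acquire":
            locks.add(name)
        elif op == "release":
            locks.discard(name)
        elif op == "access":
            accessors.setdefault(name, set()).add(thread)
            if name in candidates:
                # Keep only the locks that protected every access so far.
                candidates[name] &= locks
            else:
                candidates[name] = set(locks)
            # Report only once a second thread is involved.
            if not candidates[name] and len(accessors[name]) > 1:
                racy.add(name)
        else:
            raise ValueError(f"unknown op: {op}")
    return racy
```

For example, a trace in which one thread accesses `x` while holding a lock and a second thread accesses `x` with no lock held leaves the candidate set for `x` empty, so `x` is reported. A trace in which both threads hold the same lock around the access reports nothing.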
Publisher
ProQuest Dissertations & Theses
ISBN
1339054930, 9781339054933