CWLP: coordinated warp scheduling and locality-protected cache allocation on GPUs
by Zhang, Yang; Xing, Zuo-cheng; Liu, Cang; Tang, Chuan
in Chips (memory devices) / Communications Engineering / Computation / Computer Hardware / Computer Science / Computer Systems Organization and Communication Networks / Electrical Engineering / Electronics and Microelectronics / Energy efficiency / Graphics processing units / Instrumentation / Memory management / Multiprocessing / Networks / Personal computers / Scheduling / Supercomputers / Warp
2018
Journal Article
Overview
As we approach the exascale era in supercomputing, designing a balanced computer system with powerful computing ability and low power requirements has become increasingly important. The graphics processing unit (GPU) is an accelerator widely used in most recent supercomputers. It uses a large number of threads to hide long latencies with high energy efficiency. In contrast to their powerful computing ability, GPUs have only a few megabytes of fast on-chip memory storage per streaming multiprocessor (SM). The GPU cache is inefficient due to a mismatch between the throughput-oriented execution model and the cache hierarchy design. At the same time, current GPUs fail to handle burst-mode long-access latency because of their poor warp scheduling methods. Thus, the benefits of the GPU's high computing ability are reduced dramatically by poor cache management and warp scheduling methods, which limit system performance and energy efficiency. In this paper, we put forward a coordinated warp scheduling and locality-protected (CWLP) cache allocation scheme to make full use of data locality and hide latency. We first present a locality-protected cache allocation method based on the instruction program counter (LPC) to improve cache performance. Specifically, we use a PC-based locality detector to collect the reuse information of each cache line and employ a prioritised cache allocation unit (PCAU), which combines the data reuse information with the time-stamp information to evict the lines with the least reuse possibility. Moreover, the locality information is used by the warp scheduler to create an intelligent warp reordering scheme that captures locality and hides latency. Simulation results show that CWLP provides a speedup of up to 19.8% and an average improvement of 8.8% over the baseline methods.
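The eviction idea in the overview (combine per-PC reuse information with time stamps and evict the line least likely to be reused) can be illustrated with a small software model. This is only a rough sketch under stated assumptions, not the paper's hardware design: the class names, the reuse score (hits divided by fills per allocating PC), and the tie-break by oldest time stamp are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CacheLine:
    tag: int
    pc: int        # PC of the load that allocated this line
    last_use: int  # time stamp of the most recent access


class PCLocalityDetector:
    """Hypothetical detector: tracks fills and later hits per allocating PC."""

    def __init__(self):
        self.fills = defaultdict(int)
        self.reuses = defaultdict(int)

    def record_fill(self, pc):
        self.fills[pc] += 1

    def record_reuse(self, pc):
        self.reuses[pc] += 1

    def reuse_score(self, pc):
        # Assumed metric: fraction of this PC's fills that were hit again.
        fills = self.fills[pc]
        return self.reuses[pc] / fills if fills else 0.0


class CacheSet:
    """One cache set whose victim choice mixes reuse score and time stamp."""

    def __init__(self, ways, detector):
        self.ways = ways
        self.detector = detector
        self.lines = []
        self.clock = 0

    def access(self, tag, pc):
        """Return True on a hit, False on a miss (which allocates a line)."""
        self.clock += 1
        for line in self.lines:
            if line.tag == tag:
                self.detector.record_reuse(line.pc)
                line.last_use = self.clock
                return True
        self.detector.record_fill(pc)
        if len(self.lines) >= self.ways:
            # Evict the line with the lowest reuse score; break ties by age.
            victim = min(
                self.lines,
                key=lambda l: (self.detector.reuse_score(l.pc), l.last_use),
            )
            self.lines.remove(victim)
        self.lines.append(CacheLine(tag, pc, self.clock))
        return False


detector = PCLocalityDetector()
cache_set = CacheSet(ways=2, detector=detector)
cache_set.access(tag=0xA, pc=0x10)  # miss, allocated by PC 0x10
cache_set.access(tag=0xA, pc=0x10)  # hit, PC 0x10 now shows reuse
cache_set.access(tag=0xB, pc=0x20)  # miss, allocated by PC 0x20
cache_set.access(tag=0xC, pc=0x20)  # miss, evicts 0xB (plain LRU would evict 0xA)
```

In this toy trace the line allocated by the PC with observed reuse survives even though it is the least recently used, which is the behaviour the overview attributes to locality protection. The warp reordering side of CWLP is not modelled here.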
Publisher
Zhejiang University Press, Springer Nature B.V.