Works (4)

2019 journal article

Coordinated CTA Combination and Bandwidth Partitioning for GPU Concurrent Kernel Execution

ACM Transactions on Architecture and Code Optimization, 16(3).

By: Z. Lin, H. Dai, M. Mantor & H. Zhou

Source: Web of Science
Added: December 2, 2019

2016 conference paper

A model-driven approach to warp/thread-block level GPU cache bypassing

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).

By: H. Dai, C. Li, H. Zhou, S. Gupta, C. Kartsaklis & M. Mantor

Source: NC State University Libraries
Added: August 6, 2018

2015 conference paper

Analyzing graphics processor unit (GPU) instruction set architectures

IEEE International Symposium on Performance Analysis of Systems and Software, 155–156.

By: K. Mayank, H. Dai, J. Wei & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2014 conference paper

Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs

IEEE International Symposium on Performance Analysis of Systems and Software, 231–241.

By: C. Li, Y. Yang, H. Dai, S. Yan, F. Mueller & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018