Works (9)

2019 journal article

Coordinated CTA Combination and Bandwidth Partitioning for GPU Concurrent Kernel Execution

ACM Transactions on Architecture and Code Optimization, 16(3).

By: Z. Lin, H. Dai, M. Mantor & H. Zhou

Sources: Web of Science, ORCID
Added: December 2, 2019

2019 article

Exploring Memory Persistency Models for GPUs

2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT 2019), pp. 310–322.

By: Z. Lin, M. Alshboul, Y. Solihin & H. Zhou

Sources: Web of Science, ORCID
Added: August 10, 2020

2019 article

Scatter-and-Gather Revisited: High-Performance Side-Channel-Resistant AES on GPUs

12th Workshop on General Purpose Processing Using GPUs (GPGPU 12), pp. 2–11.

By: Z. Lin, U. Mathur & H. Zhou

Sources: Web of Science, ORCID
Added: July 22, 2019

2018 conference paper

Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls

2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).

Source: ORCID
Added: September 22, 2019

2018 journal article

GPU performance vs. thread-level parallelism: Scalability analysis and a novel way to improve TLP

ACM Transactions on Architecture and Code Optimization, 15(1).

By: Z. Lin, M. Mantor & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 conference paper

Enabling efficient preemption for SIMT architectures with lightweight context switching

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 898–908.

By: Z. Lin, L. Nyland & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 conference paper

Automatic data placement into GPU on-chip memory resources

2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 23–33.

By: C. Li, Y. Yang, Z. Lin & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 chapter

GLES: A Practical GPGPU Optimizing Compiler Using Data Sharing and Thread Coarsening

In Languages and Compilers for Parallel Computing.

Source: ORCID
Added: September 22, 2019

2014 conference paper

Implementation and evaluation of deep neural networks (DNN) on mainstream heterogeneous systems

Proceedings of 5th Asia-Pacific Workshop on Systems - APSys '14.

Source: ORCID
Added: September 22, 2019