Works (62)

2020 journal article

Exploring Convolution Neural Network for Branch Prediction

IEEE Access, 8, 152008–152016.

By: Y. Mao, H. Zhou, X. Gui & J. Shen

Source: ORCID
Added: August 27, 2020

2019 journal article

Coordinated CTA Combination and Bandwidth Partitioning for GPU Concurrent Kernel Execution

ACM Transactions on Architecture and Code Optimization, 16(3).

By: Z. Lin, H. Dai, M. Mantor & H. Zhou

Sources: Web Of Science, ORCID
Added: December 2, 2019

2019 journal article

Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation

IEEE Computer Architecture Letters, 18(2), 111–114.

By: H. Zhou & G. Byrd

Sources: Web Of Science, ORCID, Crossref
Added: September 23, 2019

2019 article

Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation

Liu, J., Byrd, G., & Zhou, H. (2019, December 9).

By: J. Liu, G. Byrd & H. Zhou

Source: ORCID
Added: December 30, 2019

2019 article

Scatter-and-Gather Revisited: High-Performance Side-Channel-Resistant AES on GPUs

12th Workshop on General Purpose Processing Using GPUs (GPGPU 12), pp. 2–11.

By: Z. Lin, U. Mathur & H. Zhou

Sources: Web Of Science, ORCID
Added: July 22, 2019

2018 journal article

Developing Noise-Resistant Three-Dimensional Single Particle Tracking Using Deep Neural Networks

Analytical Chemistry, 90(18), 10748–10757.

By: Y. Zhong, C. Li, H. Zhou & G. Wang

Sources: NC State University Libraries, ORCID
Added: October 16, 2018

2018 journal article

GPU performance vs. thread-level parallelism: Scalability analysis and a novel way to improve TLP

ACM Transactions on Architecture and Code Optimization, 15(1).

By: Z. Lin, M. Mantor & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2017 conference paper

Developing dynamic profiling and debugging support in OpenCL for FPGAs

Proceedings of the 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).

By: A. Verma, H. Zhou, S. Booth, R. King, J. Coole, A. Keep, J. Marshall, W. Feng

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2017 journal article

Methylation specific targeting of a chromatin remodeling complex from sponges to humans

Scientific Reports, 7.

By: J. Cramer, D. Pohlmann, F. Gomez, L. Mark, B. Kornegay, C. Hall, E. Siraliev-Perez, N. Walavalkar ...

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 journal article

A Cross-Platform SpMV Framework on Many-Core Architectures

ACM Transactions on Architecture and Code Optimization, 13(4), 1–25.

By: Y. Zhang, S. Li, S. Yan & H. Zhou

Sources: Crossref, ORCID
Added: January 28, 2020

2016 conference paper

A model-driven approach to warp/thread-block level GPU cache bypassing

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).

By: H. Dai, C. Li, H. Zhou, S. Gupta, C. Kartsaklis & M. Mantor

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 conference paper

Enabling efficient preemption for SIMT architectures with lightweight context switching

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 898–908.

By: Z. Lin, L. Nyland & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 conference paper

OpenCL-based erasure coding on heterogeneous architectures

IEEE international conference on application-specific systems, 33–40.

By: G. Chen, H. Zhou, X. Shen, J. Gahm, N. Venkat, S. Booth, J. Marshall

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 conference paper

Optimizing memory efficiency for deep convolutional neural networks on GPUs

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 633–644.

By: C. Li, Y. Yang, M. Feng, S. Chakradhar & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2016 conference paper

Tuning stencil codes in OpenCL for FPGAs

Proceedings of the 34th IEEE International Conference on Computer Design (ICCD), 249–256.

By: Q. Jia & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 conference paper

Analyzing graphics processor unit (GPU) instruction set architectures

IEEE international symposium on performance analysis of systems and, 155–156.

By: K. Mayank, H. Dai, J. Wei & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 conference paper

Automatic data placement into GPU on-chip memory resources

2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 23–33.

By: C. Li, Y. Yang, Z. Lin & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 journal article

CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications

Journal of Computer Science and Technology, 30(1), 3–19.

By: Y. Yang, C. Li & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2015 conference paper

Spatial locality-aware cache partitioning for effective cache sharing

2015 44th International Conference on Parallel Processing (ICPP), 150–159.

By: S. Gupta & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2014 chapter

A Highly Efficient FFT Using Shared-Memory Multiplexing

In Numerical Computations with GPUs (pp. 363–377).

By: Y. Yang & H. Zhou

Sources: Crossref, ORCID
Added: January 28, 2020

2014 journal article

CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications

ACM SIGPLAN Notices, 49(8), 93–105.

By: Y. Yang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2014 conference paper

Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs

IEEE international symposium on performance analysis of systems and, 231–241.

By: C. Li, Y. Yang, H. Dai, S. Yan, F. Mueller & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2014 conference paper

Warp-level divergence in GPUs: Characterization, impact, and mitigation

International symposium on high-performance computer, 284–295.

By: P. Xiang, Y. Yang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2014 journal article

yaSpMV: Yet another SpMV framework on GPUs

ACM SIGPLAN Notices, 49(8), 107–118.

By: S. Yan, C. Li, Y. Zhang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 conference paper

Adaptive cache bypassing for inclusive last level caches

IEEE 27th International Parallel and Distributed Processing Symposium (IPDPS 2013), 1243–1253.

By: S. Gupta, H. Gao & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 journal article

Architecting against software cache-based side-channel attacks

IEEE Transactions on Computers, 62(7), 1276–1288.

By: J. Kong, O. Aciicmez, J. Seifert & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 journal article

Locality principle revisited: A probability-based quantitative approach

Journal of Parallel and Distributed Computing, 73(7), 1011–1027.

By: S. Gupta, P. Xiang, Y. Yang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 journal article

The implementation of a high performance GPGPU compiler

International Journal of Parallel Programming, 41(6), 768–781.

By: Y. Yang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2012 journal article

A unified optimizing compiler framework for different GPGPU architectures

ACM Transactions on Architecture and Code Optimization, 9(2).

By: Y. Yang, P. Xiang, J. Kong, M. Mantor & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2012 conference paper

CPU-assisted GPGPU on fused CPU-GPU architectures

International symposium on high-performance computer, 103–114.

By: Y. Yang, P. Xiang, M. Mantor & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2012 conference paper

Fixing Performance Bugs: An Empirical Study of Open-Source GPGPU Programs

2012 41st International Conference on Parallel Processing. Presented at the 2012 41st International Conference on Parallel Processing (ICPP).

By: Y. Yang, P. Xiang, M. Mantor & H. Zhou

Event: 2012 41st International Conference on Parallel Processing (ICPP)

Sources: Crossref, ORCID
Added: January 28, 2020

2012 conference paper

Locality principle revisited: A probability-based quantitative approach

2012 IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS), 995–1009.

By: S. Gupta, P. Xiang, Y. Yang & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2011 journal article

Combining local and global history for high performance data prefetching

Journal of Instruction-Level Parallelism, 13, 1–14.

By: M. Dimitrov & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2011 conference paper

Time-Ordered Event Traces: A New Debugging Primitive for Concurrency Bugs

2011 IEEE International Parallel & Distributed Processing Symposium. Presented at the 2011 IEEE International Parallel & Distributed Processing Symposium (IPDPS).

By: M. Dimitrov & H. Zhou

Event: 2011 IEEE International Parallel & Distributed Processing Symposium (IPDPS)

Sources: Crossref, ORCID
Added: January 28, 2020

2010 conference paper

A GPGPU compiler for memory optimization and parallelism management

ACM SIGPLAN Notices, 45(6), 86–97.

By: Y. Yang, P. Xiang, J. Kong & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2010 journal article

An optimizing compiler for GPGPU programs with input-data sharing

ACM SIGPLAN Notices, 45(5), 343–344.

By: Y. Yang, P. Xiang, J. Kong & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2010 conference paper

An optimizing compiler for GPGPU programs with input-data sharing

ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 343–344.

By: Y. Yang, P. Xiang, J. Kong & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2010 conference paper

Improving privacy and lifetime of PCM-based main memory

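2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN). Presented at the 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).

By: J. Kong & H. Zhou

Event: 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN)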

Sources: Crossref, ORCID
Added: January 28, 2020

2009 conference paper

Anomaly-based bug prediction, isolation, and validation

Proceeding of the 14th international conference on Architectural support for programming languages and operating systems - ASPLOS '09. Presented at the Proceeding of the 14th international conference.

By: M. Dimitrov & H. Zhou

Event: Proceeding of the 14th international conference

Sources: Crossref, ORCID
Added: January 28, 2020

2009 conference paper

Hardware-software integrated approaches to defend against software cache-based side channel attacks

2009 IEEE 15th International Symposium on High Performance Computer Architecture. Presented at the HPCA - 15 2009. IEEE 15th International Symposium on High Performance Computer Architecture.

By: J. Kong, O. Aciicmez, J. Seifert & H. Zhou

Event: HPCA - 15 2009. IEEE 15th International Symposium on High Performance Computer Architecture

Sources: Crossref, ORCID
Added: January 28, 2020

2009 conference paper

Understanding software approaches for GPGPU reliability

Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units - GPGPU-2. Presented at the 2nd Workshop.

By: M. Dimitrov, M. Mantor & H. Zhou

Event: 2nd Workshop

Sources: Crossref, ORCID
Added: January 28, 2020

2008 conference paper

Address-branch correlation: A novel locality for long-latency hard-to-predict branches

2008 IEEE 14th International Symposium on High Performance Computer Architecture. Presented at the 2008 IEEE 14th International Symposium on High Performance Computer Architecture (HPCA).

By: H. Gao, Y. Ma, M. Dimitrov & H. Zhou

Event: 2008 IEEE 14th International Symposium on High Performance Computer Architecture (HPCA)

Sources: Crossref, ORCID
Added: January 28, 2020

2008 conference paper

Deconstructing new cache designs for thwarting software cache-based side channel attacks

Proceedings of the 2nd ACM workshop on Computer security architectures - CSAW '08. Presented at the 2nd ACM workshop.

By: J. Kong, O. Aciicmez, J. Seifert & H. Zhou

Event: the 2nd ACM workshop

Sources: Crossref, ORCID
Added: January 28, 2020

2007 journal article

Optimizing dual-core execution for power efficiency and transient-fault recovery

IEEE Transactions on Parallel and Distributed Systems, 18(8), 1080–1093.

By: Y. Ma, H. Gao, M. Dimitrov & H. Zhou

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2007 journal article

PMPM: Prediction by combining multiple partial matches

Journal of Instruction-Level Parallelism, 9, 1–18.

By: H. Gao & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2007 conference paper

Unified Architectural Support for Soft-Error Protection or Software Bug Detection

16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007). Presented at the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).

By: M. Dimitrov & H. Zhou

Event: 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007)

Sources: Crossref, ORCID
Added: January 28, 2020

2006 conference paper

Efficient Transient-Fault Tolerance for Multithreaded Processors Using Dual-Thread Execution

2006 International Conference on Computer Design. Presented at the 2006 International Conference on Computer Design.

By: Y. Ma & H. Zhou

Event: 2006 International Conference on Computer Design

Sources: Crossref, ORCID
Added: January 28, 2020

2006 conference paper

Improving software security via runtime instruction-level taint checking

Proceedings of the 1st workshop on Architectural and system support for improving software dependability - ASID '06. Presented at the 1st workshop.

By: J. Kong, C. Zou & H. Zhou

Event: the 1st workshop

Sources: Crossref, ORCID
Added: January 28, 2020

2006 journal article

Using index functions to reduce conflict aliasing in branch prediction tables

IEEE Transactions on Computers, 55(8), 1057–1061.

By: Y. Ma, H. Gao & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2005 journal article

A case for fault tolerance and performance enhancement using chip multi-processors

IEEE Computer Architecture Letters, 4, 1–4.

By: H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2005 journal article

Adaptive information processing: an effective way to improve perceptron branch predictors

Journal of Instruction-Level Parallelism, 7, 1–10.

By: H. Gao & H. Zhou

Source: NC State University Libraries
Added: August 6, 2018

2005 conference paper

Code size efficiency in global scheduling for ILP processors

Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures. Presented at the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.

By: H. Zhou & T. Conte

Event: Sixth Annual Workshop on Interaction between Compilers and Computer Architectures

Sources: Crossref, ORCID
Added: January 28, 2020

2003 conference paper

Detecting global stride locality in value streams

30th Annual International Symposium on Computer Architecture, 2003. Proceedings. Presented at the ISCA 2003: 30th International Symposium on Computer Architecture.

By: H. Zhou, J. Flanagan & T. Conte

Event: ISCA 2003: 30th International Symposium on Computer Architecture

Sources: Crossref, ORCID
Added: January 28, 2020

2005 conference paper

Dual-core execution: building a highly scalable single-thread instruction window

14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05). Presented at the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).

By: H. Zhou

Event: 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)

Sources: Crossref, ORCID
Added: January 28, 2020

2005 journal article

Enhancing memory-level parallelism via recovery-free value prediction

IEEE Transactions on Computers, 54, 897–912.

By: H. Zhou & T. Conte

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2003 journal article

Adaptive mode control: A static-power-efficient cache design

ACM Transactions on Embedded Computing Systems, 2(3), 347–372.

By: H. Zhou, M. Toburen, E. Rotenberg & T. Conte

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2003 chapter

Tree Traversal Scheduling: A Global Instruction Scheduling Technique for VLIW/EPIC Processors

In Languages and Compilers for Parallel Computing (Vol. 2624, pp. 223–238).

By: H. Zhou, M. Jennings & T. Conte

Sources: NC State University Libraries, ORCID, Crossref
Added: August 6, 2018

2001 conference paper

Adaptive mode control: A static-power-efficient cache design

2001 International Conference on Parallel Architectures and Compilation Techniques: Proceedings: 8-12 September, 2001, Barcelona, Catalunya, Spain, 61–70.

By: H. Zhou, M. Toburen, E. Rotenberg & T. Conte

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2000 journal article

Automatic IC orientation checks

Machine Vision and Applications, 12(3), 107–112.

By: A. Kassim, H. Zhou & S. Ranganath

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

1998 journal article

A fast algorithm for detecting die extrusion defects in IC packages

Machine Vision and Applications, 11(1), 37–41.

By: H. Zhou, A. Kassim & S. Ranganath

Sources: NC State University Libraries, ORCID
Added: August 6, 2018

1996 journal article

Test sequencing and diagnosis in electrical system with decision table

Microelectronics and Reliability, 36(9), 1167–1175.

By: H. Zhou, L. Qu & A. Li

Sources: NC State University Libraries, ORCID
Added: August 6, 2018