Works (120)
2025 article
CoopRT: Accelerating BVH Traversal for Ray Tracing via Cooperative Threads
Tozlu, Y. S., & Zhou, H. (2025, June 20).
2025 article
CryptoBTB: A Secure Hierarchical BTB for Diverse Instruction Footprint Workloads
Adak, D., Rotenberg, E., Awad, A., & Zhou, H. (2025, October 17).
2025 article
Genesis: A Compiler for Hamiltonian Simulation on Hybrid CV-DV Quantum Computers
Chen, Z., Li, J., Guo, M., Chen, H., Li, Z., Bierman, J., … Zhang, E. Z. (2025, June 20).
2025 article
Q-Cluster: Quantum Error Mitigation Through Noise-Aware Unsupervised Learning
Patil, H. P., Baron, D., & Zhou, H. (2025, August 30).
2025 article
SpecMPK: Efficient In-Process Isolation with Speculative and Secure Permission Update Instruction
Adak, D., Zhou, H., Rotenberg, E., & Awad, A. (2025, March 1). 2025 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE, HPCA, pp. 394–408.
2024 article
BoostCom: Towards Efficient Universal Fully Homomorphic Encryption by Boosting the Word-wise Comparisons
Yudha, A. W. B., Xue, J., Lou, Q., Zhou, H., & Solihin, Y. (2024, October 11). PROCEEDINGS OF THE 2024 THE INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PACT 2024, pp. 121–132.
2024 article
Delta Counter: Bandwidth-Efficient Encryption Counter Representation for Secure GPU Memory
Yuan, S., Awad, A., & Zhou, H. (2024, April 16). IEEE Transactions on Dependable and Secure Computing, Vol. 22, pp. 101–113.
2024 article
Posters Program: 2024 IEEE International Conference on Quantum Computing and Engineering
Chen, F., & Zhou, H. (2024, September 15). 2024 IEEE INTERNATIONAL CONFERENCE ON QUANTUM COMPUTING AND ENGINEERING, QCE, VOL 2, pp. LXXIX-LXXIX.
2024 article
QuTracer: Mitigating Quantum Gate and Measurement Errors by Tracing Subsets of Qubits
Li, P., Liu, J., Gonzales, A., Saleem, Z. H., Zhou, H., & Hovland, P. (2024, June 29). 2024 ACM/IEEE 51ST ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2024, pp. 103–117.
2024 article
SEFsim: A Statistically-Guided Fast DRAM Simulator
Adak, D., Lee, H., Feinberg, B., Voskuilen, G., Hughes, C., Zhou, H., & Awad, A. (2024, May 5). PROCEEDINGS OF THE 2024 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, ISPASS 2024, pp. 304–306.
2024 article
Salus: Efficient Security Support for CXL-Expanded GPU Memory
Abdullah, R., Lee, H., Zhou, H., & Awad, A. (2024, March 2). 2024 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA 2024, pp. 233–248.
2024 article
Tetris: A Compilation Framework for VQA Applications in Quantum Computing
Jin, Y., Li, Z., Hua, F., Hao, T., Zhou, H., Huang, Y., & Zhang, E. Z. (2024, June 29). 2024 ACM/IEEE 51ST ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2024, pp. 277–292.
2023 article
An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory
Long, X., Gong, X., Zhang, B., & Zhou, H. (2023, February 14). Journal of Grid Computing, Vol. 21.
2023 article
Enhancing Virtual Distillation with Circuit Cutting for Quantum Error Mitigation
Li, P., Liu, J., Patil, H. P., Hovland, P., & Zhou, H. (2023, November 6). 2023 IEEE 41ST INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, ICCD, pp. 94–101.
2023 article
PBVR: Physically Based Rendering in Virtual Reality
Tozlu, Y. S., & Zhou, H. (2023, October 1). 2023 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, IISWC, pp. 77–86.
2023 article
Plutus: Bandwidth-Efficient Memory Security for GPUs
Abdullah, R., Zhou, H., & Awad, A. (2023, February 1). 2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, pp. 543–555.
2023 article
SecPB: Architectures for Secure Non-Volatile Memory with Battery-Backed Persist Buffers
Freij, A., Zhou, H., & Solihin, Y. (2023, February 1). 2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, pp. 677–690.
2022 article
Adaptive Security Support for Heterogeneous Memory on GPUs
Yuan, S., Awad, A., Yudha, A. W. B., Solihin, Y., & Zhou, H. (2022, April 1). 2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022).
2022 article
Deep learning based data prefetching in CPU-GPU unified virtual memory
Long, X., Gong, X., Zhang, B., & Zhou, H. (2022, December 12). Journal of Parallel and Distributed Computing, Vol. 174, pp. 19–31.
2022 conference paper
Exploiting Quantum Assertions for Error Mitigation and Quantum Program Debugging
2022 IEEE 40TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2022), 124–131.
2022 article
LITE
Yudha, A. W. B., Meyer, J., Yuan, S., Zhou, H., & Solihin, Y. (2022, June 16). PROCEEDINGS OF THE 36TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2022.
2022 article
Not All SWAPs Have the Same Cost: A Case for Optimization-Aware Qubit Routing
Liu, J., Li, P., & Zhou, H. (2022, April 1). 2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022).
2021 article
A Survey of GPU Multitasking Methods Supported by Hardware Architecture
Zhao, C., Gao, W., Nie, F., & Zhou, H. (2021, September 27). IEEE Transactions on Parallel and Distributed Systems, Vol. 33, pp. 1451–1463.
2021 article
Analyzing Secure Memory Architecture for GPUs
Yuan, S., Yudha, A. W. B., Solihin, Y., & Zhou, H. (2021, March 1). 2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021).
2021 article
PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint
Ravi, J., Nguyen, T., Zhou, H., & Becchi, M. (2021, December 1). 2021 IEEE 28TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC 2021), Vol. 12, pp. 442–447.
2021 article
Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits
Liu, J., Bello, L., & Zhou, H. (2021, February 27). CGO '21: PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO).
2021 article
Systematic Approaches for Precise and Approximate Quantum State Runtime Assertion
Liu, J., & Zhou, H. (2021, February 1). 2021 27TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2021).
2020 article
Exploring Convolution Neural Network for Branch Prediction
Mao, Y., Zhou, H., Gui, X., & Shen, J. (2020, January 1). IEEE Access, Vol. 8, pp. 152008–152016.
2020 article
Fair and cache blocking aware warp scheduling for concurrent kernel execution on GPU
Zhao, C., Gao, W., Nie, F., Wang, F., & Zhou, H. (2020, May 21). Future Generation Computer Systems, Vol. 112, pp. 1093–1105.
2020 conference paper
Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation
ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 1017–1030.
2020 article
Reliability Modeling of NISQ-Era Quantum Computers
Liu, J., & Zhou, H. (2020, October 1). 2020 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC 2020).
2020 article
Scalable and Fast Lazy Persistency on GPUs
Yudha, A. W. B., Kimura, K., Zhou, H., & Solihin, Y. (2020, October 1). 2020 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC 2020).
2019 article
Coordinated CTA Combination and Bandwidth Partitioning for GPU Concurrent Kernel Execution
Lin, Z., Dai, H., Mantor, M., & Zhou, H. (2019, June 17). ACM Transactions on Architecture and Code Optimization, Vol. 16.
2019 conference paper
Exploring Memory Persistency Models for GPUs
28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 310–322.
2019 conference paper
In-Place Zero-Space Memory Protection for CNN
In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 32). San Mateo, CA: Morgan Kaufmann Publishers.
2019 journal article
Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation
IEEE Computer Architecture Letters, 18(2), 111–114.
Contributors: G. Byrd
2019 article
Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation
Liu, J., Byrd, G., & Zhou, H. (2019, December 9).
2019 article
Scatter-and-Gather Revisited
Lin, Z., Mathur, U., & Zhou, H. (2019, April 10). 12TH WORKSHOP ON GENERAL PURPOSE PROCESSING USING GPUS (GPGPU 12), pp. 2–11.
2018 conference paper
Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls
2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
2018 article
Developing Noise-Resistant Three-Dimensional Single Particle Tracking Using Deep Neural Networks
Zhong, Y., Li, C., Zhou, H., & Wang, G. (2018, August 24). Analytical Chemistry, Vol. 90, pp. 10748–10757.
2018 article
GPU Performance vs. Thread-Level Parallelism
Lin, Z., Mantor, M., & Zhou, H. (2018, March 22). ACM Transactions on Architecture and Code Optimization.
2017 article
Developing Dynamic Profiling and Debugging Support in OpenCL for FPGAs
Verma, A., Zhou, H., Booth, S., King, R., Coole, J., Keep, A., … Feng, W.-chun. (2017, June 13). Proceedings of the 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
2017 conference paper
EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 3–16.
2017 report
Exploring deep neural networks for branch prediction
[Technical Report]. https://people.engr.ncsu.edu/hzhou/CNN_DBN_zhou_2017.pdf
2017 article
Methylation specific targeting of a chromatin remodeling complex from sponges to humans
Cramer, J. M., Pohlmann, D., Gomez, F., Mark, L., Kornegay, B., Hall, C., … Williams, D. C. (2017, January 17). Scientific Reports, Vol. 7.
2017 conference paper
The Demand for a Sound Baseline in GPU Memory Architecture Research
14th Annual Workshop on Duplicating, Deconstructing and Debunking (WDDD). Presented at the Workshop on Duplicating, Deconstructing and Debunking, Toronto, Canada. https://people.engr.ncsu.edu/hzhou/Hongwen_WDDD2017.pdf
2016 journal article
A Cross-Platform SpMV Framework on Many-Core Architectures
ACM Transactions on Architecture and Code Optimization, 13(4), 1–25.
2016 article
A model-driven approach to warp/thread-block level GPU cache bypassing
Dai, H., Li, C., Zhou, H., Gupta, S., Kartsaklis, C., & Mantor, M. (2016, May 25). 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
2016 article
Enabling Efficient Preemption for SIMT Architectures with Lightweight Context Switching
Lin, Z., Nyland, L., & Zhou, H. (2016, November 1). SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 898–908.
2016 article
OpenCL-based erasure coding on heterogeneous architectures
Chen, N. G., Zhou, H., Shen, N. X., Gahm, J., Venkat, N., Booth, S., & Marshall, J. (2016, July 1). IEEE International Conference on Application-Specific Systems, Vol. 7, pp. 33–40.
2016 article
Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs
Li, C., Yang, Y., Feng, M., Chakradhar, S., & Zhou, H. (2016, November 1). SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 633–644.
2016 conference paper
Selective GPU Cache Bypassing for Un-Coalesced Loads
In X. Liao (Ed.), 22nd IEEE International Conference on Parallel and Distributed Systems: ICPADS 2016: Proceedings: 13-16 December 2016, Wuhan, Hubei, China.
2016 article
Tuning Stencil codes in OpenCL for FPGAs
Jia, Q., & Zhou, H. (2016, October 1). Proceedings of the 34th IEEE International Conference on Computer Design (ICCD), pp. 249–256.
2015 conference paper
An Optimized AMPM-based Prefetcher Coupled with Configurable Cache Line Sizing
JILP Workshop on Computer Architecture Competitions (JWAC): 2nd Data Prefetching Championship (DPC2).
2015 article
Analyzing graphics processor unit (GPU) instruction set architectures
Mayank, K., Dai, H., Wei, J., & Zhou, H. (2015, March 1). IEEE International Symposium on Performance Analysis of Systems and, pp. 155–156.
2015 article
Automatic data placement into GPU on-chip memory resources
Li, C., Yang, Y., Lin, Z., & Zhou, H. (2015, February 1). 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 23–33.
2015 article
CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications
Yang, Y., Li, C., & Zhou, H. (2015, January 1). Journal of Computer Science and Technology, Vol. 30, pp. 3–19.
2015 conference paper
Locality-Driven Dynamic GPU Cache Bypassing
ICS '15: Proceedings of the 29th ACM on International Conference on Supercomputing, 61–77.
2015 conference paper
Revisiting ILP Designs for Throughput-Oriented GPGPU Architecture
Proceedings of the 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 121–130.
2015 article
Spatial Locality-Aware Cache Partitioning for Effective Cache Sharing
Gupta, S., & Zhou, H. (2015, September 1). 2015 44th International Conference on Parallel Processing (ICPP), pp. 150–159.
2014 conference paper
A Case for a Flexible Scalar Unit in SIMT Architecture
Proceedings of 2014 IEEE 28th International Parallel and Distributed Processing Symposium, Phoenix, AZ.
2014 chapter
A Highly Efficient FFT Using Shared-Memory Multiplexing
In Numerical Computations with GPUs (pp. 363–377).
2014 article
CUDA-NP
Yang, Y., & Zhou, H. (2014, February 6). ACM SIGPLAN Notices, Vol. 49, pp. 93–105.
2014 conference paper
Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs
IEEE International Symposium on Performance Analysis of Systems and, 231–241.
2014 article
Warp-level divergence in GPUs: Characterization, impact, and mitigation
Xiang, P., Yang, Y., & Zhou, H. (2014, February 1). International Symposium on High-Performance Computer, pp. 284–295.
2014 conference paper
yaSpM: Yet Another SpMV Framework on GPUs
Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 49(8), 107–118.
2013 article
Adaptive Cache Bypassing for Inclusive Last Level Caches
Gupta, S., Gao, H., & Zhou, H. (2013, May 1). IEEE 27th International Parallel and Distributed Processing Symposium (IPDPS 2013), pp. 1243–1253.
2013 journal article
Analyzing locality of memory references in GPU architectures
MSPC '13: Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 6.
2013 conference paper
Exploiting Uniform Vector Instructions for GPGPU Performance, Energy Efficiency, and Opportunistic Reliability Enhancement
Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, 433–442.
2013 article
Locality principle revisited: A probability-based quantitative approach
Gupta, S., Xiang, P., Yang, Y., & Zhou, H. (2013, February 5). Journal of Parallel and Distributed Computing, Vol. 73, pp. 1011–1027.
2012 article
A unified optimizing compiler framework for different GPGPU architectures
Yang, Y., Xiang, P., Kong, J., Mantor, M., & Zhou, H. (2012, June 1). ACM Transactions on Architecture and Code Optimization, Vol. 9.
2012 article
Architecting against Software Cache-Based Side-Channel Attacks
Kong, J., Aciicmez, O., Seifert, J.-P., & Zhou, H. (2012, April 12). IEEE Transactions on Computers, Vol. 62, pp. 1276–1288.
2012 conference paper
CPU-assisted GPGPU on fused CPU-GPU architectures
International Symposium on High-Performance Computer, 103–114.
2012 conference paper
Fixing Performance Bugs: An Empirical Study of Open-Source GPGPU Programs
2012 41st International Conference on Parallel Processing (ICPP).
2012 article
Locality Principle Revisited: A Probability-Based Quantitative Approach
Gupta, S., Xiang, P., Yang, Y., & Zhou, H. (2012, May 1). 2012 IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS), pp. 995–1009.
2012 conference paper
Shared Memory Multiplexing: A Novel Way to Improve GPGPU Throughput
Proceedings of the 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), Minneapolis, MN, USA.
2012 article
The Implementation of a High Performance GPGPU Compiler
Yang, Y., & Zhou, H. (2012, November 8). International Journal of Parallel Programming, Vol. 41, pp. 768–781.
2011 journal article
Combining Local and Global History for High Performance Data Prefetching
Journal of Instruction-Level Parallelism (JILP), 13, 1–14.
2011 conference paper
Developing a High Performance GPGPU Compiler using Cetus
Proceedings of the Cetus Users and Compiler Infrastructure Workshop, International Conference on Parallel Architectures and Compilation Techniques (PACT’11).
2011 journal article
Exploring Correlation for Indirect Branch Prediction
2nd JILP Workshop on Computer Architecture Competitions (JWAC-2): Championship Branch Prediction, held with ISCA-38.
2011 conference paper
Time-Ordered Event Traces: A New Debugging Primitive for Concurrency Bugs
2011 IEEE International Parallel & Distributed Processing Symposium (IPDPS).
2010 article
A GPGPU compiler for memory optimization and parallelism management
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010, June 5). ACM SIGPLAN Notices, Vol. 45, pp. 86–97.
2010 conference paper
Accelerating MATLAB Image Processing Toolbox Functions on GPUs
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, 75–85.
2010 article
An optimizing compiler for GPGPU programs with input-data sharing
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010, January 9). ACM SIGPLAN Notices, Vol. 45, pp. 343–344.
2010 article
An optimizing compiler for GPGPU programs with input-data sharing
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010, January 9). ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pp. 343–344.
2010 conference paper
Improving privacy and lifetime of PCM-based main memory
2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).
2009 conference paper
Anomaly-based bug prediction, isolation, and validation
Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09).
2009 conference paper
Hardware-software integrated approaches to defend against software cache-based side channel attacks
2009 IEEE 15th International Symposium on High Performance Computer Architecture (HPCA-15).
2009 conference paper
Understanding software approaches for GPGPU reliability
Proceedings of the 2nd Workshop on General Purpose Processing on Graphics Processing Units (GPGPU-2).
2008 conference paper
Address-branch correlation: A novel locality for long-latency hard-to-predict branches
2008 IEEE 14th International Symposium on High Performance Computer Architecture (HPCA).
2008 conference paper
Deconstructing new cache designs for thwarting software cache-based side channel attacks
Proceedings of the 2nd ACM Workshop on Computer Security Architectures (CSAW '08).
2007 article
Optimizing Dual-Core Execution for Power Efficiency and Transient-Fault Recovery
Ma, Y., Gao, H., Dimitrov, M., & Zhou, H. (2007, August 1). IEEE Transactions on Parallel and Distributed Systems, Vol. 18, pp. 1080–1093.
2007 journal article
PMPM: Prediction by combining multiple partial matches
Journal of Instruction-Level Parallelism, 9, 1–18.
2007 conference paper
Unified Architectural Support for Soft-Error Protection or Software Bug Detection
16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
2006 conference paper
Efficient Transient-Fault Tolerance for Multithreaded Processors Using Dual-Thread Execution
2006 International Conference on Computer Design.
2006 conference paper
Improving software security via runtime instruction-level taint checking
Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability (ASID '06).
2006 conference paper
Locality-based Information Redundancy for Processor Reliability
2nd Workshop on Architectural Reliability (WAR-2) held in conjunction with 39th International Symposium on Microarchitecture (MICRO-39), 29–36.
2006 conference paper
PMPM: Prediction by Combining Multiple Partial Matches
2nd Championship Branch Prediction (CBP-2) held with the 39th International Symposium on Microarchitecture (MICRO-39), 19–24.
2006 journal article
Using index functions to reduce conflict aliasing in branch prediction tables
IEEE Transactions on Computers, 55(8), 1057–1061.
2005 journal article
A case for fault tolerance and performance enhancement using chip multi-processors
IEEE Computer Architecture Letters, 4, 1–4.
2005 journal article
Adaptive information processing: an effective way to improve perceptron branch predictors
Journal of Instruction-Level Parallelism, 7, 1–10.
2005 conference paper
Code size efficiency in global scheduling for ILP processors
Proceedings of the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.
2005 conference paper
Detecting global stride locality in value streams
Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA 2003).
2005 conference paper
Dual-core execution: building a highly scalable single-thread instruction window
14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
2005 article
Enhancing Memory-Level Parallelism via Recovery-Free Value Prediction
Zhou, H., & Conte, T. M. (2005, May 24). IEEE Transactions on Computers, Vol. 54, pp. 897–912.
2004 conference paper
Adaptive Information Processing: An Effective Way to Improve Perceptron Branch Predictors
1st Championship Branch Prediction (CBP-1) held with the 37th International Symposium on Microarchitecture (MICRO-37).
2003 article
Adaptive mode control
Zhou, H., Toburen, M. C., Rotenberg, E., & Conte, T. M. (2003, August 1). ACM Transactions on Embedded Computing Systems, Vol. 2, pp. 347–372.
2003 report
Code size aware compilation for real-time applications
[Technical Report]. Computer Science Department, University of Central Florida.
2003 conference paper
Enhancing Memory Level Parallelism via Recovery-Free Value Prediction
The 2003 International Conference on Supercomputing (ICS'03), 326–335.
2003 report
Performance modeling of memory latency hiding techniques
[Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
2003 chapter
Tree Traversal Scheduling: A Global Instruction Scheduling Technique for VLIW/EPIC Processors
In Languages and Compilers for Parallel Computing (Vol. 2624, pp. 223–238).
2002 article
Adaptive mode control: a static-power-efficient cache design
Zhou, H., Toburen, M. C., Rotenberg, E., & Conte, T. M. (2002, November 13). 2001 International Conference on Parallel Architectures and Compilation Techniques: Proceedings: 8-12 September, 2001, Barcelona, Catalunya, Spain, pp. 61–70.
2002 report
Using Performance Bounds to Guide Pre-scheduling Code Optimizations
[Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
2001 report
A Treegion-based Unified Approach to Speculation and Predication in Global Instruction Scheduling
[Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
2001 report
A study of value speculative execution and mispeculation recovery in superscalar microprocessors
[Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
2000 report
Adaptive Mode Control: A Low-Leakage Power-Efficient Cache Design
[Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
2000 article
Automatic IC orientation checks
Kassim, A. A., Zhou, H., & Ranganath, S. (2000, October 1). Machine Vision and Applications, Vol. 12, pp. 107–112.
1998 article
A fast algorithm for detecting die extrusion defects in IC packages
Zhou, H., Kassim, A. A., & Ranganath, S. (1998, June 1). Machine Vision and Applications, Vol. 11, pp. 37–41.
1996 article
Test sequencing and diagnosis in electronic system with decision table
Zhou, H., Qu, L., & Li, A. (1996, September 1). Microelectronics Reliability, Vol. 36, pp. 1167–1175.