Huiyang Zhou

Yudha, A. W. B., Xue, J., Lou, Q., Zhou, H., & Solihin, Y. (2024). BoostCom: Towards Efficient Universal Fully Homomorphic Encryption by Boosting the Word-wise Comparisons. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques (PACT 2024), pp. 121–132.
Yuan, S., Awad, A., & Zhou, H. (2025). Delta Counter: Bandwidth-Efficient Encryption Counter Representation for Secure GPU Memory. IEEE Transactions on Dependable and Secure Computing.
Li, P., Liu, J., Gonzales, A., Saleem, Z. H., Zhou, H., & Hovland, P. (2024). QuTracer: Mitigating Quantum Gate and Measurement Errors by Tracing Subsets of Qubits. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA 2024), pp. 103–117.
Abdullah, R., Lee, H., Zhou, H., & Awad, A. (2024). Salus: Efficient Security Support for CXL-Expanded GPU Memory. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2024), pp. 233–248.
Jin, Y., Li, Z., Hua, F., Hao, T., Zhou, H., Huang, Y., & Zhang, E. Z. (2024). Tetris: A Compilation Framework for VQA Applications in Quantum Computing. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA 2024), pp. 277–292.
Long, X., Gong, X., Zhang, B., & Zhou, H. (2023). An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory. Journal of Grid Computing, 21(1).
Li, P., Liu, J., Patil, H. P., Hovland, P., & Zhou, H. (2023). Enhancing Virtual Distillation with Circuit Cutting for Quantum Error Mitigation. 2023 IEEE 41st International Conference on Computer Design (ICCD), pp. 94–101.
Tozlu, Y. S., & Zhou, H. (2023). PBVR: Physically Based Rendering in Virtual Reality. 2023 IEEE International Symposium on Workload Characterization (IISWC), pp. 77–86.
Abdullah, R., Zhou, H., & Awad, A. (2023). Plutus: Bandwidth-Efficient Memory Security for GPUs. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 543–555.
Freij, A., Zhou, H., & Solihin, Y. (2023). SecPB: Architectures for Secure Non-Volatile Memory with Battery-Backed Persist Buffers. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 677–690.
Yuan, S., Awad, A., Yudha, A. W. B., Solihin, Y., & Zhou, H. (2022). Adaptive Security Support for Heterogeneous Memory on GPUs. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), pp. 213–228.
Long, X., Gong, X., Zhang, B., & Zhou, H. (2023). Deep learning based data prefetching in CPU-GPU unified virtual memory. Journal of Parallel and Distributed Computing, 174, 19–31.
Li, P., Liu, J., Li, Y., & Zhou, H. (2022). Exploiting Quantum Assertions for Error Mitigation and Quantum Program Debugging. 2022 IEEE 40th International Conference on Computer Design (ICCD 2022), 124–131.
Yudha, A. W. B., Meyer, J., Yuan, S., Zhou, H., & Solihin, Y. (2022). LITE: A Low-Cost Practical Inter-Operable GPU TEE. Proceedings of the 36th ACM International Conference on Supercomputing (ICS 2022).
Liu, J., Li, P., & Zhou, H. (2022). Not All SWAPs Have the Same Cost: A Case for Optimization-Aware Qubit Routing. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), pp. 709–725.
Zhao, C., Gao, W., Nie, F., & Zhou, H. (2022). A Survey of GPU Multitasking Methods Supported by Hardware Architecture. IEEE Transactions on Parallel and Distributed Systems, 33(6), 1451–1463.
Yuan, S., Yudha, A. W. B., Solihin, Y., & Zhou, H. (2021). Analyzing Secure Memory Architecture for GPUs. 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2021), pp. 59–69.
Ravi, J., Nguyen, T., Zhou, H., & Becchi, M. (2021). PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint. 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC 2021), pp. 442–447.
Liu, J., Bello, L., & Zhou, H. (2021). Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits. CGO '21: Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 301–314.
Liu, J., & Zhou, H. (2021). Systematic Approaches for Precise and Approximate Quantum State Runtime Assertion. 2021 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2021), pp. 179–193.
Mao, Y., Zhou, H., Gui, X., & Shen, J. (2020). Exploring Convolution Neural Network for Branch Prediction. IEEE Access, 8, 152008–152016.
Zhao, C., Gao, W., Nie, F., Wang, F., & Zhou, H. (2020). Fair and cache blocking aware warp scheduling for concurrent kernel execution on GPU. Future Generation Computer Systems-The International Journal of eScience, 112, 1093–1105.
Liu, J., Byrd, G., & Zhou, H. (2020). Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation. ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 1017–1030.
Liu, J., & Zhou, H. (2020). Reliability Modeling of NISQ-Era Quantum Computers. 2020 IEEE International Symposium on Workload Characterization (IISWC 2020), pp. 94–105.
Yudha, A. W. B., Kimura, K., Zhou, H., & Solihin, Y. (2020). Scalable and Fast Lazy Persistency on GPUs. 2020 IEEE International Symposium on Workload Characterization (IISWC 2020), pp. 252–263.
Lin, Z., Dai, H., Mantor, M., & Zhou, H. (2019). Coordinated CTA Combination and Bandwidth Partitioning for GPU Concurrent Kernel Execution. ACM Transactions on Architecture and Code Optimization, 16(3).
Lin, Z., Alshboul, M., Solihin, Y., & Zhou, H. (2019). Exploring Memory Persistency Models for GPUs. 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 310–322.
Guan, H., Ning, L., Lin, Z., Shen, X., Zhou, H., & Lim, S. (2019). In-Place Zero-Space Memory Protection for CNN. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 32). San Mateo, CA: Morgan Kaufmann Publishers.
Zhou, H., & Byrd, G. T. (2019). Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation. IEEE Computer Architecture Letters, 18(2), 111–114.
Liu, J., Byrd, G., & Zhou, H. (2019, December 9). Quantum Circuits for Dynamic Runtime Assertions in Quantum Computation.
Lin, Z., Mathur, U., & Zhou, H. (2019). Scatter-and-Gather Revisited: High-Performance Side-Channel-Resistant AES on GPUs. 12th Workshop on General Purpose Processing Using GPUs (GPGPU 12), pp. 2–11.
Dai, H., Lin, Z., Li, C., Zhao, C., Wang, F., Zheng, N., & Zhou, H. (2018). Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls. 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
Zhong, Y., Li, C., Zhou, H., & Wang, G. (2018). Developing Noise-Resistant Three-Dimensional Single Particle Tracking Using Deep Neural Networks. Analytical Chemistry, 90(18), 10748–10757.
Lin, Z., Mantor, M., & Zhou, H. (2018). GPU Performance vs. Thread-Level Parallelism: Scalability Analysis and a Novel Way to Improve TLP. ACM Transactions on Architecture and Code Optimization, 15(1).
Verma, A., Zhou, H., Booth, S., King, R., Coole, J., Keep, A., … Feng, W.-chun. (2017). Developing Dynamic Profiling and Debugging Support in OpenCL for FPGAs. Proceedings of the 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
Chen, G., Zhao, Y., Shen, X., & Zhou, H. (2017). EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU. PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 3–16.
Mao, Y., Zhou, H., & Gui, X. (2017). Exploring deep neural networks for branch prediction [Technical Report]. Electrical and Computer Engineering Department, N.C. State University.
Cramer, J. M., Pohlmann, D., Gomez, F., Mark, L., Kornegay, B., Hall, C., … Williams, D. C., Jr. (2017). Methylation specific targeting of a chromatin remodeling complex from sponges to humans. Scientific Reports, 7.
Dai, H., Li, C., Lin, Z., & Zhou, H. (2017). The Demand for a Sound Baseline in GPU Memory Architecture Research. 14th Annual Workshop on Duplicating, Deconstructing and Debunking (WDDD). Presented at the Workshop on Duplicating, Deconstructing and Debunking, Toronto, Canada.
Zhang, Y., Li, S., Yan, S., & Zhou, H. (2016). A Cross-Platform SpMV Framework on Many-Core Architectures. ACM Transactions on Architecture and Code Optimization, 13(4), 1–25.
Dai, H., Li, C., Zhou, H., Gupta, S., Kartsaklis, C., & Mantor, M. (2016). A Model-Driven Approach to Warp/Thread-Block Level GPU Cache Bypassing. 2016 ACM/EDAC/IEEE Design Automation Conference (DAC).
Lin, Z., Nyland, L., & Zhou, H. (2016). Enabling efficient preemption for SIMT architectures with lightweight context switching. SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 898–908.
Chen, G. Y., Zhou, H., Shen, X., Gahm, J., Venkat, N., Booth, S., & Marshall, J. (2016). OpenCL-based erasure coding on heterogeneous architectures. IEEE International Conference on Application-Specific Systems, 7, 33–40.
Li, C., Yang, Y., Feng, M., Chakradhar, S., & Zhou, H. (2016). Optimizing memory efficiency for deep convolutional neural networks on GPUs. SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 633–644.
Zhao, C., Wang, F., Lin, Z., Zhou, H., & Zheng, N. (2016). Selective GPU Cache Bypassing for Un-Coalesced Loads. In X. Liao (Ed.), 22nd IEEE International Conference on Parallel and Distributed Systems: ICPADS 2016: Proceedings: 13-16 December 2016, Wuhan, Hubei, China.
Jia, Q., & Zhou, H. (2016). Tuning stencil codes in OpenCL for FPGAs. Proceedings of the 34th IEEE International Conference on Computer Design (ICCD), 249–256.
Jia, Q., Padia, M. B., Amboju, K., & Zhou, H. (2015). An Optimized AMPM-based Prefetcher Coupled with Configurable Cache Line Sizing. JILP Workshop on Computer Architecture Competitions (JWAC): 2nd Data Prefetching Championship (DPC2).
Mayank, K., Dai, H. W., Wei, J. Z., & Zhou, H. (2015). Analyzing graphics processor unit (GPU) instruction set architectures. IEEE International Symposium on Performance Analysis of Systems and Software, 155–156.
Li, C., Yang, Y., Lin, Z., & Zhou, H. (2015). Automatic data placement into GPU on-chip memory resources. 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 23–33.
Yang, Y., Li, C., & Zhou, H. (2015). CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications. Journal of Computer Science and Technology, 30(1), 3–19.
Li, C., Song, S., Dai, H., Sidelnik, A., Hari, S., & Zhou, H. (2015). Locality-Driven Dynamic GPU Cache Bypassing. ICS '15: Proceedings of the 29th ACM on International Conference on Supercomputing, 61–77.
Xiang, P., Yang, Y., Mantor, M., Rubin, N., & Zhou, H. (2015). Revisiting ILP Designs for Throughput-Oriented GPGPU Architecture. Proceedings of the 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 121–130.
Gupta, S., & Zhou, H. (2015). Spatial Locality-Aware Cache Partitioning for Effective Cache Sharing. 2015 44th International Conference on Parallel Processing (ICPP), pp. 150–159.
Yang, Y., Xiang, P., Mantor, M., Rubin, N., Hsu, L., Dong, Q., & Zhou, H. (2014). A Case for a Flexible Scalar Unit in SIMT Architecture. Proceedings of 2014 IEEE 28th International Parallel and Distributed Processing Symposium. Presented in Phoenix, AZ. ISBN 978-1-4799-3799-8.
Yang, Y., & Zhou, H. (2014). A Highly Efficient FFT Using Shared-Memory Multiplexing. In Numerical Computations with GPUs (pp. 363–377).
Yang, Y., & Zhou, H. (2014). CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications. ACM SIGPLAN Notices, 49(8), 93–105.
Li, C., Yang, Y., Dai, H. W., Yan, S. G., Mueller, F., & Zhou, H. Y. (2014). Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs. IEEE International Symposium on Performance Analysis of Systems and Software, 231–241.
Xiang, P., Yang, Y., & Zhou, H. (2014). Warp-level divergence in GPUs: Characterization, impact, and mitigation. International Symposium on High-Performance Computer Architecture, 284–295.
Yan, S., Li, C., Zhang, Y., & Zhou, H. (2014). yaSpM: Yet Another SpMV Framework on GPUs. Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 49(8), 107–118.
Gupta, S., Gao, H., & Zhou, H. (2013). Adaptive Cache Bypassing for Inclusive Last Level Caches. IEEE 27th International Parallel and Distributed Processing Symposium (IPDPS 2013), pp. 1243–1253.
Gupta, S., Xiang, P., & Zhou, H. (2013). Analyzing locality of memory references in GPU architectures. MSPC '13: Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 6.
Xiang, P., Yang, Y., Mantor, M., Rubin, N., Hsu, L., & Zhou, H. (2013). Exploiting Uniform Vector Instructions for GPGPU Performance, Energy Efficiency, and Opportunistic Reliability Enhancement. Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, 433–442.
Gupta, S., Xiang, P., Yang, Y., & Zhou, H. (2013). Locality principle revisited: A probability-based quantitative approach. Journal of Parallel and Distributed Computing, 73(7), 1011–1027.
Yang, Y., Xiang, P., Kong, J., Mantor, M., & Zhou, H. (2012). A Unified Optimizing Compiler Framework for Different GPGPU Architectures. ACM Transactions on Architecture and Code Optimization, 9(2).
Kong, J., Aciicmez, O., Seifert, J.-P., & Zhou, H. (2013). Architecting against Software Cache-Based Side-Channel Attacks. IEEE Transactions on Computers, 62(7), 1276–1288.
Yang, Y., Xiang, P., Mantor, M., & Zhou, H. Y. (2012). CPU-assisted GPGPU on fused CPU-GPU architectures. International Symposium on High-Performance Computer Architecture, 103–114.
Yang, Y., Xiang, P., Mantor, M., & Zhou, H. (2012). Fixing Performance Bugs: An Empirical Study of Open-Source GPGPU Programs. 2012 41st International Conference on Parallel Processing. Presented at the 2012 41st International Conference on Parallel Processing (ICPP).
Gupta, S., Xiang, P., Yang, Y., & Zhou, H. (2012). Locality principle revisited: A probability-based quantitative approach. 2012 IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS), 995–1009.
Yang, Y., Xiang, P., Mantor, M., Rubin, N., & Zhou, H. (2012). Shared Memory Multiplexing: A Novel Way to Improve GPGPU Throughput. Proceedings of the 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT). Presented at the 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), Minneapolis, MN, USA.
Yang, Y., & Zhou, H. (2013). The Implementation of a High Performance GPGPU Compiler. International Journal of Parallel Programming, 41(6), 768–781.
Dimitrov, M., & Zhou, H. (2011). Combining Local and Global History for High Performance Data Prefetching. Journal of Instruction-Level Parallelism (JILP), 13, 1–14.
Yang, Y., & Zhou, H. (2011). Developing a High Performance GPGPU Compiler using Cetus. Proceedings of the Cetus Users and Compiler Infrastructure Workshop, International Conference on Parallel Architectures and Compilation Techniques (PACT’11). Presented at the International Conference on Parallel Architectures and Compilation Techniques (PACT’11).
Bhansali, N., Panirwla, C., & Zhou, H. (2011). Exploring Correlation for Indirect Branch Prediction. 2nd JILP Workshop on Computer Architecture Competitions (JWAC-2): Championship Branch Prediction. Presented at the 2nd JILP Workshop on Computer Architecture Competitions (JWAC-2): Championship Branch Prediction, held with ISCA-38.
Dimitrov, M., & Zhou, H. (2011). Time-Ordered Event Traces: A New Debugging Primitive for Concurrency Bugs. 2011 IEEE International Parallel & Distributed Processing Symposium (IPDPS).
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010, June). A GPGPU Compiler for Memory Optimization and Parallelism Management. ACM SIGPLAN Notices, Vol. 45, pp. 86–97.
Kong, J., Dimitrov, M., Yang, Y., Liyanage, J., Cao, L., Staples, J., … Zhou, H. (2010). Accelerating MATLAB Image Processing Toolbox Functions on GPUs. Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, 75–85.
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010, May). An Optimizing Compiler for GPGPU Programs with Input-Data Sharing. ACM SIGPLAN Notices, Vol. 45, pp. 343–344.
Yang, Y., Xiang, P., Kong, J., & Zhou, H. (2010). An Optimizing Compiler for GPGPU Programs with Input-Data Sharing. PPoPP 2010: Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 343–344.
Kong, J., & Zhou, H. (2010). Improving privacy and lifetime of PCM-based main memory. 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).
Dimitrov, M., & Zhou, H. (2009). Anomaly-based bug prediction, isolation, and validation. Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09).
Kong, J., Aciicmez, O., Seifert, J.-P., & Zhou, H. (2009). Hardware-software integrated approaches to defend against software cache-based side channel attacks. 2009 IEEE 15th International Symposium on High Performance Computer Architecture (HPCA-15).
Dimitrov, M., Mantor, M., & Zhou, H. (2009). Understanding software approaches for GPGPU reliability. Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units (GPGPU-2).
Gao, H., Ma, Y., Dimitrov, M., & Zhou, H. (2008). Address-branch correlation: A novel locality for long-latency hard-to-predict branches. 2008 IEEE 14th International Symposium on High Performance Computer Architecture. Presented at the 2008 IEEE 14th International Symposium on High Performance Computer Architecture (HPCA).
Kong, J., Aciicmez, O., Seifert, J.-P., & Zhou, H. (2008). Deconstructing new cache designs for thwarting software cache-based side channel attacks. Proceedings of the 2nd ACM Workshop on Computer Security Architectures (CSAW '08).
Ma, Y., Gao, H., Dimitrov, M., & Zhou, H. (2007). Optimizing dual-core execution for power efficiency and transient-fault recovery. IEEE Transactions on Parallel and Distributed Systems, 18(8), 1080–1093.
Gao, H., & Zhou, H. (2007). PMPM: Prediction by combining multiple partial matches. Journal of Instruction-Level Parallelism, 9, 1–18.
Dimitrov, M., & Zhou, H. (2007). Unified Architectural Support for Soft-Error Protection or Software Bug Detection. 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007). Presented at the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
Ma, Y., & Zhou, H. (2006). Efficient Transient-Fault Tolerance for Multithreaded Processors Using Dual-Thread Execution. 2006 International Conference on Computer Design. Presented at the 2006 International Conference on Computer Design.
Kong, J., Zou, C. C., & Zhou, H. (2006). Improving software security via runtime instruction-level taint checking. Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability (ASID '06).
Dimitrov, M., & Zhou, H. (2006). Locality-based Information Redundancy for Processor Reliability. 2nd Workshop on Architectural Reliability (WAR-2) held in conjunction with 39th International Symposium on Microarchitecture (MICRO-39), 29–36.
Gao, H., & Zhou, H. (2006). PMPM: Prediction by Combining Multiple Partial Matches. 2nd Championship Branch Prediction (CBP-2) held with the 39th International Symposium on Microarchitecture (MICRO-39), 19–24.
Ma, Y., Gao, H., & Zhou, H. (2006). Using index functions to reduce conflict aliasing in branch prediction tables. IEEE Transactions on Computers, 55(8), 1057–1061.
Zhou, H. (2005). A case for fault tolerance and performance enhancement using chip multi-processors. IEEE Computer Architecture Letters, 4, 1–4.
Gao, H., & Zhou, H. (2005). Adaptive information processing: an effective way to improve perceptron branch predictors. Journal of Instruction-Level Parallelism, 7, 1–10.
Zhou, H., & Conte, T. M. (2005). Code size efficiency in global scheduling for ILP processors. Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures. Presented at the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.
Zhou, H., Flanagan, J., & Conte, T. M. (2005). Detecting global stride locality in value streams. 30th Annual International Symposium on Computer Architecture, 2003. Proceedings. Presented at the ISCA 2003: 30th International Symposium on Computer Architecture.
Zhou, H. (2005). Dual-core execution: building a highly scalable single-thread instruction window. 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05). Presented at the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
Zhou, H., & Conte, T. M. (2005). Enhancing memory-level parallelism via recovery-free value prediction. IEEE Transactions on Computers, 54, 897–912.
Gao, H., & Zhou, H. (2004). Adaptive Information Processing: An Effective Way to Improve Perceptron Branch Predictors. 1st Championship Branch Prediction (CBP-1) held with the 37th International Symposium on Microarchitecture (MICRO-37).
Zhou, H., Toburen, M. C., Rotenberg, E., & Conte, T. M. (2003). Adaptive mode control: A static-power-efficient cache design. ACM Transactions on Embedded Computing Systems, 2(3), 347–372.
Zhou, H. (2003). Code size aware compilation for real-time applications [Technical Report]. Computer Science Department, University of Central Florida.
Zhou, H., & Conte, T. M. (2003). Enhancing Memory Level Parallelism via Recovery-Free Value Prediction. The 2003 International Conference on Supercomputing (ICS'03), 326–335.
Zhou, H., & Conte, T. M. (2003). Performance modeling of memory latency hiding techniques [Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
Zhou, H., Jennings, M. D., & Conte, T. M. (2003). Tree Traversal Scheduling: A Global Instruction Scheduling Technique for VLIW/EPIC Processors. In Languages and Compilers for Parallel Computing (Vol. 2624, pp. 223–238).
Zhou, H., Toburen, M. C., Rotenberg, E., & Conte, T. M. (2001). Adaptive mode control: A static-power-efficient cache design. 2001 International Conference on Parallel Architectures and Compilation Techniques: Proceedings: 8-12 September, 2001, Barcelona, Catalunya, Spain, 61–70.
Zhou, H., & Conte, T. M. (2002). Using Performance Bounds to Guide Pre-scheduling Code Optimizations [Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
Jennings, M. D., Zhou, H., & Conte, T. M. (2001). A Treegion-based Unified Approach to Speculation and Predication in Global Instruction Scheduling [Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
Zhou, H., Fu, C., Rotenberg, E., & Conte, T. (2001). A study of value speculative execution and mispeculation recovery in superscalar microprocessors [Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
Zhou, H., Toburen, M., Rotenberg, E., & Conte, T. (2000). Adaptive Mode Control: A Low-Leakage Power-Efficient Cache Design [Technical Report]. Raleigh, NC: Department of Electrical and Computer Engineering, North Carolina State University.
Kassim, A. A., Zhou, H., & Ranganath, S. (2000). Automatic IC orientation checks. Machine Vision and Applications, 12(3), 107–112.
Zhou, H., Kassim, A. A., & Ranganath, S. (1998). A fast algorithm for detecting die extrusion defects in IC packages. Machine Vision and Applications, 11(1), 37–41.
Zhou, H. Y., Qu, L. S., & Li, A. H. (1996). Test sequencing and diagnosis in electronic system with decision table. Microelectronics and Reliability, 36(9), 1167–1175.