Chao Li

Zhong, Y., Li, C., Zhou, H., & Wang, G. (2018). Developing Noise-Resistant Three-Dimensional Single Particle Tracking Using Deep Neural Networks. Analytical Chemistry, 90(18), 10748–10757. https://doi.org/10.1021/acs.analchem.8b01334

Dai, H., Li, C., Zhou, H., Gupta, S., Kartsaklis, C., & Mantor, M. (2016). A Model-Driven Approach to Warp/Thread-Block Level GPU Cache Bypassing. 2016 ACM/EDAC/IEEE Design Automation Conference (DAC). https://doi.org/10.1145/2897937.2897966

Li, C., Yang, Y., Feng, M., Chakradhar, S., & Zhou, H. (2016). Optimizing memory efficiency for deep convolutional neural networks on GPUs. SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 633–644. https://doi.org/10.1109/sc.2016.53

Li, C., Yang, Y., Lin, Z., & Zhou, H. Y. (2015). Automatic data placement into GPU on-chip memory resources. 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 23–33. https://doi.org/10.1109/cgo.2015.7054184

Yang, Y., Li, C., & Zhou, H. (2015). CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications. Journal of Computer Science and Technology, 30(1), 3–19. https://doi.org/10.1007/s11390-015-1500-y

Li, C., Yang, Y., Dai, H. W., Yan, S. G., Mueller, F., & Zhou, H. Y. (2014). Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs. IEEE International Symposium on Performance Analysis of Systems and, 231–241.

Yan, S., Li, C., Zhang, Y., & Zhou, H. (2014). yaSpM: Yet Another SpMV Framework on GPUs. Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 49(8), 107–118. https://doi.org/10.1145/2692916.2555255