2018 journal article
Developing Noise-Resistant Three-Dimensional Single Particle Tracking Using Deep Neural Networks
Analytical Chemistry, 90(18), 10748–10757.
2016 conference paper
A model-driven approach to warp/thread-block level GPU cache bypassing
2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
2016 conference paper
Optimizing memory efficiency for deep convolutional neural networks on GPUs
SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 633–644.
2015 conference paper
Automatic data placement into GPU on-chip memory resources
2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 23–33.
2015 journal article
CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications
Journal of Computer Science and Technology, 30(1), 3–19.
2014 conference paper
Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 231–241.
2014 conference paper
yaSpMV: Yet Another SpMV Framework on GPUs
Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 49(8), 107–118.
Event: 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming at Orlando, FL