Michela Becchi

Zarch, M. E., & Becchi, M. (2023). A Code Transformation to Improve the Efficiency of OpenCL Code on FPGA through Pipes. Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023, pp. 101–111. https://doi.org/10.1145/3587135.3592210

Ravi, J., Byna, S., Koziol, Q., Tang, H., & Becchi, M. (2023). Evaluating Asynchronous Parallel I/O on HPC Systems. 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS, pp. 211–221. https://doi.org/10.1109/IPDPS54959.2023.00030

Shah, M., Yu, X., Di, S., Lykov, D., Alexeev, Y., Becchi, M., & Cappello, F. (2023). GPU-Accelerated Error-Bounded Compression Framework for Quantum Circuit Simulations. 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS, pp. 757–767. https://doi.org/10.1109/IPDPS54959.2023.00081

Neff, R., Minutoli, M., Tumeo, A., & Becchi, M. (2023). High-Level Synthesis of Irregular Applications: A Case Study on Influence Maximization. Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023, pp. 12–22. https://doi.org/10.1145/3587135.3592196

Shah, M., Yu, X., Di, S., Becchi, M., & Cappello, F. (2023). Lightweight Huffman Coding for Efficient GPU Compression. Proceedings of the 37th International Conference on Supercomputing, ACM ICS 2023, pp. 99–110. https://doi.org/10.1145/3577193.3593736

Ravi, J., Byna, S., & Becchi, M. (2023). Runway: In-transit Data Compression on Heterogeneous HPC Systems. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing, CCGRID, pp. 229–239. https://doi.org/10.1109/CCGRID57682.2023.00030

Ravi, J., Byna, S., & Becchi, M. (2023). Runway: In-transit Data Compression on Heterogeneous HPC Systems. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops, CCGRIDW, pp. 340–342. https://doi.org/10.1109/CCGridW59191.2023.00078

Nguyen, T., & Becchi, M. (2022). A GPU-accelerated Data Transformation Framework Rooted in Pushdown Transducers. 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics, HIPC, pp. 215–225. https://doi.org/10.1109/HiPC56025.2022.00038

Shah, M., Neff, R., Wu, H., Minutoli, M., Tumeo, A., & Becchi, M. (2022). Accelerating Random Forest Classification on GPU and FPGA. 51st International Conference on Parallel Processing, ICPP 2022. https://doi.org/10.1145/3545008.3545067

Zarch, M. E., Neff, R., & Becchi, M. (2021). Exploring Thread Coarsening on FPGA. 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HIPC 2021), pp. 436–441. https://doi.org/10.1109/HiPC53243.2021.00062

Ravi, J., Nguyen, T., Zhou, H., & Becchi, M. (2021). PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint. 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HIPC 2021), pp. 442–447. https://doi.org/10.1109/HiPC53243.2021.00063

Gu, R., Beata, P., & Becchi, M. (2020). A Loop-aware Autotuner for High-Precision Floating-point Applications. 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 285–295. https://doi.org/10.1109/ISPASS48437.2020.00048

Wu, H., & Becchi, M. (2020). Evaluating Thread Coarsening and Low-cost Synchronization on Intel Xeon Phi. 2020 IEEE 34th International Parallel and Distributed Processing Symposium IPDPS 2020, pp. 1018–1029. https://doi.org/10.1109/IPDPS47924.2020.00108

Yu, X., Wei, F., Ou, X., Becchi, M., Bicer, T., & Yao, D. (2020). GPU-Based Static Data-Flow Analysis for Fast and Scalable Android App Vetting. 2020 IEEE 34th International Parallel and Distributed Processing Symposium IPDPS 2020, pp. 274–284. https://doi.org/10.1109/IPDPS47924.2020.00037

Gu, R., & Becchi, M. (2020). GPU-FPtuner: Mixed-precision Auto-tuning for Floating-point Applications on GPU. 2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HIPC 2020), pp. 294–304. https://doi.org/10.1109/HiPC50609.2020.00043

Nourian, M., Zarch, M. E., & Becchi, M. (2020). Optimizing Complex OpenCL Code for FPGA: A Case Study on Finite Automata Traversal. 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS), pp. 518–527. https://doi.org/10.1109/ICPADS51040.2020.00073

Gu, R., & Becchi, M. (2019). A Comparative Study of Parallel Programming Frameworks for Distributed GPU Applications. CF '19 - Proceedings of the 16th ACM International Conference on Computing Frontiers, pp. 268–273. https://doi.org/10.1145/3310273.3323071

Palumbo, F., & Becchi, M. (2019, March). Editorial: Special Issue on Computing Frontiers. Journal of Signal Processing Systems for Signal Image and Video Technology, Vol. 91, pp. 273–273. https://doi.org/10.1007/s11265-019-1439-2

Roy, I., Srivastava, A., Grimm, M., Nourian, M., Becchi, M., & Aluru, S. (2019). Evaluating High Performance Pattern Matching on the Automata Processor. IEEE Transactions on Computers, 68(8), 1201–1212. https://doi.org/10.1109/TC.2019.2901466

Nourian, M., Wu, H., & Becchi, M. (2018). A Compiler Framework for Fixed-topology Non-deterministic Finite Automata on SIMD Platforms. 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS 2018), pp. 507–516. https://doi.org/10.1109/ICPADS.2018.00073

Wu, H., Ravi, J., & Becchi, M. (2018). Compiling SIMT Programs on Multi- and Many-core Processors with Wide Vector Units: A Case Study with CUDA. 2018 IEEE 25th International Conference on High Performance Computing (HIPC), pp. 123–132. https://doi.org/10.1109/HiPC.2018.00022

Procter, A., Harrison, W. L., Graves, I., Becchi, M., & Allwein, G. (2017). A Principled Approach to Secure Multi-core Processor Design with ReWire. ACM Transactions on Embedded Computing Systems, 16(2), 1–25. https://doi.org/10.1145/2967497

Wu, H., & Becchi, M. (2017). An Analytical Study of Recursive Tree Traversal Patterns on Multi- and Many-core Platforms. 2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), pp. 586–595. https://doi.org/10.1109/ICPADS.2017.00082

Surineni, S., Gu, R. D., Nguyen, H., & Becchi, M. (2017). Understanding the performance-accuracy tradeoffs of floating-point arithmetic on GPUs. Proceedings of the 2017 IEEE International Symposium on Workload Characterization (IISWC), pp. 207–218. https://doi.org/10.1109/iiswc.2017.8167778

Chen, X., Jones, B., Becchi, M., & Wolf, T. (2016). Picking Pesky Parameters: Optimizing Regular Expression Matching in Practice. IEEE Transactions on Parallel and Distributed Systems, 27(5), 1430–1442. https://doi.org/10.1109/tpds.2015.2453986

Graves, I., Procter, A., Harrison, W. L., Becchi, M., & Allwein, G. (2015). Hardware Synthesis from Functional Embedded Domain-Specific Languages: A Case Study in Regular Expression Compilation. In Lecture Notes in Computer Science (pp. 41–52). https://doi.org/10.1007/978-3-319-16214-0_4

Truong, H., Li, D., Sajjapongse, K., Conant, G., & Becchi, M. (2014). Large-Scale Pairwise Alignments on GPU Clusters: Exploring the Implementation Space. Journal of Signal Processing Systems, 77(1-2), 131–149. https://doi.org/10.1007/s11265-014-0883-2

Yu, X., Lin, B., & Becchi, M. (2014). Revisiting State Blow-Up: Automatically Building Augmented-FA While Preserving Functional Equivalence. IEEE Journal on Selected Areas in Communications, 32(10), 1822–1833. https://doi.org/10.1109/jsac.2014.2358840

Becchi, M., & Crowley, P. (2013). A-DFA. ACM Transactions on Architecture and Code Optimization, 10(1), 1–26. https://doi.org/10.1145/2445572.2445576

Ellison, M. J., Conant, G. C., Cockrum, R. R., Austin, K. J., Truong, H., Becchi, M., … Cammack, K. M. (2013). Diet Alters Both the Structure and Taxonomy of the Ovine Gut Microbial Ecosystem. DNA Research, 21(2), 115–125. https://doi.org/10.1093/dnares/dst044

Poostchi, M., Palaniappan, K., Bunyak, F., Becchi, M., & Seetharaman, G. (2013). Efficient GPU Implementation of the Integral Histogram. In Computer Vision - ACCV 2012 Workshops (pp. 266–278). https://doi.org/10.1007/978-3-642-37410-4_23

Ravi, V. T., Becchi, M., Jiang, W., Agrawal, G., & Chakradhar, S. (2013). Scheduling concurrent applications on a cluster of CPU–GPU nodes. Future Generation Computer Systems, 29(8), 2262–2271. https://doi.org/10.1016/j.future.2013.06.002

Majumdar, A., Cadambi, S., Becchi, M., Chakradhar, S. T., & Graf, H. P. (2012). A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification. ACM Transactions on Architecture and Code Optimization, 9(1), 1–30. https://doi.org/10.1145/2133382.2133388

Pang, B., Zhao, N., Becchi, M., Korkin, D., & Shyu, C.-R. (2012). Accelerating large-scale protein structure alignments with graphics processing units. BMC Research Notes, 5(1), 116. https://doi.org/10.1186/1756-0500-5-116

Becchi, M., & Crowley, P. (2008). Dynamic Thread Assignment on Heterogeneous Multiprocessor Architectures. The Journal of Instruction-Level Parallelism (JILP), 10.