Works (44)

Updated: November 20th, 2023 08:02

2022 article

LITE: A Low-Cost Practical Inter-Operable GPU TEE

PROCEEDINGS OF THE 36TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2022.

By: A. Yudha*, J. Meyer*, S. Yuan n, H. Zhou n & Y. Solihin*

author keywords: GPU TEE; software encryption; memory encryption; GPU enclave
TL;DR: This paper proposes a flexible GPU memory encryption design called LITE that relies on software memory encryption aided by small architecture support and shows that GPU applications can be adapted to the use of LITE encryption APIs without major changes. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: November 13, 2023

2021 article

Seeds of SEED: New Security Challenges for Persistent Memory

2021 INTERNATIONAL SYMPOSIUM ON SECURE AND PRIVATE EXECUTION ENVIRONMENT DESIGN (SEED 2021), pp. 83–88.

By: N. Ul Mustafa*, Y. Xu n, X. Shen n & Y. Solihin*

author keywords: Persistent memory objects; Security attacks; PMO vulnerability
TL;DR: Security implications of using the PMO, highlighting sample PMO-based attacks and potential strategies to defend against them, and threat vulnerabilities that are either new or increased in intensity under PMO programming model are discussed. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Sources: Web Of Science, ORCID
Added: June 20, 2022

2020 article

Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects

2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020), pp. 680–692.

By: Y. Xu n, C. Ye*, Y. Solihin* & X. Shen n

author keywords: Persistent Memory Objects; Memory Protection Keys; Intra-process Isolation
TL;DR: This paper presents two novel architecture supports, which provide 11–52× higher efficiency while offering the first known domain-based protection for PMOs. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: March 8, 2021

2019 journal article

Compiler-support for Critical Data Persistence in NVM

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 16(4).

author keywords: Compiler-support; NVM; data persistence; valid recovery
TL;DR: This article presents a compiler-support that automatically inserts complex instructions into kernels to achieve NVM data-persistence based on a simple programmer directive and shows that the proposed compiler-support outperforms the most recent checkpointing techniques while its performance overheads are insignificant. (via Semantic Scholar)
Source: Web Of Science
Added: January 13, 2020

2019 journal article

Efficient Checkpointing with Recompute Scheme for Non-volatile Main Memory

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 16(2).

By: M. Alshboul n, H. Elnawawy n, R. Elkhouly*, K. Kimura*, J. Tuck n & Y. Solihin*

author keywords: Memory systems; emerging memory technologies; computer architecture
TL;DR: A novel recompute-based failure safety approach that removes the need to keep checkpoints or logs, thus reducing execution time overheads and improving NVMM write endurance at the expense of more complex recovery. (via Semantic Scholar)
Source: Web Of Science
Added: August 5, 2019

2019 conference paper

Exploring Memory Persistency Models for GPUs

28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 310–322.

By: Z. Lin n, M. Alshboul n, Y. Solihin* & H. Zhou n

Event: International Conference on Parallel Architectures and Compilation Techniques at Seattle, WA on September 21-25, 2019

TL;DR: This paper adapts, re-architects, and optimizes CPU persistency models for GPUs, designs a pragma-based compiler scheme for expressing persistency models for GPUs, and identifies that the thread hierarchy in GPUs offers intuitive scopes to form epochs and durable transactions. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: August 10, 2020

2018 article

Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique

2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), pp. 439–451.

By: M. Alshboul n, J. Tuck n & Y. Solihin n

author keywords: Emerging Memory Technology; Memory Systems; Multi-core and Parallel Architectures
TL;DR: This work proposes Lazy Persistency (LP), a software persistency technique that allows caches to slowly send dirty blocks to the NVMM through natural evictions, and reduces the execution time and write amplification overheads from 9% and 21% to only 1% and 3%, respectively. (via Semantic Scholar)
Source: Web Of Science
Added: March 4, 2019

2018 article

Scheduling Page Table Walks for Irregular GPU Applications

2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), pp. 180–192.

By: S. Shin n, G. Cox*, M. Oskin*, G. Loh*, Y. Solihin n, A. Bhattacharjee*, A. Basu*

author keywords: Computer architecture; GPU; Virtual address
TL;DR: This work discovers that the order of servicing GPU's address translation requests plays a key role in determining the amount of translation overhead experienced by an application, and shows that better forward progress is achieved by prioritizing translation requests from the instructions that require less work to service their address translation needs. (via Semantic Scholar)
Source: Web Of Science
Added: March 4, 2019

2017 conference paper

Clone morphing: Creating new workload behavior from existing applications

IEEE international symposium on performance analysis of systems and, 97–107.

By: Y. Wang n, A. Awad n & Y. Solihin n

TL;DR: Clone morphing is proposed, a systematic method for producing new synthetic workloads (morphs) with performance behavior that does not currently exist, and it is validated that morphs can be used for projecting future workloads and for generating new behavior that fills up the behavior map densely. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2017 article

Hiding the Long Latency of Persist Barriers Using Speculative Execution

44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017), pp. 175–186.

By: S. Shin n, J. Tuck n & Y. Solihin n

author keywords: Non-Volatile Main Memory; Speculative Persistence; Failure Safety
TL;DR: This work describes how a new set of persistence instructions work and how they can be used to implement write-ahead logging based transactions and proposes a speculative persistence architecture that reduces the execution time overheads to only 3.6%. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2017 article

ObfusMem: A Low-Overhead Access Obfuscation for Trusted Memories

44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017), pp. 107–119.

By: A. Awad*, Y. Wang n, D. Shands* & Y. Solihin n

author keywords: Access Pattern Obfuscation; Hardware Security; ORAM; Emerging Memory Technologies
TL;DR: This work proposes a new approach to access pattern obfuscation, called ObfusMem, which adds the memory to the trusted computing base and incorporates cryptographic engines within the memory, and encrypts commands and addresses on the memory bus, hence the access pattern is cryptographically obfuscated from external observers. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2017 journal article

Significant and sustaining elevation of blood oxygen induced by Chinese cupping therapy as assessed by near-infrared spectroscopy

BIOMEDICAL OPTICS EXPRESS, 8(1), 223–229.

TL;DR: Promotion indicates potential positive therapeutic effect of cupping therapy in hemodynamics for facilitating muscular functions and a prominent drop in [Hb] and a significant elevation in [O2] in the tissue surrounding the cupping site were observed during both cupping and post-treatment, manifesting the enhancement of oxygen uptake. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2016 article

Dense Footprint Cache: Capacity-Efficient Die-Stacked DRAM Last Level Cache

MEMSYS 2016: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS, pp. 191–203.

By: S. Shin n, S. Kim* & Y. Solihin n

author keywords: Die-stacked DRAM; last-level cache; replacement policy
TL;DR: Dense Footprint Cache is proposed, a new design of Last Level Cache that uses a large Mblock and relies on useful block prediction in order to reduce memory bandwidth consumption and increase capacity and power efficiency. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2016 article

Silent Shredder: Zero-Cost Shredding for Secure Non-Volatile Main Memory Controllers

ACM SIGPLAN NOTICES, Vol. 51 (April 2016), pp. 263–276.

By: A. Awad n, P. Manadhata*, S. Haber*, Y. Solihin n & W. Horne*

author keywords: Encryption; Hardware Security; Phase-Change Memory; Data Protection
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2016 conference paper

Silent shredder: Zero-cost shredding for secure non-volatile main memory controllers

Operating Systems Review, 50(2), 263–276.

By: A. Awad n, P. Manadhata*, S. Haber*, Y. Solihin n & W. Horne*

Source: NC State University Libraries
Added: August 6, 2018

2015 conference paper

Emulating cache organizations on real hardware using performance cloning

IEEE international symposium on performance analysis of systems and, 298–307.

By: Y. Wang n & Y. Solihin n

TL;DR: This paper proposes infusing environment-specific information into the clone, which enables the simulation of hypothetical cache configurations directly on a machine with a different cache configuration, and presents a case study of how page mapping affects cache performance. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2015 article

MeToo: Stochastic Modeling of Memory Traffic Timing Behavior

2015 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION (PACT), pp. 457–467.

By: Y. Wang n, Y. Solihin* & G. Balakrishnan n

author keywords: workload cloning; memory subsystem; memory controller; memory bus; DRAM
TL;DR: MeToo is a framework for generating synthetic memory traffic for memory subsystem design exploration that uses a small set of statistics that summarizes the performance behavior of the original applications, and generates synthetic traces or executables stochastically, allowing applications to remain proprietary. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2015 conference paper

Non-volatile memory host controller interface performance analysis in high-performance I/O systems

IEEE international symposium on performance analysis of systems and, 145–154.

By: A. Awad n, B. Kettering* & Y. Solihin n

TL;DR: The system performance bottlenecks and overhead of using the standard state-of-the-art Non-volatile Memory Express (NVMe), or Non-Volatile Memory Host Controller Interface (NVMHCI) Specification as representative for NVM host controller interfaces are investigated. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2014 chapter

Collaborative Memories in Clusters: Opportunities and Challenges

In Transactions on Computational Science XXII (pp. 17–41).

By: A. Samih*, R. Wang, C. Maciocco*, M. Kharbutli* & Y. Solihin n

TL;DR: Highly-integrated distributed systems such as Intel Micro Server and SeaMicro Server are increasingly becoming a popular server architecture, and it is crucial to understand the existing performance bottlenecks, overheads, and potential optimizations. (via Semantic Scholar)
Source: Crossref
Added: February 24, 2020

2014 conference paper

STM: Cloning the spatial and temporal memory access behavior

International symposium on high-performance computer, 237–247.

By: A. Awad n & Y. Solihin n

TL;DR: A new memory access behavior cloning technique that captures both temporal and spatial locality is proposed, abbreviated as Spatio-Temporal Memory (STM) cloning, and a new profiling method and statistics that capture stride patterns and transition probabilities are proposed. (via Semantic Scholar)
UN Sustainable Development Goal Categories
11. Sustainable Cities and Communities (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2012 article

Modeling and Analyzing Key Performance Factors of Shared Memory MapReduce

2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), pp. 1306–1317.

By: D. Tiwari n & Y. Solihin n

TL;DR: An analytical model is built to capture key performance factors of shared memory MapReduce and investigates important performance trends and behavior, and proposes an application classification framework that can be used to reason about performance bottlenecks for a given application. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 conference paper

WEST: Cloning data cache behavior using stochastic traces

International symposium on high-performance computer, 387–398.

By: G. Balakrishnan n & Y. Solihin n

TL;DR: This work proposes Workload Emulation using Stochastic Traces (WEST), a highly accurate black box cloning technique for replicating data cache behavior of arbitrary programs, and generates a clone stochastically that produces statistics identical to the proprietary workload. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2011 conference paper

Architectural framework for supporting operating system survivability

International symposium on high-performance computer, 456–465.

By: X. Jiang n & Y. Solihin n

TL;DR: Through simple but carefully-designed architecture support, this paper provides OS kernel survivability with low performance overheads; when tested with real-world security attacks, the survivability mechanism automatically prevents the security faults from corrupting the kernel state or affecting other processes, recovers the kernel state, and resumes execution. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2011 journal article

CHOP: Integrating DRAM Caches for CMP Server Platforms

IEEE MICRO, 31(1), 99–108.

By: X. Jiang*, N. Madan*, L. Zhao*, M. Upton*, R. Iyer*, S. Makineni*, D. Newell, Y. Solihin n, R. Balasubramonian*

TL;DR: CHOP (Caching Hot Pages) addresses this trade-off between tag space overhead and memory bandwidth consumption through three filter-based DRAM-caching techniques. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2011 journal article

Evaluating Placement Policies for Managing Capacity Sharing in CMP Architectures with Private Caches

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 8(3).

By: A. Samih n, Y. Solihin n & A. Krishna

author keywords: Design; Performance; Measurement; Memory systems; chip multiprocessor; private caches; capacity sharing; placement policies; stack distance profiling; limit studies; QoS
TL;DR: This article designs a simple, predictor-based, scheme called Adaptive Placement Policy (APP) that learns from past cache behavior to make a better decision on whether to place a newly fetched block in the local or remote cache and finds that APP's capacity sharing mechanism increases aggregate performance by 29% on average. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2011 conference paper

HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor

International symposium on high-performance computer, 99–110.

By: S. Lee n, D. Tiwari n, S. Yan n & J. Tuck n

TL;DR: A hardware-accelerated queue, or HAQu, is proposed that adds hardware to a CMP that accelerates operations on software queues, and ensures that the full state of the queue is stored in the application's address space, thereby ensuring virtualization. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2010 chapter

An Analysis of Secure Processor Architectures

In Transactions on Computational Science VII (pp. 101–121).

By: S. Chhabra n, Y. Solihin n, R. Lal* & M. Hoekstra*

TL;DR: Three of the currently proposed secure uniprocessor designs are analyzed in terms of their security, complexity of hardware required and performance overheads: eXecute Only Memory (XOM), Counter mode encryption and Merkle tree based authentication, and Address Independent Seed Encryption and Bonsai Merkle Tree based authentication. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2010 conference paper

CHOP: Adaptive filter-based DRAM caching for CMP server platforms

International symposium on high-performance computer, 233–244.

By: X. Jiang n, N. Madan*, L. Zhao*, M. Upton*, R. Iyer*, S. Makineni*, D. Newell*, Y. Solihin n, R. Balasubramonian*

TL;DR: Detailed simulations with server workloads show that filter-based DRAM caching techniques achieve the following: on average over 30% performance improvement over previous solutions, several magnitudes lower area overhead in tag space required for cache-line based DRAM caches, and significantly lower memory bandwidth consumption as compared to page-granular DRAM caches. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2010 chapter

Green Secure Processors: Towards Power-Efficient Secure Processor Design

In Transactions on Computational Science X (pp. 329–351).

By: S. Chhabra n & Y. Solihin n

TL;DR: This is the first work to examine the power implications of providing hardware mechanisms for security in secure processor architectures and outlines the design of a novel hybrid cryptographic engine that can be used to minimize the power consumption for a secure processor. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2010 conference paper

MMT: Exploiting Fine Grained Parallelism in Dynamic Memory Management

International Parallel and Distributed Processing Symposium.

By: D. Tiwari n, J. Tuck n & Y. Solihin n

TL;DR: It is shown that an efficient MMT design can give significant performance improvement by extracting parallelism while being agnostic to the underlying memory management library algorithms and data structures, and how parallelism provided by MMT can be beneficial for high overhead memory management tasks. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2010 journal article

Quality of Service Shared Cache Management in Chip Multiprocessor Architecture

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 7(3).

By: F. Guo, Y. Solihin n, L. Zhao* & R. Iyer*

author keywords: Design; Performance; Cache; chip multi-processors; CMP; multicore architecture; quality of service; QoS; performance; resource stealing
TL;DR: A framework would be needed to manage the shared cache resource for fully providing QoS in a CMP, and compared to an unoptimized scheme, the throughput can be improved by up to 47%, making the throughput significantly closer to a non-QoS CMP. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2010 conference paper

Understanding how off-chip memory bandwidth partitioning in chip multiprocessors affects system performance

International symposium on high-performance computer, 57–68.

By: F. Liu n, X. Jiang n & Y. Solihin n

TL;DR: This paper proposes a simple yet powerful analytical model that makes it possible to answer several important questions about how off-chip bandwidth partitioning improves system performance and how bandwidth and cache partitioning interact with one another. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2010 journal article

Understanding the Behavior and Implications of Context Switch Misses

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 7(4).

By: F. Liu n & Y. Solihin n

author keywords: Algorithms; Design; Experimentation; Performance; Context switch misses; stack distance profiling; analytical model; prefetching
TL;DR: It is shown that under relatively heavy workloads in the system, the worst-case number of context switch misses an application suffers from tends to increase proportionally with cache sizes, to the extent that may completely negate the reduction in other types of cache misses. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 article

Architecture Support for Improving Bulk Memory Copying and Initialization Performance

18TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, pp. 169-+.

By: X. Jiang n, Y. Solihin n, L. Zhao* & R. Iyer*

author keywords: memory copying; memory initialization; cache affinity; cache neutral; early retirement
TL;DR: This paper proposes FastBCI, an architecture support that achieves the granularity efficiency of a bulk copying/initialization instruction, but without its pipeline and cache bottlenecks, and which on average achieves speedups of between 23% and 32%. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 book

Fundamentals of parallel computer architecture: multichip and multicore systems

[United States?]: Solihin Pub.

By: Y. Solihin

Source: NC State University Libraries
Added: August 6, 2018

2009 journal article

MemTracker: An Accelerator for Memory Debugging and Monitoring

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 6(2).

By: G. Venkataramani*, I. Doudalis*, Y. Solihin n & M. Prvulovic*

author keywords: Design; Performance; Reliability; Accelerator; memory access monitoring; debugging
TL;DR: MemTracker's rich set of states, events, and transitions can be used to implement different monitoring and debugging checkers with minimal performance overheads, even when frequent state updates are needed. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 conference paper

Memory Management Thread for Heap Allocation Intensive Applications

Workshop on Memory Performance: Dealing with Applications, Systems Architecture.

By: D. Tiwari n, S. Lee n, J. Tuck n & Y. Solihin n

TL;DR: This paper proposes a way for exploiting multicore parallelism in dynamic memory management for sequential applications, by spinning off memory allocation and deallocation functions to a separate thread that is referred to as memory management thread (MMT). (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2009 journal article

Prefetching with Helper Threads for Loosely Coupled Multiprocessor Systems

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 20(9), 1309–1324.

By: J. Lee*, C. Jung*, D. Lim* & Y. Solihin n

author keywords: Helper thread; prefetching; chip multiprocessors; processing-in-memory system
TL;DR: This paper presents a helper thread prefetching scheme that is designed to work on loosely coupled processors, such as in a standard chip multiprocessor (CMP) system or an intelligent memory system, and is based on a new synchronization mechanism between the application and helper threads. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2008 journal article

Counter-based cache replacement and bypassing algorithms

IEEE TRANSACTIONS ON COMPUTERS, 57(4), 433–447.

By: M. Kharbutli* & Y. Solihin n

author keywords: caches; counter-based algorithms; cache replacement algorithms; cache bypassing; cache misses
TL;DR: A new counter-based approach to deal with cache pollution, predicting lines that have become dead and replacing them early from the L2 cache and identifying never-reaccessed lines, which is augmented with an event counter that is incremented when an event of interest such as certain cache accesses occurs. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2008 book

Fundamentals of parallel computer architecture

[United States?]: Solihin Pub.

By: Y. Solihin

Source: NC State University Libraries
Added: August 6, 2018

2008 article

Making Secure Processors OS- and Performance-Friendly

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, Vol. 5.

By: S. Chhabra n, B. Rogers n, Y. Solihin n & M. Prvulovic*

author keywords: Security; Performance; Design; Secure processor architectures; memory encryption; memory integrity verification; virtualization
TL;DR: AISE+BMT reduces the overhead of prior memory encryption and integrity verification schemes from 12% to 2% on average for single-threaded benchmarks on uniprocessor systems, and from 15% to 4% for coscheduled benchmarks on multicore systems while eliminating critical system-level problems. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2006 article

HeapMon: A helper-thread approach to programmable, automatic, and low-overhead memory bug detection

IBM JOURNAL OF RESEARCH AND DEVELOPMENT, Vol. 50, pp. 261–275.

By: R. Shetty, M. Kharbutli n, Y. Solihin n & M. Prvulovic*

TL;DR: HeapMon is presented, a heap memory bug-detection scheme that has a very low performance overhead, is automatic, and is easy to deploy and relies on two new techniques to safely and significantly reduce bug checking frequency. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2005 journal article

Eliminating conflict misses using prime number-based cache indexing

IEEE TRANSACTIONS ON COMPUTERS, 54(5), 573–586.

By: M. Kharbutli n, Y. Solihin n & J. Lee*

author keywords: cache hashing; cache indexing; prime modulo; odd-multiplier displacement; conflict misses
TL;DR: An in-depth analysis of the pathological behavior of cache hashing functions is presented and two new hashing functions are proposed, prime modulo and odd-multiplier displacement, that are resistant to pathological behavior and yet are able to eliminate the worst-case conflict behavior in the L2 cache. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2003 journal article

Correlation prefetching with a user-level memory thread

IEEE Transactions on Parallel and Distributed Systems, 14(6), 563–580.

By: Y. Solihin*, J. Lee & J. Torrellas*

TL;DR: This paper proposes using a user-level memory thread (ULMT) for correlation prefetching, and shows that this approach has wide applicability, as it can effectively prefetch even for irregular applications, and works well in combination with a conventional processor-side sequential prefetcher. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018
