Works (67)

Updated: April 5th, 2024 03:29

2021 article

Design for 3D Stacked Circuits

2021 IEEE INTERNATIONAL ELECTRON DEVICES MEETING (IEDM).

By: P. Franzon n, W. Davis n, E. Rotenberg n, J. Stevens n, S. Lipa n, T. Nigussie n, H. Pan n, L. Baker n ...

TL;DR: 2.5D and 3D technologies can give rise to a node equivalent of scaling because of improved connectivity, but design issues that need to be addressed in pursuing such exploitations include thermal management, design for test and computer aided design. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: July 11, 2022

2020 journal article

Post-Silicon Microarchitecture

IEEE COMPUTER ARCHITECTURE LETTERS, 19(1), 26–29.

By: C. Kumar n, A. Chaudhary n, S. Bhawalkar n, U. Mathur n, S. Jain n, A. Vastrad n, E. Rotenberg n

author keywords: Microarchitecture; Payloads; Fabrics; Indexes; Prefetching; Registers; Synchronization; Adaptable architectures; microarchitecture; reconfigurable hardware
TL;DR: This work proposes coupling a reconfigurable fabric with the CPU, on the same chip, via a simple and flexible interface to allow post-silicon development of application-specific microarchitectures. (via Semantic Scholar)
Source: Web Of Science
Added: May 8, 2020

2020 article

Slipstream Processors Revisited: Exploiting Branch Sets

2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020), pp. 105–117.

By: V. Srinivasan n, R. Chowdhury* & E. Rotenberg n

author keywords: branch prediction; prefetching; hard-to-predict branch; delinquent load; pre-execution; helper threads; control independence
TL;DR: The objective of this paper is to design a new pre-execution microarchitecture that meets four criteria: (i) retains the simpler coordination of a leader-follower microarchitecture, (ii) is fully automated with just hardware, (iii) targets both branches and loads, and (iv) is effective. (via Semantic Scholar)
Source: Web Of Science
Added: March 8, 2021

2017 conference paper

A case for standard-cell based RAMs in highly-ported superscalar processor structures

Proceedings of the eighteenth international symposium on quality electronic design (ISQED), 131–137.

By: S. Ku n, E. Forbes*, R. Chowdhury* & E. Rotenberg*

TL;DR: This paper introduces a standard-cell memory compiler with three key features: (i) per-row clock gating, (ii) a new tri-state based mux standard cell, and (iii) a modular layout strategy, which is the centerpiece of the memory compiler. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2017 conference paper

H3 (heterogeneity in 3D): A logic-on-logic 3D-stacked heterogeneous multi-core processor

2017 IEEE International Conference on Computer Design (ICCD), 145–152.

By: V. Srinivasan n, R. Chowdhury*, E. Forbes*, R. Widialaksono*, Z. Zhang*, J. Schabel n, S. Ku*, S. Lipa n ...

Event: 2017 IEEE International Conference on Computer Design (ICCD) at Boston, MA on November 5-8, 2017

TL;DR: The H3 chip is presented, that uses 3D die stacking and novel microarchitecture to implement a heterogeneous multi-core processor (HMP) with low-latency fast thread migration capabilities and can reduce power consumption of benchmarks by up to 26%. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: Web Of Science, ORCID
Added: August 6, 2018

2016 conference paper

AnyCore-1: A comprehensively adaptive 4-way superscalar processor

2016 IEEE Hot Chips 28 Symposium (HCS).

By: R. Chowdhury n, A. Kannepalli n & E. Rotenberg n

TL;DR: This article consists only of a collection of slides from the author's conference presentation. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2016 conference paper

AnyCore: A synthesizable RTL model for exploring and fabricating adaptive superscalar cores

IEEE international symposium on performance analysis of systems and, 214–224.

By: R. Chowdhury n, A. Kannepalli n, S. Ku n & E. Rotenberg n

TL;DR: A register-transfer-level (RTL) design of a highly adaptive superscalar core, called AnyCore, which can be used to quantify logic overheads of an adaptive core with respect to fixed cores, synthesize and compare different adaptive cores, and fabricate adaptive superscalar cores. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2016 conference paper

Fast register consolidation and migration for heterogeneous multi-core processors

Proceedings of the 34th IEEE international conference on computer design (ICCD), 1–8.

By: E. Forbes n & E. Rotenberg n

TL;DR: This paper investigates the impact that thread migrations impose on single-ISA heterogeneous systems and suggests that a high-cost thread migration requires infrequent migrations, as the migration penalty must be amortized. (via Semantic Scholar)
UN Sustainable Development Goal Categories
10. Reduced Inequalities (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2016 conference paper

Physical design of a 3D-stacked heterogeneous multi-core processor

2016 IEEE International 3D Systems Integration Conference (3DIC). Presented at the 2016 IEEE International 3D Systems Integration Conference (3DIC), San Francisco, CA.

By: R. Widialaksono n, R. Basu Roy Chowdhury n, Z. Zhang n, J. Schabel n, S. Lipa n, E. Rotenberg n, W. Rhett Davis, P. Franzon n

Event: 2016 IEEE International 3D Systems Integration Conference (3DIC) at San Francisco, CA on November 8-11, 2016

TL;DR: This paper presents a 3D-SIC physical design methodology for a multi-core processor using commercial off-the-shelf tools and indicates an order of magnitude decrease in wirelengths for critical inter-core components in the 3D implementation compared to 2D implementations. (via Semantic Scholar)
Sources: Crossref, ORCID, NC State University Libraries
Added: March 24, 2019

2015 journal article

Control-Flow Decoupling: An Approach for Timely, Non-Speculative Branching

IEEE TRANSACTIONS ON COMPUTERS, 64(8), 2182–2203.

By: R. Sheikh*, J. Tuck n & E. Rotenberg n

author keywords: Microarchitecture; software/hardware codesign; branch prediction; predication; pre-execution; separable branches; isa extensions; instruction level parallelism
TL;DR: It is found that a third of mispredictions-per-1K-instructions (MPKI) come from what the authors call separable branches: branches with large control-dependent regions (not suitable for if-conversion), whose backward slices do not depend on their control-dependent instructions or have only a short dependence. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2014 conference paper

Co-simulation framework for streamlining microprocessor development on standard ASIC design flow

2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC), 400–405.

By: T. Nakabayashi*, T. Sugiyama*, T. Sasaki*, E. Rotenberg n & T. Kondo*

TL;DR: This paper presents a practical processor co-simulation framework for not only RTL simulation but also gate/transistor level simulation, and even chip evaluation with an LSI tester, and proposes a cache warming mechanism when resuming from a checkpoint. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2014 conference paper

Design-effort alloy: Boosting a highly tuned primary core with untuned alternate cores

2014 IEEE 32nd International Conference on Computer Design (ICCD). Presented at the 2014 32nd IEEE International Conference on Computer Design (ICCD).

By: E. Forbes n, N. Choudhary n, B. Dwiel n & E. Rotenberg n

Event: 2014 32nd IEEE International Conference on Computer Design (ICCD)

TL;DR: This paper proposes a new class of single-ISA heterogeneous multi-core processor, called design-effort alloy (DEA), which has more than a 2x frequency advantage with only a 1.3× increase in energy consumption compared to their corresponding LECs. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Crossref
Added: June 15, 2019

2013 conference paper

A Unified View of Non-monotonic Core Selection and Application Steering in Heterogeneous Chip Multiprocessors

Proceedings of the 22nd IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques (PACT-22), 133–144.

By: S. Navada, N. Choudhary, S. Wadhavkar & E. Rotenberg

Source: NC State University Libraries
Added: July 28, 2019

2013 conference paper

Design of controller for L2 cache mapped in Tezzaron stacked DRAM

2013 IEEE International 3D Systems Integration Conference (3DIC). Presented at the 2013 IEEE International 3D Systems Integration Conference (3DIC), San Francisco, CA.

By: N. Tshibangu n, P. Franzon n, E. Rotenberg n & W. Davis n

Event: 2013 IEEE International 3D Systems Integration Conference (3DIC) at San Francisco, CA on October 2-4, 2013

TL;DR: This paper investigates the implementation of such a cache controller using 3-layer 256 MB Tezzaron Octopus stacked DRAM, which provides a fast data access through burst-4 and burst-8 mode and has a low hit latency. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 conference paper

Hetero(2) 3D integration: A scheme for optimizing efficiency/cost of chip multiprocessors

Proceedings of the fourteenth international symposium on quality electronic design (ISQED 2013), 1–7.

By: S. Priyadarshi n, N. Choudhary n, B. Dwiel n, A. Upreti n, E. Rotenberg n, R. Davis n, P. Franzon n

Event: International Symposium on Quality Electronic Design (ISQED) at Santa Clara, CA on March 4-6, 2013

TL;DR: This work proposes exploiting two complementary forms of heterogeneity to profitably exploit an immature technology for Chip Multiprocessors (CMP): 3D integration facilitates a technology alloy and application and microarchitectural heterogeneity is exploited to compensate for lower efficiency of old-technology cores. (via Semantic Scholar)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2013 conference paper

Rationale for a 3D heterogeneous multi-core processor

2013 IEEE 31st International Conference on Computer Design (ICCD), 154–168.

By: E. Rotenberg n, B. Dwiel n, E. Forbes n, Z. Zhang n, R. Widialaksono n, R. Chowdhury n, N. Tshibangu n, S. Lipa n ...

Event: 2013 IEEE 31st International Conference on Computer Design (ICCD) at Asheville, NC on October 6-9, 2013

TL;DR: Single-ISA heterogeneous multi-core processors are comprised of multiple core types that are functionally equivalent but microarchitecturally diverse. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: Crossref, ORCID, NC State University Libraries
Added: March 24, 2019

2012 conference paper

A physical design study of FabScalar-generated superscalar cores

2012 IEEE/IFIP 20th International Conference on VLSI and System-on-Chip (VLSI-SoC). Presented at the 2012 IEEE/IFIP 20th International Conference on VLSI and System-on-Chip (VLSI-SoC).

By: N. Choudhary n, B. Dwiel n & E. Rotenberg n

Event: 2012 IEEE/IFIP 20th International Conference on VLSI and System-on-Chip (VLSI-SoC)

Source: Crossref
Added: June 15, 2019

2012 article

Control-Flow Decoupling

2012 IEEE/ACM 45TH INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-45), pp. 329–340.

By: R. Sheikh n, J. Tuck n & E. Rotenberg n

TL;DR: This work proposes control-flow decoupling (CFD) to eradicate mispredictions of separable branches, and considers whether CFD is a necessary catalyst for future complexity-effective large-window architectures to tolerate memory latency. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2012 journal article

FABSCALAR: AUTOMATING SUPERSCALAR CORE DESIGN

IEEE MICRO, 32(3), 48–59.

By: N. Choudhary n, S. Wadhavkar n, T. Shah n, H. Mayukh n, J. Gandhi n, B. Dwiel n, S. Navada n, H. Najaf-Abadi n, E. Rotenberg n

TL;DR: FabScalar aims to automate superscalar core design, opening up processor design to microarchitectural diversity and its many opportunities. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2012 conference paper

FPGA modeling of diverse superscalar processors

2012 IEEE International Symposium on Performance Analysis of Systems & Software. Presented at the 2012 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).

By: B. Dwiel n, N. Choudhary n & E. Rotenberg n

Event: 2012 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS)

TL;DR: FPGA-Sim is described, a configurable, automatically FPGA-synthesizable, and register-transfer-level (RTL) model of an out-of-order superscalar processor that enables FPGA modeling of diverse superscalars out-of-the-box. (via Semantic Scholar)
Source: Crossref
Added: June 15, 2019

2012 conference paper

Research for Transporting Alpha ISA and Adopting Multi-processor to FabScalar

Proceedings of the Symposium on Advanced Computing Systems and Infrastructures 2012 (SACSIS 2012), 374–381.

By: T. Nakabayashi, T. Sasaki, E. Rotenberg, K. Ohno & T. Kondo

Source: NC State University Libraries
Added: July 10, 2019

2011 journal article

FabScalar: Composing synthesizable RTL designs of arbitrary cores within a canonical superscalar template

ISCA 2011: Proceedings of the 38th Annual International Symposium on Computer Architecture, 11–22.

By: N. Choudhary n, S. Wadhavkar n, T. Shah*, H. Mayukh*, J. Gandhi*, B. Dwiel n, S. Navada n, H. Najaf-abadi*, E. Rotenberg n

TL;DR: From this idea, a toolset is developed, called FabScalar, for automatically composing the synthesizable register-transfer-level (RTL) designs of arbitrary cores within a canonical superscalar template, which defines canonical pipeline stages and interfaces among them. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2010 article

Criticality-driven Superscalar Design Space Exploration

PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, pp. 261–272.

By: S. Navada n, N. Choudhary n & E. Rotenberg n

author keywords: design space exploration; criticality model; bottleneck analysis; superscalar processors; simulated annealing
TL;DR: It has become increasingly difficult to perform design space exploration (DSE) of computer systems with a short turnaround time because of exploding design spaces, increasing design complexity and long-running workloads. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2010 conference paper

EXACT: Explicit Dynamic-Branch Prediction with Active Updates

Proceedings of the 7th ACM international conference on Computing frontiers - CF '10, 165–176.

By: M. Al-Otoom n, E. Forbes n & E. Rotenberg n

Event: the 7th ACM international conference

author keywords: branch prediction; superscalar processors; microarchitecture
TL;DR: It is proposed that stores to the memory addresses on which a dynamic branch depends, directly update its prediction in the predictor, and this novel "active update" concept avoids mispredictions that are otherwise incurred by conventional passive training. (via Semantic Scholar)
Source: Crossref
Added: July 28, 2019

2009 conference paper

Architectural Contesting

2009 IEEE 15th International Symposium on High Performance Computer Architecture. Presented at the 2009 IEEE 15th International Symposium on High Performance Computer Architecture (HPCA).

By: H. Najaf-abadi n & E. Rotenberg n

Event: 2009 IEEE 15th International Symposium on High Performance Computer Architecture (HPCA)

TL;DR: Results are presented showing that workload behavior tends to vary considerably at granularities of less than a thousand instructions, and if it were possible to adjust the microarchitecture to suit the workload behavior at such rates, significant single-thread performance enhancement would be achievable. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2009 article

Core-Selectability in Chip Multiprocessors

18TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, pp. 113–122.

By: H. Najaf-abadi n, N. Choudhary n & E. Rotenberg n

author keywords: Chip Multiprocessor; Heterogeneity; Microarchitecture
TL;DR: This paper proposes core-selectability – incorporating differently-designed cores that can be toggled into active employment that enables differently customized ILP-extracting structures to be at hand in the system while not dramatically adding to the interconnect complexity. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 article

The Importance of Accurate Task Arrival Characterization in the Design of Processing Cores

PROCEEDINGS OF THE 2009 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, pp. 75–85.

By: H. Najaf-abadi n & E. Rotenberg n

TL;DR: A stochastic characterization is formulated that defines regularity in the task arrival pattern that is used as the basis for a quantitative evaluation of the importance of accurately accounting for the task departure behavior in the design of the processing cores of a Chip Multi-processor (CMP). (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2008 conference paper

Configurational Workload Characterization

ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and Software. Presented at the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

By: H. Najaf-abadi n & E. Rotenberg n

Event: IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

author keywords: single-thread performance; customization; heterogeneous CMP; design exploration; workload characterization
TL;DR: It is shown that the design parameters of the customized processor configurations, what the authors refer to as the configurational characteristics, can yield a more accurate indication of the best way to partition the workload space for the cores of a heterogeneous system to be customized to. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2008 conference paper

Coverage of a microarchitecture-level fault check regimen in a superscalar processor

2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN). Presented at the 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

By: V. Reddy* & E. Rotenberg n

Event: 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN)

TL;DR: It is shown for the first time that the regimen-based approach provides substantial coverage of an entire superscalar processor. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2007 conference paper

Inherent Time Redundancy (ITR): Using Program Repetition for Low-Overhead Fault Tolerance

37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07). Presented at the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).

By: V. Reddy n & E. Rotenberg n

Event: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07)

TL;DR: This paper uses ITR to detect transient faults in the fetch and decode units of a processor pipeline, avoiding costly approaches like structural duplication or explicit time redundant execution. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2007 conference paper

Transparent control independence (TCI)

Proceedings of the 34th annual international symposium on Computer architecture - ISCA '07. Presented at the 34th annual international symposium.

By: A. Al-Zawawi n, V. Reddy n, E. Rotenberg n & H. Akkary*

Event: the 34th annual international symposium

TL;DR: Transparent control independence (TCI) yields a highly streamlined pipeline that quickly recycles resources based on conventional speculation, enabling a large window with small cycle-critical resources, and prevents many mispredictions from disrupting this large window. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Source: Crossref
Added: June 21, 2019

2007 journal article

ZettaRAM: A power-scalable DRAM alternative through charge-voltage decoupling

IEEE TRANSACTIONS ON COMPUTERS, 56(2), 147–160.

By: R. Venkatesan*, A. Al-Zawawi n, K. Sivasubramanian* & E. Rotenberg n

author keywords: DRAM; dynamic voltage scaling; low-power memory; molecular electronics; molecular memory; memory technology
TL;DR: This work proposes dynamically modulating the padding based on criticality of memory requests, further extending ZettaRAM's energy advantage with negligible system slowdown and extracts energy savings from six otherwise uncompetitive molecules. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2006 conference paper

Assertion-Based Microarchitecture Design for Improved Fault Tolerance

2006 International Conference on Computer Design. Presented at the 2006 International Conference on Computer Design.

By: V. Reddy n, A. Al-Zawawi n & E. Rotenberg n

Event: 2006 International Conference on Computer Design

TL;DR: This work proposes a novel class of targeted fault checks that verify the functioning of the microarchitecture itself, as opposed to the broader challenge of verifying overall architectural correctness of a running program. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2006 journal article

FAST: Frequency-Aware Static Timing Analysis

ACM Transactions on Programming Languages and Systems, 5(1), 200–224.

By: K. Seth, A. Anantaraman, F. Mueller & E. Rotenberg

Source: NC State University Libraries
Added: August 6, 2018

2006 journal article

Non-uniform program analysis & repeatable execution constraints: Exploiting out-of-order processors in real-time systems

SIGBED Review, 3(1).

By: A. Anantaraman n & E. Rotenberg n

TL;DR: The objective of this paper is to enable easy, tight, and safe timing analysis of contemporary complex processors by exploiting the fact that out-of-order processors can be analyzed via simulation in the absence of variable control-flow. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2006 article

Retention-aware placement in DRAM (RAPID): Software methods for quasi-non-volatile DRAM

TWELFTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, pp. 157-+.

By: R. Venkatesan n, S. Herr n & E. Rotenberg n

TL;DR: This work proposes retention-aware placement in DRAM (RAPID), novel software approaches that can exploit off-the-shelf DRAMs to reduce refresh power to vanishingly small levels approaching non-volatile memory. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2006 patent

Systems, methods and devices for providing variable-latency write operations in memory devices

Washington, DC: U.S. Patent and Trademark Office.

By: E. Rotenberg, R. Venkatesan & A. Al-Zawawi

Source: NC State University Libraries
Added: August 6, 2018

2006 conference paper

The State of ZettaRAM

2006 1st International Conference on Nano-Networks and Workshops. Presented at the 2006 1st International Conference on Nano-Networks and Workshops.

By: E. Rotenberg n & R. Venkatesan*

Event: 2006 1st International Conference on Nano-Networks and Workshops

TL;DR: Key properties of the core technology include flexibility and precision through molecular engineering, self-assembly, scalability through charge-voltage decoupling, and multiple discrete states, and mixed molecules. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Crossref
Added: June 21, 2019

2006 conference paper

Understanding prediction-based partial redundant threading for low-overhead, high-coverage fault tolerance

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems - ASPLOS-XII. Presented at the 12th international conference.

By: V. Reddy n, E. Rotenberg n & S. Parthasarathy*

Event: the 12th international conference

TL;DR: This paper attempts to better understand Slipstream's fault tolerance, conjecturing that the mixture of partial duplication and confident predictions actually closely approximates the coverage of full duplication, and proposes and evaluates a suite of simple microarchitectural alterations to recovery and checking. (via Semantic Scholar)
Source: Crossref
Added: June 21, 2019

2005 chapter

Architecture of embedded microprocessors

In W. Wolf & A. Jerraya (Eds.), Multiprocessor systems on chips (pp. 81–112).

By: E. Rotenberg* & A. Anantaraman*

Ed(s): W. Wolf & A. Jerraya

TL;DR: This chapter reviews the reasons for the parallel evolution of embedded and desktop processors and reasons for dual tracks targeting open versus closed embedded systems—these systems constrain microarchitectural evolution due to the need for timing predictability. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2005 article

Tapping ZettaRAM (TM) for low-power memory systems

11TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, pp. 83–94.

By: R. Venkatesan n, A. Al-Zawawi n & E. Rotenberg n

TL;DR: This work looks beyond ZettaRAM's manufacturing benefits, and approaches it from an architectural viewpoint to discover benefits within the domain of architectural metrics, and applies architectural insights to tap the full extent of ZettaRAM's power savings without compromising performance. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2005 chapter

Trace caches

In D. Kaeli & P.-C. Yew (Eds.), Speculative execution in high performance computer architectures.

By: E. Rotenberg*

Ed(s): D. Kaeli & P. Yew

Source: NC State University Libraries
Added: August 6, 2018

2005 patent

Variable-persistence molecular memory devices and methods of operation thereof

Washington, DC: U.S. Patent and Trademark Office.

By: E. Rotenberg & J. Lindsey

Source: NC State University Libraries
Added: August 6, 2018

2005 conference paper

Virtual multiprocessor: An analyzable, high-performance microarchitecture for real-time computing

CASES 2005: International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, September 24-27, 2005, San Francisco, California, USA, 213–224.

By: A. El-Haj-Mahmoud n, A. Al-Zawawi n, A. Anantaraman n & E. Rotenberg n

TL;DR: The novel Real-time Virtual Multiprocessor (RVMP) successfully combines the analyzability of multiple processors with the flexibility of simultaneous multithreading (SMT) to provide a real-time formalism that SMT does not currently provide. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2004 journal article

A simple mechanism for detecting ineffectual instructions in slipstream processors

IEEE TRANSACTIONS ON COMPUTERS, 53(4), 399–413.

By: J. Koppanalil* & E. Rotenberg n

author keywords: microarchitecture; multithreading; chip multiprocessor; slipstream; preexecution
TL;DR: This work observes that, by logically monitoring the speculative program (instead of the original program), back-propagation can be reduced to detecting unreferenced writes, and proposes a new algorithm that eliminates complex hardware and achieves an average performance improvement of 11.8 percent. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2004 article

Enforcing safety of real-time schedules on contemporary processors using a virtual simple architecture (VISA)

25TH IEEE INTERNATIONAL REAL-TIME SYSTEMS SYMPOSIUM, PROCEEDINGS, pp. 114–125.

By: A. Anantaraman n, K. Seth*, E. Rotenberg n & F. Mueller n

TL;DR: A VISA variant is proposed that dynamically accrues the slack needed to facilitate speculation in the complex mode, eliminating the need to statically pad WCETs and thereby enabling VISA-style speculation even in highly-utilized systems. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2004 conference paper

Safely exploiting multithreaded processors to tolerate memory latency in real-time systems

CASES 2004: International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, September 22-25, 2004, Washington, DC, USA, 2–13.

By: A. El-Haj-Mahmoud n & E. Rotenberg n

TL;DR: This is the first work to provide the necessary formalism for safely and tractably exploiting coarse-grain multithreaded processors to tolerate memory latency in hard-real-time systems, exceeding the schedulability limits of classic real-time theory for uniprocessors. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2003 journal article

Adaptive mode control: A static-power-efficient cache design

ACM Transactions on Embedded Computing Systems, 2(3), 347–372.

By: H. Zhou, M. Toburen n, E. Rotenberg n & T. Conte n

UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2003 article

FAST: Frequency-aware static timing analysis

RTSS 2003: 24TH IEEE INTERNATIONAL REAL-TIME SYSTEMS SYMPOSIUM, PROCEEDINGS, pp. 40–51.

By: K. Seth n, A. Anantaraman n, F. Mueller n & E. Rotenberg n

TL;DR: Novel techniques for tight and flexible static timing analysis particularly well-suited for dynamic scheduling schemes are contributed, including a parametric approach towards bounding the WCET statically with respect to the frequency and an improved parametric model for improving existing DVS scheduling schemes. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2003 article

Slipstream execution mode for CMP-based multiprocessors

NINTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, Vol. 12, pp. 179–190.

By: K. Ibrahim n, G. Byrd n & E. Rotenberg n

TL;DR: This work proposes an additional mode of execution, called slipstream mode, that instead enlists extra processors to assist parallel tasks by reducing perceived overheads, and yields two benefits, including a detailed picture of future reference behavior, enabling a number of optimizations aimed at accelerating coherence events, e.g., self-invalidation. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: August 6, 2018

2003 conference paper

Virtual Simple Architecture (VISA): Exceeding the complexity limit in safe real-time systems

Computers and their applications: Proceedings of the ISCA 16th International Conference, Seattle, Washington, USA, March 28-30, 2001, 350–361. Cary, NC: ISCA.

By: A. Anantaraman, K. Seth, K. Patil, E. Rotenberg & F. Mueller

Source: NC State University Libraries
Added: August 6, 2018

2002 conference paper

A case for dynamic pipeline scaling

Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems: 2002, Grenoble, France, October 08-11, 2002, 1–8.

By: J. Koppanalil n, P. Ramrakhyani n, S. Desai n, A. Vaidyanathan n & E. Rotenberg n

TL;DR: This paper makes the case that the useful frequency range of DVS is limited because there is a lower bound on voltage, and proposes Dynamic Pipeline Scaling (DPS), in which a DPS-enabled deep pipeline has a deep mode for higher frequencies within the influence of DVS, and a shallow mode for lower frequencies. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2002 conference paper

A large, fast instruction window for tolerating cache misses

29th Annual International Symposium on Computer Architecture: Proceedings : 25-29 May, 2002, Anchorage, Alaska, 59–70.

By: A. Lebeck*, J. Koppanalil n, T. Li*, J. Patwardhan* & E. Rotenberg n

TL;DR: Simulations reveal that, for an 8-way processor, a 2K-entry WIB with a 32-entry issue queue can achieve speedups of 20%, 84%, and 50% over a conventional 32-entry issue queue for a subset of the SPEC CINT2000, SPEC CFP2000, and Olden benchmarks, respectively. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2001 conference paper

Adaptive mode control: A static-power-efficient cache design

2001 International Conference on Parallel Architectures and Compilation Techniques: Proceedings: 8-12 September, 2001, Barcelona, Catalunya, Spain, 61–70.

By: H. Zhou, M. Toburen n, E. Rotenberg n & T. Conte n

TL;DR: Simulations show an average of 73% of I-cache lines and 54% of D-cache lines are put in sleep mode with an average IPC impact of only 1.7%, for 64KB caches, and this work proposes applying sleep mode only to the data store and not the tag store. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2001 chapter

Trace caching and trace processors

In Computer engineering handbook (pp. 8–45). Boca Raton, FL: CRC Press.

By: E. Rotenberg

Source: NC State University Libraries
Added: August 6, 2018

2001 article

Using variable-MHz microprocessors to efficiently handle uncertainty in real-time systems

34TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, MICRO-34, PROCEEDINGS, pp. 28–39.

By: E. Rotenberg n

TL;DR: This work proposes using microarchitecture simulation to produce accurate but not guaranteed-correct worst-case performance bounds, and proposes using frequency reserves to guarantee the final deadline is met in spite of interim failures. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2000 conference paper

A study of slipstream processors

Proceedings: 33rd Annual IEEE/ACM International Symposium on Microarchitecture: Monterey, California, USA, 10-13 December 2000, 269–280.

By: Z. Purser n, K. Sundaramoorthy n & E. Rotenberg n

Source: NC State University Libraries
Added: August 6, 2018

2000 journal article

Control independence in trace processors

Journal of Instruction-Level Parallelism, 2, 63–85.

By: E. Rotenberg & J. Smith

Source: NC State University Libraries
Added: August 6, 2018

2000 conference paper

Slipstream processors: Improving both performance and fault tolerance

ASPLOS-IX proceedings: Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, Massachusetts, November 12-15, 2000, 257–268.

By: K. Sundaramoorthy n, Z. Purser n & E. Rotenberg

Source: NC State University Libraries
Added: August 6, 2018

1999 article

A study of control independence in superscalar processors

FIFTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, pp. 115–124.

By: E. Rotenberg*, Q. Jacobson* & J. Smith*

TL;DR: It is shown that much of the performance potential of control independence is lost due to data dependences and wasted resources consumed by incorrect control dependent instructions, but even so, control independence can close the performance gap between real and perfect branch prediction by as much as half. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

1999 journal article

A trace cache microarchitecture and evaluation

IEEE TRANSACTIONS ON COMPUTERS, 48(2), 111–120.

By: E. Rotenberg*, S. Bennett* & J. Smith*

author keywords: instruction cache; instruction fetching; multiple branch prediction; superscalar processors; trace cache
TL;DR: A microarchitecture incorporating a trace cache provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of control flow prediction and instruction supply. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

1999 conference paper

AR-SMT: A microarchitectural approach to fault tolerance in microprocessors

Digest of papers: Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing: June 15-18, 1999, Madison, Wisconsin, USA, 84–91.

By: E. Rotenberg*

TL;DR: A new time redundancy fault-tolerant approach in which a program is duplicated and the two redundant programs simultaneously run on the processor: the technique exploits several significant microarchitectural trends to provide broad coverage of transient faults and restricted coverage of permanent faults. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

1999 article

Control independence in trace processors

32ND ANNUAL INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-32), PROCEEDINGS, pp. 4–15.

By: E. Rotenberg n & J. Smith*

TL;DR: A trace processor microarchitecture is developed to exploit control independence and thereby reduce branch misprediction penalties; it improves trace processor performance by 5% to 25%, and by 17% on average. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

1997 article

Path-based next trace prediction

THIRTIETH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, PROCEEDINGS, pp. 14–23.

By: Q. Jacobson*, E. Rotenberg* & J. Smith*

TL;DR: A next trace predictor is proposed that treats the traces as basic units and explicitly predicts sequences of traces, and yields about a 26% reduction in misprediction rates when compared with the most aggressive previously proposed multiple-branch prediction methods. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

1997 article

Trace processors

THIRTIETH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, PROCEEDINGS, pp. 138–148.

By: E. Rotenberg*, Q. Jacobson*, Y. Sazeides* & J. Smith*

TL;DR: The results affirm that significant instruction-level parallelism can be exploited in integer programs (2 to 6 instructions per cycle) and quantify the value of successively doubling the number of distributed elements. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

1996 article

Assigning confidence to conditional branch predictions

PROCEEDINGS OF THE 29TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE - MICRO-29, pp. 142–152.

By: E. Jacobsen*, E. Rotenberg* & J. Smith*

TL;DR: This work studies idealized dynamic confidence methods using both one and two levels of branch correctness history and finds that the single level method performs at least as well as the more complex two level method and is able to isolate 89 percent of the mispredictions into a set of low confidence dynamic branches. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

1996 article

Trace cache: A low latency approach to high bandwidth instruction fetching

PROCEEDINGS OF THE 29TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE - MICRO-29, pp. 24–34.

By: E. Rotenberg*, S. Bennett* & J. Smith*

TL;DR: It is shown that the trace cache's efficient, low latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

Certain data included herein are derived from the Web of Science© and InCites© (2024) of Clarivate Analytics. All rights reserved. You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.