Norman Jouppi
American computer scientist
Norman Jouppi's AcademicInfluence.com Rankings
Computer Science
Norman Jouppi's Degrees
- PhD Electrical Engineering Stanford University
- Masters Electrical Engineering Northwestern University
Why Is Norman Jouppi Influential?
According to Wikipedia, Norman Paul Jouppi is an American electrical engineer and computer scientist.
Career
Jouppi was one of the computer architects of the Stanford University MIPS project, an early RISC project. He received his master's degree in electrical engineering from Northwestern University in 1980 and was awarded a PhD by Stanford University in 1984. That year he joined Digital Equipment Corporation's Western Research Laboratory. He later worked at Compaq and, from 2002, at Hewlett-Packard, where he ran the Advanced Architecture Lab at HP Labs in Palo Alto from 2006 to 2008, the Exascale Computing Lab from 2008 to 2010, and the Intelligent Infrastructure Lab from 2010 to 2011. After that, he became a computer engineer at Google.
Norman Jouppi's Published Works
- In-datacenter performance analysis of a tensor processing unit (2017) (3601)
- McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures (2009) (2453)
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers (1990) (1454)
- NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory (2012) (1003)
- Complexity-Effective Superscalar Processors (1997) (922)
- CACTI: an enhanced cache access and cycle time model (1996) (859)
- Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction (2003) (854)
- CACTI 6.0: A Tool to Model Large Caches (2009) (842)
- CACTI 3.0: an integrated cache timing, power, and area model (2001) (803)
- Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 (2007) (727)
- Corona: System Implications of Emerging Nanophotonic Technology (2008) (684)
- Single-ISA heterogeneous multi-core architectures for multithreaded workload performance (2004) (668)
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre (1990) (612)
- WRL Research Report 93/5: An Enhanced Access and Cycle Time Model for On-chip Caches (1994) (423)
- Available instruction-level parallelism for superscalar and superpipelined machines (1989) (392)
- Heterogeneous chip multiprocessors (2005) (383)
- Reconfigurable caches and their application to media processing (2000) (294)
- The multicluster architecture: reducing cycle time through partitioning (1997) (284)
- The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays (2002) (283)
- A Comprehensive Memory Modeling Tool and Its Application to the Design and Analysis of Future Memory Hierarchies (2008) (258)
- Cache Write Policies And Performance (1993) (256)
- Core architecture optimization for heterogeneous chip multiprocessors (2006) (252)
- Rethinking DRAM design and organization for energy-constrained multi-cores (2010) (245)
- Design implications of memristor-based RRAM cross-point structures (2011) (235)
- Kiln: Closing the performance gap between systems with and without persistence support (2013) (232)
- CACTI-P: Architecture-level modeling for SRAM-based structures with advanced leakage reduction techniques (2011) (216)
- Tradeoffs in two-level on-chip caching (1994) (205)
- The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing (2013) (195)
- Configurable isolation: building high availability systems with commodity multi-core processors (2007) (188)
- CACTI-3DD: Architecture-level modeling for 3D die-stacked DRAM main memory (2012) (185)
- CACTI 6.0: A Tool to Understand Large Caches (2007) (180)
- FREE-p: Protecting non-volatile memory against both hard and soft errors (2011) (179)
- Performance of image and video processing with general-purpose processors and media ISA extensions (1999) (177)
- A Simulation Based Study of TLB Performance (1992) (172)
- Devices and architectures for photonic chip-scale integration (2009) (165)
- CACTI 2.0: An Integrated Cache Timing and Power Model (2002) (154)
- Computer technology and architecture: an evolving interaction (1991) (154)
- Leveraging 3D PCRAM technologies to reduce checkpoint overhead for future exascale systems (2009) (150)
- A domain-specific supercomputer for training deep neural networks (2020) (141)
- Complexity/performance tradeoffs with non-blocking loads (1994) (139)
- Register file design considerations in dynamically scheduled processors (1996) (136)
- Motivation for and Evaluation of the First Tensor Processing Unit (2018) (136)
- Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support (2010) (133)
- Timing Analysis and Performance Improvement of MOS VLSI Designs (1987) (129)
- Quantifying the Complexity of Superscalar Processors (2002) (126)
- Understanding the trade-offs in multi-level cell ReRAM memory design (2013) (123)
- Memory-system Design Considerations For Dynamically-scheduled Processors (1997) (122)
- i2WAP: Improving non-volatile cache lifetime by reducing inter- and intra-set write variations (2013) (122)
- Future scaling of processor-memory interfaces (2009) (118)
- McSimA+: A manycore simulator with application-level+ simulation and detailed microarchitecture modeling (2013) (118)
- PCRAMsim: System-level performance, energy, and area modeling for Phase-Change RAM (2009) (112)
- Conjoined-Core Chip Multiprocessing (2004) (111)
- An Integrated Cache Timing and Power Model (2002) (110)
- A Nanophotonic Interconnect for High-Performance Many-Core Computation (2008) (110)
- Ten Lessons From Three Generations Shaped Google’s TPUv4i: Industrial Product (2021) (108)
- MIPS: a VLSI processor architecture (1981) (107)
- A domain-specific architecture for deep neural networks (2018) (106)
- The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance (1989) (106)
- Timing Analysis for nMOS VLSI (1983) (100)
- Processor Power Reduction Via Single-ISA Heterogeneous Multi-Core Architectures (2003) (98)
- Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories (2014) (95)
- LOT-ECC: Localized and tiered reliability mechanisms for commodity memory systems (2012) (93)
- Readings in computer architecture (2000) (93)
- MIPS: A microprocessor architecture (1982) (90)
- Multicore DIMM: an Energy Efficient Memory Module with Independently Controlled DRAMs (2009) (90)
- Hardware/software tradeoffs for increased performance (1982) (90)
- The role of optics in future high radix switch design (2011) (89)
- Design trade-offs for high density cross-point resistive memory (2012) (88)
- Architecting Efficient Interconnects for Large Caches with CACTI 6.0 (2008) (88)
- Z3: an economical hardware technique for high-quality antialiasing and transparency (1999) (81)
- Combining memory and a controller with photonics through 3D-stacking to enable scalable and energy-efficient systems (2011) (73)
- The MIPS Machine (1982) (71)
- How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors? (1995) (71)
- A unified vector/scalar floating-point architecture (1989) (69)
- TV: An nMOS Timing Analyzer (1983) (69)
- Architectural And Organizational Tradeoffs In The Design Of The Multititan CPU (1989) (68)
- Feline: fast elliptical lines for anisotropic texture mapping (1999) (67)
- Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers (2006) (63)
- Hybrid checkpointing using emerging nonvolatile memories for future exascale systems (2011) (62)
- Staged Reads: Mitigating the impact of DRAM writes on DRAM reads (2012) (60)
- Design of a high performance VLSI processor (1983) (58)
- Multi-Core Cache Hierarchies (2011) (56)
- System implications of memory reliability in exascale computing (2011) (56)
- Enterprise IT trends and implications for architecture research (2005) (54)
- Photonic Architectures for High-Performance Data Centers (2013) (53)
- Microprocessors in the Era of Terascale Integration (2007) (51)
- CACTI-IO: CACTI with off-chip power-area-timing models (2012) (45)
- Design of cross-point metal-oxide ReRAM emphasizing reliability and cost (2013) (45)
- A 20-MIPS sustained 32-bit CMOS microprocessor with high ratio of sustained to peak performance (1989) (43)
- Derivation of Signal Flow Direction in MOS VLSI (1987) (41)
- McPAT 1.0: An Integrated Power, Area, and Timing Modeling Framework for Multicore Architectures (2010) (41)
- Organization and VLSI implementation of MIPS (1984) (41)
- A high-speed optical multi-drop bus for computer interconnections (2008) (40)
- Neon: a single-chip 3D workstation graphics accelerator (1998) (39)
- First steps towards mutually-immersive mobile telepresence (2002) (38)
- A Multi-Core Approach to Addressing the Energy-Complexity Problem in Microprocessors (2003) (35)
- Isolation in Commodity Multicore Processors (2007) (35)
- The Design Process for Google's Training Chips: TPUv2 and TPUv3 (2021) (34)
- Integration and packaging plateaus of processor performance (1989) (33)
- Practical nonvolatile multilevel-cell phase change memory (2013) (32)
- A 300-MHz 115-W 32-b bipolar ECL microprocessor (1993) (32)
- System-level integrated server architectures for scale-out datacenters (2011) (32)
- First steps towards mutually-immersive mobile telepresence (2002) (30)
- Improving System Energy Efficiency with Memory Rank Subsetting (2012) (30)
- MAGE: Adaptive Granularity and ECC for resilient and power efficient memory systems (2012) (28)
- CACTI 5.0 (2007) (27)
- Synthesis Lectures on Computer Architecture (2011) (27)
- Neon: A (Big) (Fast) Single-Chip 3D Workstation Graphics Accelerator (1999) (27)
- BiReality: mutually-immersive telepresence (2004) (27)
- The Potential Energy Efficiency of Vector Acceleration (2006) (26)
- Searching for Fast Model Families on Datacenter Accelerators (2021) (23)
- High-performance ethernet-based communications for future multi-core processors (2007) (23)
- Endurance-aware cache line management for non-volatile caches (2014) (21)
- Google's Training Chips Revealed: TPUv2 and TPUv3 (2020) (20)
- Telepresence Systems With Automatic Preservation of User Head Height, Local Rotation, and Remote Translation (2005) (19)
- Fast synchronization for chip multiprocessors (2005) (17)
- A High-Speed Optical Multidrop Bus for Computer Interconnections (2009) (16)
- Prefiltered antialiased lines using half-plane distance functions (2000) (16)
- The Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance (1999) (16)
- Processor Power Reduction Via Single-ISA (2003) (15)
- Circuit and Process Directions for Low-Voltage Swing Submicron BiCMOS (1999) (15)
- Retrospective: improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers (1998) (14)
- A 300 MHz 115 W 32 b bipolar ECL microprocessor with on-chip caches (1993) (14)
- A pipelined 32b NMOS microprocessor (1984) (13)
- The future evolution of high-performance microprocessors (2004) (13)
- The Multicluster Architecture: Reducing Processor Cycle Time Through Partitioning (1999) (13)
- Implementing high availability memory with a duplication cache (2008) (13)
- Implementing Neon: a 256-bit graphics accelerator (1999) (13)
- Enterprise Power and Cooling: a Chip-to-data Center Perspective (12)
- Evaluating the Potential of Future On-Chip Clock Distribution using Optical Interconnects (2007) (11)
- Techniques for Data Mapping and Buffering to Exploit Asymmetry in Multi-Level Cell (Phase Change) Memory (2013) (11)
- Optical high radix switch design (2012) (9)
- Impacts of Non-blocking Caches in Out-of-order Processors (2011) (9)
- NeuroMeter: An Integrated Power, Area, and Timing Modeling Framework for Machine Learning Accelerators (2021) (8)
- Motivating Commodity Multi-Core Processor Design for System-level Error Protection (8)
- A Case Study of Incremental and Background Hybrid In-Memory Checkpointing (2010) (8)
- DRAM errors in the wild (2011) (8)
- Dynamically configurable shared CMP helper engines for improved performance (2005) (7)
- The Role of Photonics in Future Datacenter Networks (2013) (7)
- Region of interest editing of MPEG-2 video streams in the compressed domain (2004) (7)
- Z3: an economical hardware technique for high-quality antialia (1999) (7)
- Mutually-Immersive Audio Telepresence (2002) (7)
- A speed, power, and supply noise evaluation of ECL driver circuits (1994) (6)
- Reducing Compulsory and Capacity Misses (1999) (5)
- Optical Interconnects for High-Performance Computing Systems (2012) (5)
- A fully-compensated APD circuit with 10:1 ratio between active and inactive current (1994) (5)
- Superscalar vs. superpipelined machines (1988) (5)
- Designing, packaging, and testing a 300-MHz, 115 W ECL microprocessor (1994) (5)
- Emerging technologies and their impact on system design (2009) (5)
- Optical interconnects for high-performance computing systems (2012) (4)
- System-wide performance monitors and their application to the optimization of coherent memory accesses (2005) (4)
- A 20 MIPS sustained 32 b CMOS microprocessor with 64 b data bus (1989) (4)
- A circuit-architecture co-optimization framework for exploring nonvolatile memory hierarchies (2013) (4)
- CACTI-IO Technical Report (2012) (4)
- A first generation Mutually-Immersive Mobile Telepresence surrogate with automatic backtracking (2004) (4)
- Highly Available Data Parallel ML training on Mesh Networks (2020) (3)
- Reducing Overhead for Soft Error Coverage in High Availability Systems (3)
- Real products, real technology [Guest Editor's Introduction] (1999) (3)
- History-Assisted Adaptive-Granularity Caches (HAAG$) for High Performance 3D DRAM Architectures (2015) (3)
- A circuit-architecture co-optimization framework for evaluating emerging memory hierarchies (2013) (3)
- Memory Modeling with CACTI (2010) (3)
- CMOS Nanophotonics: Technology, System Implications, and a CMP Case Study (2011) (2)
- System implications of integrated photonics (2008) (2)
- Free-p: A Practical End-to-End Nonvolatile Memory Protection Mechanism (2012) (2)
- Improving the performance and power efficiency of shared helpers in CMPs (2006) (2)
- A Headphone-Free Head-Tracked Audio Telepresence System (2004) (2)
- Performance issues in VLSI processor design (1983) (2)
- Common Bonds: MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor (2016) (2)
- Wear-Leveling Techniques for Nonvolatile Memories (2014) (1)
- Hyperscale Hardware Optimized Neural Architecture Search (2023) (1)
- Retrospective: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers (1998) (1)
- Introduction to the special issue on the 2008 workshop on design, analysis, and simulation of chip multiprocessors (dasCMP'08) (2009) (1)
- Technical perspective: Software and hardware support for deterministic replay of parallel programs (2009) (1)
- TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings (2023) (1)
- Resilience Challenges for Exascale Systems (2009) (1)
- Research Report 89/10: Integration and Packaging Plateaus of Processor Performance (1999) (0)
- Three-dimensional chip stack with an optical link between the present apparatus and within a device (2008) (0)
- 2.2 Voltage and Timing Margins (2012) (0)
- Session details: Cache organization (2009) (0)
- Introduction to the special issue on the 2007 workshop on design, analysis, and simulation of chip multiprocessors (dasCMP'07) (2008) (0)
- A High-speed Optical Multidrop Bus for Computer Interconnections (2009) (0)
- Three-dimensional chip stack with an optical link exists between devices within a device (2008) (0)
- Case Studies and Exercises Case Study 1: Optimizing Cache Performance via Advanced Techniques Non-blocking Caches Compiler Optimizations for Caches Calculating Impact of Cache Performance on More Complex Processors Historical Perspective and References Case Studies and Exercises (2013) (0)
- Introduction to the special issue on the 2005 workshop on design, analysis, and simulation of chip multiprocessors (dasCMP'05) (2005) (0)
- Guest Editor's Introduction: Hot Chips and the Microprocessor (1996) (0)
- Spatial audio conferencing system (2010) (0)
- controlled by programmable logic power meter (1998) (0)
- Google's TPU supercomputers train deep neural networks 50x faster than general-purpose supercomputers running a high-performance computing benchmark (2020) (0)
- Introduction to the special issue on the 2006 workshop on design, analysis, and simulation of chip multiprocessors: (dasCMP'06) (2007) (0)
- Telescopic Spatial Radio (2004) (0)
- Appendix a Workload 64-entry Fa 128-entry Fa 256-entry Fa 128-entry 4-way Sa Table 6: Percent Reduction in User Tlb Misses with Common-mask Tlb (lru Replacement) 11 Conclusions (1995) (0)
- Method and Apparatus for composition of image colors by memory constraints (1998) (0)
- Session details: Special purpose to warehouse computers (2007) (0)
- Memory system for data processor (1991) (0)
- A Speed, Power, and Supply Noise Evaluation of ECL Driver Circuits (1996) (0)
- Guest Editor's Introduction: Hot Chips III (1992) (0)
- Exploring Performance Limits to Future Instruction-Level-Parallel Processors (1998) (0)
- United States Patent 19 Hooper IIII (2017) (0)
- Isolation in Commodity Processors (2007) (0)
- Architecture - The potential energy efficiency of vector acceleration (2006) (0)
Other Resources About Norman Jouppi
What Schools Are Affiliated With Norman Jouppi?
Norman Jouppi is affiliated with the following schools: