Jeffrey S. Vetter

Jeffrey S. Vetter's AcademicInfluence.com Rankings

Computer Science: World Rank #9359, Historical Rank #9832
Parallel Computing: World Rank #49, Historical Rank #51
Database: World Rank #6326, Historical Rank #6558

Jeffrey S. Vetter's Degrees

PhD, Computer Science, Georgia Tech
Masters, Computer Science, Georgia Tech
Bachelors, Computer Science, University of Richmond

Jeffrey S. Vetter's Published Works

Published Works

The International Exascale Software Project roadmap (2011) (735)
The Scalable Heterogeneous Computing (SHOC) benchmark suite (2010) (631)
An Overview of the BlueGene/L Supercomputer (2002) (572)
A Survey of CPU-GPU Heterogeneous Computing Techniques (2015) (399)
Autopilot: adaptive control of distributed applications (1998) (307)
NVIDIA Tensor Core Programmability, Performance & Precision (2018) (242)
A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems (2016) (221)
Dynamic Software Testing of MPI Applications with Umpire (2000) (217)
Communication characteristics of large-scale scientific applications for contemporary cluster architectures (2002) (205)
Contemporary High Performance Computing - From Petascale toward Exascale (2019) (187)
A Survey of Methods for Analyzing and Improving GPU Energy Efficiency (2014) (172)
Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures (2010) (165)
Statistical scalability analysis of communication operations in distributed applications (2001) (149)
DESTINY: A tool for modeling emerging 3D NVM and eDRAM caches (2015) (134)
A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-Volatile On-Chip Caches (2015) (132)
The future of scientific workflows (2018) (132)
Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms (2010) (130)
Aspen: A domain specific language for performance modeling (2012) (124)
Characterization of Scientific Workloads on Systems with Multi-Core Processors (2006) (122)
Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community (2011) (118)
Classifying soft error vulnerabilities in extreme-Scale scientific applications using a binary instrumentation tool (2012) (115)
An annotated bibliography of interactive program steering (1994) (98)
Early evaluation of IBM BlueGene/P (2008) (96)
A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems (2016) (95)
Performance characterization and optimization of parallel I/O on the Cray XT (2008) (92)
Exploiting Lustre File Joining for Effective Collective IO (2007) (92)
Opportunities for Nonvolatile Memory Systems in Extreme-Scale High-Performance Computing (2015) (89)
Real-Time Performance Monitoring, Adaptive Control, and Interactive Steering of Computational Grids (2000) (84)
High performance computational steering of physical simulations (1997) (84)
Using FPGA Devices to Accelerate Biomolecular Simulations (2007) (83)
An Empirical Performance Evaluation of Scalable Scientific Applications (2002) (81)
Contemporary High Performance Computing: From Petascale toward Exascale (2013) (81)
Identifying Opportunities for Byte-Addressable Non-Volatile Memory in Extreme-Scale Scientific Applications (2012) (78)
Early evaluation of directive-based GPU programming models for productive exascale computing (2012) (78)
Maestro: Data Orchestration and Tuning for OpenCL Devices (2010) (76)
Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations (2015) (76)
Early evaluation of the Cray XT3 (2006) (73)
Progress: A Toolkit for Interactive Program Steering (1995) (70)
A Survey of Techniques for Modeling and Improving Reliability of Computing Systems (2016) (70)
Scalable Analysis Techniques for Microprocessor Performance Counter Metrics (2002) (69)
OpenARC: open accelerator research compiler for directive-based, efficient heterogeneous computing (2014) (65)
COMPASS: A Framework for Automated Performance Modeling and Prediction (2015) (64)
Performance analysis of distributed applications using automatic classification of communication inefficiencies (1999) (60)
The tradeoffs of fused memory hierarchies in heterogeneous computing architectures (2012) (60)
Asserting Performance Expectations (2002) (59)
Performance evaluation of the Cray X1 distributed shared memory architecture (2004) (56)
Falcon: On‐line monitoring for steering parallel programs (1998) (56)
Extreme Heterogeneity 2018 - Productive Computational Science in the Era of Extreme Heterogeneity: Report for DOE ASCR Workshop on Extreme Heterogeneity (2018) (55)
From interactive applications to distributed laboratories (1998) (52)
Dynamic statistical profiling of communication activity in distributed applications (2002) (51)
Quantifying NUMA and contention effects in multi-GPU systems (2011) (51)
Exploring hybrid memory for GPU energy efficiency through software-hardware co-design (2013) (51)
OpenACC to FPGA: A Framework for Directive-Based High-Performance Reconfigurable Computing (2016) (51)
DESTINY: A Comprehensive Tool with 3D and Multi-Level Cell Memory Modeling Capability (2017) (49)
A Survey Of Techniques for Architecting DRAM Caches (2016) (47)
Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach (2013) (47)
Investigating the TLB Behavior of High-end Scientific Applications on Commodity Microprocessors (2008) (47)
Accelerating scientific applications with the SRC-6 reconfigurable computer: methodologies and analysis (2005) (45)
A Dynamic Tracing Mechanism for Performance Analysis of OpenMP Applications (2001) (45)
DARPA's HPCS Program: History, Models, Tools, Languages (2008) (43)
Falcon: On-line monitoring for steering parallel programs (1998) (41)
AYUSH: A Technique for Extending Lifetime of SRAM-NVM Hybrid Caches (2015) (40)
Accelerating S3D: A GPGPU Case Study (2009) (38)
Falcon: On-line Monitoring and Steering of Parallel Programs (1995) (38)
PapyrusKV: A High-Performance Parallel Key-Value Store for Distributed NVM Architectures (2017) (38)
FASE: A Framework for Scalable Performance Prediction of HPC Systems and Applications (2007) (38)
Accuracy and performance of graphics processors: A Quantum Monte Carlo application case study (2009) (37)
FlexiWay: A cache energy saving technique using fine-grained cache reconfiguration (2013) (37)
Algorithm-Directed Data Placement in Explicitly Managed Non-Volatile Memory (2016) (37)
Quantitatively Modeling Application Resilience with the Data Vulnerability Factor (2014) (36)
Managing Performance Analysis with Dynamic Statistical Projection Pursuit (2000) (36)
ParColl: Partitioned Collective I/O on the Cray XT (2008) (35)
Xen-Based HPC: A Parallel I/O Perspective (2008) (35)
PCM-Based Durable Write Cache for Fast Disk I/O (2012) (34)
Computational steering annotated bibliography (1997) (33)
PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows (2017) (32)
A framework to develop symbolic performance models of parallel applications (2006) (32)
OpenARC: Extensible OpenACC Compiler Framework for Directive-Based Accelerator Programming Study (2014) (31)
Performance characterization of molecular dynamics techniques for biomolecular simulations (2006) (31)
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access (2018) (30)
NVL-C: Static Analysis Techniques for Efficient, Correct Programming of Non-Volatile Main Memory Systems (2016) (30)
Analysis of a Computational Biology Simulation Technique on Emerging Processing Architectures (2007) (29)
LastingNVCache: A Technique for Improving the Lifetime of Non-volatile Caches (2014) (29)
Evaluating Performance Portability of OpenACC (2014) (29)
Architectures for the Post-Moore Era (2017) (28)
Wide-area performance profiling of 10GigE and InfiniBand technologies (2008) (26)
An Evaluation of the Oak Ridge National Laboratory Cray XT3 (2008) (26)
An Integrated Performance Visualizer for MPI/OpenMP Programs (2001) (25)
Models for computational steering (1996) (25)
WriteSmoothing: improving lifetime of non-volatile caches using intra-set wear-leveling (2014) (24)
On the Path to Exascale (2010) (23)
ECP Software Technology Capability Assessment Report (2018) (22)
Performance Analysis of Parallel Systems: Approaches and Open Problems (2001) (22)
Performance evaluation of the SGI Altix 3700 (2005) (22)
GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method (2018) (21)
Opportunities and Tools for Highly Interactive Distributed and Parallel Computing (1994) (20)
Automated Characterization of Parallel Application Communication Patterns (2015) (19)
Siena: Exploring the Design Space of Heterogeneous Memory Systems (2018) (19)
Cray X1 Evaluation Status Report (2004) (19)
Highly Efficient Compensation-Based Parallelism for Wavefront Loops on GPUs (2018) (19)
EqualWrites: Reducing Intra-Set Write Variations for Enhancing Lifetime of Non-Volatile Caches (2016) (19)
EqualChance: Addressing Intra-set Write Variation to Increase Lifetime of Non-volatile Caches (2014) (18)
Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation (2017) (18)
Efficient Quality Threshold Clustering for Parallel Architectures (2012) (17)
Balancing productivity and performance on the cell broadband engine (2007) (17)
Diagnosis and optimization of application prefetching performance (2013) (17)
Toward an End-to-End Framework for Modeling, Monitoring and Anomaly Detection for Scientific Workflows (2016) (16)
Efficiency Evaluation of Cray XT Parallel IO Stack (2007) (15)
Empirical Analysis of a Large-Scale Hierarchical Storage System (2008) (15)
Impact of multicores on large-scale molecular dynamics simulations (2008) (15)
Juggler: a dependence-aware task-based execution framework for GPUs (2018) (15)
Techniques for high-performance computational steering (1999) (15)
Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (2015) (15)
Application Characterization Using Oxbow Toolkit and PADS Infrastructure (2014) (14)
Addressing Read-Disturbance Issue in STT-RAM by Data Compression and Selective Duplication (2017) (14)
Quantifying Architectural Requirements of Contemporary Extreme-Scale Scientific Applications (2013) (14)
Enabling a highly-scalable global address space model for petascale computing (2010) (14)
Virtual Cluster Management with Xen (2007) (14)
CLACC: Translating OpenACC to OpenMP in Clang (2018) (14)
Performance of RDMA-capable storage protocols on wide-area network (2008) (14)
Examining recent many-core architectures and programming models using SHOC (2015) (13)
Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training (2018) (13)
A Holistic Approach for Performance Measurement and Analysis for Petascale Applications (2009) (13)
Experimental Analysis of InfiniBand Transport Services on WAN (2008) (13)
TensorFlow Doing HPC (2019) (13)
An Analysis of System Balance Requirements for Scientific Applications (2006) (12)
A Technique for Improving Lifetime of Non-Volatile Caches Using Write-Minimization (2016) (11)
Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs (2018) (11)
Performance evaluation of the Cray X1 distributed shared-memory architecture (2005) (11)
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing (2006) (11)
Reducing soft-error vulnerability of caches using data compression (2016) (11)
Performance evaluation of high-speed interconnects using dense communication patterns (2005) (11)
OPAL: An Open-Source MPI-IO Library over Cray XT (2007) (11)
Characterizing the performance benefit of hybrid memory system for HPC applications (2018) (11)
Contemporary High Performance Computing (2017) (11)
CCAMP: An Integrated Translation and optimization Framework for OpenACC and OpenMP (2020) (10)
Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing (2014) (10)
Hierarchical Model Validation of Symbolic Performance Models of Scientific Kernels (2006) (10)
Understanding Portability of a High-Level Programming Model on Contemporary Heterogeneous Architectures (2015) (10)
IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism (2016) (9)
Evaluation of UPC on the Cray X1 (2005) (9)
Improving energy efficiency of embedded DRAM caches for high-end computing systems (2014) (9)
Performance portability study for massively parallel computational fluid dynamics application on scalable heterogeneous architectures (2019) (9)
Reliability Tradeoffs in Design of Volatile and Nonvolatile Caches (2016) (9)
Characterizing the Impact of Prefetching on Scientific Application Performance (2013) (9)
Automated Design Space Exploration with Aspen (2015) (9)
HPC Interconnection Networks: The Key to Exascale Computing (2008) (8)
An Exploration of Performance Attributes for Symbolic Modeling of Emerging Processing Devices (2007) (8)
The Minos Computing Library: efficient parallel programming for extremely heterogeneous systems (2020) (8)
An Evaluation of the ORNL Cray XT3 (2006) (8)
BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis (2014) (7)
AYUSH: Extending Lifetime of SRAM-NVM Way-Based Hybrid Caches Using Wear-Leveling (2015) (7)
Designing Algorithms for the EMU Migrating-threads-based Architecture (2018) (7)
Language-Based Optimizations for Persistence on Nonvolatile Main Memory Systems (2017) (7)
Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File (2017) (7)
Design and Implementation of Papyrus: Parallel Aggregate Persistent Storage (2017) (7)
ORNL Cray X1 evaluation status report (2004) (7)
Quartile and Outlier Detection on Heterogeneous Clusters Using Distributed Radix Sort (2011) (6)
Performance Implications of Nonuniform Device Topologies in Scalable Heterogeneous Architectures (2011) (6)
Architecting SOT-RAM Based GPU Register File (2017) (6)
Sparse Matrix-Vector Multiplication Kernel on a Reconfigurable Computer (2005) (6)
Synthetic Program Analysis with Aspen (2015) (6)
Techniques for delayed binding of monitoring mechanisms to application-specific instrumentation points (1998) (6)
Performance Technologies for Peta-Scale Systems: A White Paper Prepared by the Performance Evaluation Research Center and Collaborators (2003) (6)
Design, implementation, and evaluation of transparent pNFS on Lustre (2009) (5)
An OpenACC-based unified programming model for multi-accelerator systems (2015) (5)
A Performance Measurement Infrastructure for Co-array Fortran (2005) (5)
IRIS: A Portable Runtime System Exploiting Multiple Heterogeneous Programming Systems (2021) (5)
GA-GPU: extending a library-based global address space programming model for scalable heterogeneous computing systems (2012) (5)
Kernel-level single system image for petascale computing (2006) (5)
Evaluating CUDA Portability with HIPCL and DPCT (2021) (5)
In-Depth Optimization with the OpenACC-to-FPGA Framework on an Arria 10 FPGA (2020) (5)
Evaluating high‐performance computers (2005) (5)
MEPHESTO: Modeling Energy-Performance in Heterogeneous SoCs and Their Trade-Offs (2020) (4)
BlackjackBench: portable hardware characterization (2011) (4)
OpenACC Profiling Support for Clang and LLVM using Clacc and TAU (2020) (4)
Intel Woodcrest: An Evaluation for Scientific Computing (2007) (4)
RXIO: Design and implementation of high performance RDMA-capable GridFTP (2012) (4)
Addressing Inter-set Write-Variation for Improving Lifetime of Non-Volatile Caches (2014) (4)
Revolutionary technologies for acceleration of emerging petascale applications (2009) (4)
Capturing Petascale Application Characteristics with the Sequoia Toolkit (2005) (4)
Performance characteristics of biomolecular simulations on high-end systems with multi-core processors (2008) (4)
Balancing FPGA Resource Utilities (2005) (4)
Experiences using Computational Steering on Existing Scientific Applications (1999) (4)
Toward Performance Portable Programming for Heterogeneous Systems on a Chip: A Case Study with Qualcomm Snapdragon SoC (2021) (4)
Reimagining Codesign for Advanced Scientific Computing: Report for the ASCR Workshop on Reimagining Codesign (2021) (4)
A Methodology for Developing High Fidelity Communication Models for Large-Scale Applications Targeted on Multicore Systems (2008) (4)
Cooperative server clustering for a scalable GAS model on petascale Cray XT5 systems (2010) (3)
Scalable Tool Infrastructure for the Cray XT Using Tree-Based Overlay Networks (2009) (3)
Glyphmaker: An Interactive, Programmerless Approach for Customizing Visual Data Representations (1993) (3)
Improving DRAM Bandwidth Utilization with MLP-Aware OS Paging (2016) (3)
Efficient Zero-Copy Noncontiguous I/O for Globus on InfiniBand (2010) (3)
Implementing efficient data compression and encryption in a persistent key-value store for HPC (2019) (3)
Performance Engineering: Understanding and Improving the Performance of Large-Scale Codes (2007) (3)
On the road to Exascale: lessons from contemporary scalable GPU systems (2012) (3)
EXIO: Enabling Globus on RDMA networks - a case study with InfiniBand (2010) (3)
Deffe: a data-efficient framework for performance characterization in domain-specific computing (2020) (3)
OpenMP Target Task: Tasking and Target Offloading on Heterogeneous Systems (2021) (3)
Tuyere: enabling scalable memory workloads for system exploration (2018) (3)
Evaluation of the Cray XT3 at ORNL: a Status Report (2006) (3)
ASCR Report on a Quantum Computing Testbed for Science (2017) (3)
CCAMP: OpenMP and OpenACC Interoperable Framework (2019) (3)
Characterizing Applications on the Cray MTA-2 Multithreading Architecture (2006) (3)
An Application Specific Memory Characterization Technique for Co-processor Accelerators (2007) (3)
FLAME: Graph-based hardware representations for rapid and precise performance modeling (2019) (3)
Analysis of GPU Data Access Patterns on Complex Geometries for the D3Q19 Lattice Boltzmann Algorithm (2021) (3)
Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (open-source code) (2015) (3)
Performance Issues in Parallel Processing Systems (2000) (3)
DOE Advanced Scientific Advisory Committee (ASCAC): Workforce Subcommittee Letter (2014) (3)
Understanding the Impact of Memory Access Patterns in Intel Processors (2020) (3)
BlackjackBench: portable hardware characterization (2012) (3)
Virtual Topologies for Scalable Resource Management and Contention Attenuation in a Global Address Space Model on the Cray XT5 (2011) (3)
Initial characterization of parallel NFS implementations (2010) (3)
Toward exascale computational science with heterogeneous processing (2010) (2)
Evaluating the Performance and Portability of Contemporary SYCL Implementations (2020) (2)
HiCOO: Hierarchical cooperation for scalable communication in Global Address Space programming models on Cray XT systems (2012) (2)
Blue Gene/P: JUGENE (2013) (2)
PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips (2021) (2)
Preparing for the Future - Rethinking Proxy Apps (2022) (2)
A framework for performance analysis of Co‐Array Fortran (2007) (2)
LastingNVCache: Extending the Lifetime of Non-volatile Caches using Intra-set Wear-leveling (2014) (2)
Exascale Hardware Architectures Working Group (2011) (2)
Bridging HPC Communities through the Julia Programming Language (2022) (2)
Evaluating the Viability of Application-Driven Cooperative CPU/GPU Fault Detection (2013) (2)
Optimizations for language-directed computational steering (1999) (2)
A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory (2013) (2)
Static Graphs for Coding Productivity in OpenACC (2021) (2)
A High Performance Programming Model for Large-Scale Molecular Dynamics Calculations on Reconfigurable Supercomputers (2005) (2)
Modeling synthetic aperture radar computation with Aspen (2013) (2)
Toward Evaluating High-Level Synthesis Portability and Performance between Intel and Xilinx FPGAs (2021) (2)
Network-Friendly One-Sided Communication through Multinode Cooperation on Petascale Cray XT5 Systems (2011) (2)
FITL: extending LLVM for the translation of fault-injection directives (2015) (2)
Aspen-based performance and energy modeling frameworks (2017) (2)
Performance evaluation of the Cray XT3 configured with dual core Opteron processors (2007) (2)
Enhancing Monte Carlo proxy applications on GPUs (2019) (2)
Can PCM Benefit GPU? Reconciling Hybrid Memory Design with GPU Massive Parallelism for Energy Efficiency (2013) (2)
A Study of Power-Performance Modeling Using a Domain-Specific Language (2016) (2)
IRIS-BLAS: Towards a Performance Portable and Heterogeneous BLAS Library (2022) (1)
Evaluating the Performance of Integer Sum Reduction in SYCL on GPUs (2021) (1)
Runtime Techniques to Enable a Highly-Scalable Global Address Space Model for Petascale Computing (2012) (1)
TensorFlow Doing HPC: An Evaluation of TensorFlow Performance in HPC Applications (2019) (1)
Performance portability study of epistasis detection using SYCL on NVIDIA GPU (2022) (1)
Computational Steering (1998) (1)
Experiences with Computational Steering on Existing Scientific Applications (1999) (1)
Hardware Evaluation Analytical Modeling and Node Simulation: Benefits of Tighter GPU Integration (2021) (1)
Throughput Improvement of Molecular Dynamics Simulations Using Reconfigurable Computing (2001) (1)
Early Evaluation of the Cray XT3 at ORNL (2005) (1)
A Hierarchical Task Scheduler for Heterogeneous Computing (2021) (1)
Performance Technologies for Peta-Scale Systems: A White Paper Prepared by the Performance Evaluation Research Center (2003) (1)
Performance Portability in Extreme Scale Computing (Dagstuhl Seminar 17431) (2017) (1)
neCODEC: nearline data compression for scientific applications (2014) (1)
Preparing for extreme heterogeneity in high performance computing (2019) (1)
Performance Metrics for High End Computing (2003) (1)
Moving Heterogeneous GPU Computing into the Mainstream with Directive-Based, High-Level Programming Models (Position Paper) (2012) (1)
Analyzing the suitability of contemporary 3D-stacked PIM architectures for HPC scientific applications (2019) (1)
Asserting Performance Expectations (Formerly Performance Assertions: A Performance Diagnosis Tool) (2002) (1)
SM-centric transformation: Circumventing hardware restrictions for flexible GPU scheduling (2014) (1)
A framework for performance analysis of Co-Array Fortran: Research Articles (2007) (1)
Juggler (2018) (1)
KokkACC: Enhancing Kokkos with OpenACC (2022) (1)
Memphis on an XT5: Pinpointing Memory Performance Problems on Cray Platforms (2011) (1)
Exploring Emerging Technologies in the HPC Co-Design Space (2014) (1)
Cash: A Single-Source Hardware-Software Codesign Framework for Rapid Prototyping (2020) (1)
Distributed workflows for modeling experimental data (2017) (1)
Prometheus: Coherent Exploration of Hardware and Software Optimizations Using Aspen (2018) (1)
Virtual Neuron: A Neuromorphic Approach for Encoding Numbers (2022) (0)
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2015) (0)
MTAAP Introduction (2011) (0)
Special Issue: Selected Papers from Super Computing 2012 (2013) (0)
Evaluating high-performance computers: Research Articles (2005) (0)
Sensitivity Analysis of Biomolecular Simulations using Symbolic Models (2007) (0)
Two Algorithms for Sorting On Heterogeneous Clusters (2012) (0)
MTAAP 2010 Welcome (2010) (0)
Integer Sum Reduction with OpenMP on an AMD MI100 GPU (2022) (0)
A Memory Efficient Lock-Free Circular Queue (2021) (0)
Evaluating the Performance of Integer Sum Reduction on an Intel GPU (2021) (0)
Optimization with the OpenACC-to-FPGA framework on the Arria 10 and Stratix 10 FPGAs (2021) (0)
Memphis on an XT5: Pinpointing Memory Performance Problems on Cray Platforms (Cray User Group 2011 Proceedings) (2011) (0)
Improving Programmer Productivity on Heterogeneous GPU Computing Systems by Broadening and Strengthening the Tools Ecosystem (2012) (0)
Terascale to Petascale: The Past 17 Years in High Performance Computing (2017) (0)
Towards Enhancing Coding Productivity for GPU Programming Using Static Graphs (2022) (0)
Deffe (2020) (0)
Productive Hardware Designs using Hybrid HLS-RTL Development (2020) (0)
Advanced Application Support for Improved GPU Utilization on Keeneland (2014) (0)
Position Papers for the ASCR Workshop on Reimagining Codesign (2021) (0)
Leveraging Compiler-Based Translation to Evaluate a Diversity of Exascale Platforms (2022) (0)
Addressing Complex Memory for Exascale Systems and Applications (2018) (0)
Session details: Matrix product for special platforms (2008) (0)
Topic 2: Performance Evaluation (2004) (0)
Tuyere (2018) (0)
Understanding Performance Portability of Bioinformatics Applications in SYCL on an NVIDIA GPU (2022) (0)
Programming Systems on the Road to Exascale Computing (2012) (0)
Programming the EMU Architecture: Algorithm Design Considerations for Migratory-threads-based Systems (2018) (0)
Workshop on Modeling & Simulation of Systems and Applications (2014) (0)
SparseLU, A Novel Algorithm and Math Library for Sparse LU Factorization (2022) (0)
neCODEC: nearline data compression for scientific applications (2013) (0)
Chapter 7, Keeneland: Computational Science Using Heterogeneous GPU Computing (2013) (0)
Topic 2: Performance Prediction and Evaluation (2005) (0)
Evaluating Nonuniform Reduction in HIP and SYCL on GPUs (2022) (0)
2014 First Workshop on Accelerator Programming using Directives WACCPD 2014 Table of Contents (2014) (0)
AsHES Keynote (2014) (0)
LaRIS: Targeting Portability and Productivity for LAPACK Codes on Extreme Heterogeneous Systems by Using IRIS (2022) (0)
High-performance computing: Successes, failures, and future directions (2005) (0)
Design and analysis of CXL performance models for tightly-coupled heterogeneous computing (2022) (0)
A Portable and Heterogeneous LU Factorization on IRIS (2022) (0)
Local discovery of system architecture - application parameter sensitivity: an empirical technique for adaptive grid applications (2002) (0)
Computational Challenges in Nuclear Weapons Simulation (2003) (0)
MEPHESTO (2020) (0)
MAPredict: Static Analysis Driven Memory Access Prediction Framework for Modern CPUs (2022) (0)
Evaluating Unified Memory Performance in HIP (2022) (0)
Preparing for Supercomputing's Sixth Wave (2016) (0)
Preparing for the Future - Rethinking Proxy Applications (2022) (0)
High Performance Adaptive Physics Refinement to Enable Large-Scale Tracking of Cancer Cell Trajectory (2022) (0)
Keeneland: Computational Science Using Heterogeneous GPU Computing (2017) (0)
Development of a parallel spectral element code using SPMD constructs (1996) (0)
Enabling OpenACC programming on Multi-hybrid Accelerated with GPU and FPGA (2019) (0)
Workshop on Modeling and Simulation of Systems and Applications, August 13-14, 2014, University of Washington, Seattle (2014) (0)
2015 Salishan Final Program (2015) (0)
Ultra Low Latency Machine Learning for Scientific Edge Applications (2022) (0)
Combining Aspen with Massively Parallel Simulation for Effective Exascale Co-Design (2018) (0)
Comparing LLC-Memory Traffic between CPU and GPU Architectures (2021) (0)
Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes (2023) (0)
Techniques and optimizations for high performance computational steering (1998) (0)
Glyphmaker: An Interactive, Programmerless Approach for Customizing, Exploring, and Analyzing Visual Data Representations (1993) (0)
A Dynamic MPI Software Correctness Checking Tool (2005) (0)
Flacc: Towards OpenACC support for Fortran in the LLVM Ecosystem (2021) (0)
Modeling the Office of Science Ten Year Facilities Plan: The PERI Architecture Team (2009) (0)
A Study on Atomics-based Integer Sum Reduction in HIP on AMD GPU (2022) (0)
POSTER: Tango: An Optimizing Compiler for Just-In-Time RTL Simulation (2019) (0)
Encoding Integers and Rationals on Neuromorphic Computers using Virtual Neuron (2022) (0)
Performance and Scalability Analysis of Cray X1 Vectorization and Multistreaming Optimization (2005) (0)
A survey on processing-in-memory techniques: Advances and challenges (2022) (0)
Towards Exascale System: An Automatic Hardware Software Co-design Framework for Current and Future Architectures in Aspen (2016) (0)
Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, San Jose, CA, USA, June 5, 2011 (2011) (0)
Adrastea: An Efficient FPGA Design Environment for Heterogeneous Scientific Computing and Machine Learning (2022) (0)
Performance and Communication Modeling for Exascale Proxy Architecture in Aspen (2018) (0)

What Schools Are Affiliated With Jeffrey S. Vetter?

Jeffrey S. Vetter is affiliated with the following schools:

New Jersey Institute of Technology
