Jeffrey S. Vetter
#161,716
Most Influential Person Now
Jeffrey S. Vetter's AcademicInfluence.com Rankings
Jeffrey S. Vetter's Computer Science Degrees
Computer Science
#9359
World Rank
#9832
Historical Rank
Parallel Computing
#49
World Rank
#51
Historical Rank
Database
#6326
World Rank
#6558
Historical Rank

Computer Science
Jeffrey S. Vetter's Degrees
- PhD Computer Science Georgia Tech
- Masters Computer Science Georgia Tech
- Bachelors Computer Science University of Richmond
Why Is Jeffrey S. Vetter Influential?
Jeffrey S. Vetter's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- The International Exascale Software Project roadmap (2011) (735)
- The Scalable Heterogeneous Computing (SHOC) benchmark suite (2010) (631)
- An Overview of the BlueGene/L Supercomputer (2002) (572)
- A Survey of CPU-GPU Heterogeneous Computing Techniques (2015) (399)
- Autopilot: adaptive control of distributed applications (1998) (307)
- NVIDIA Tensor Core Programmability, Performance & Precision (2018) (242)
- A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems (2016) (221)
- Dynamic Software Testing of MPI Applications with Umpire (2000) (217)
- Communication characteristics of large-scale scientific applications for contemporary cluster architectures (2002) (205)
- Contemporary High Performance Computing - From Petascale toward Exascale (2019) (187)
- A Survey of Methods for Analyzing and Improving GPU Energy Efficiency (2014) (172)
- Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures (2010) (165)
- Statistical scalability analysis of communication operations in distributed applications (2001) (149)
- DESTINY: A tool for modeling emerging 3D NVM and eDRAM caches (2015) (134)
- A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-Volatile On-Chip Caches (2015) (132)
- The future of scientific workflows (2018) (132)
- Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms (2010) (130)
- Aspen: A domain specific language for performance modeling (2012) (124)
- Characterization of Scientific Workloads on Systems with Multi-Core Processors (2006) (122)
- Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community (2011) (118)
- Classifying soft error vulnerabilities in extreme-Scale scientific applications using a binary instrumentation tool (2012) (115)
- An annotated bibliography of interactive program steering (1994) (98)
- Early evaluation of IBM BlueGene/P (2008) (96)
- A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems (2016) (95)
- Performance characterization and optimization of parallel I/O on the Cray XT (2008) (92)
- Exploiting Lustre File Joining for Effective Collective IO (2007) (92)
- Opportunities for Nonvolatile Memory Systems in Extreme-Scale High-Performance Computing (2015) (89)
- Real-Time Performance Monitoring, Adaptive Control, and Interactive Steering of Computational Grids (2000) (84)
- High performance computational steering of physical simulations (1997) (84)
- Using FPGA Devices to Accelerate Biomolecular Simulations (2007) (83)
- An Empirical Performance Evaluation of Scalable Scientific Applications (2002) (81)
- Contemporary High Performance Computing: From Petascale toward Exascale (2013) (81)
- Identifying Opportunities for Byte-Addressable Non-Volatile Memory in Extreme-Scale Scientific Applications (2012) (78)
- Early evaluation of directive-based GPU programming models for productive exascale computing (2012) (78)
- Maestro: Data Orchestration and Tuning for OpenCL Devices (2010) (76)
- Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations (2015) (76)
- Early evaluation of the Cray XT3 (2006) (73)
- Progress: A Toolkit for Interactive Program Steering (1995) (70)
- A Survey of Techniques for Modeling and Improving Reliability of Computing Systems (2016) (70)
- Scalable Analysis Techniques for Microprocessor Performance Counter Metrics (2002) (69)
- OpenARC: open accelerator research compiler for directive-based, efficient heterogeneous computing (2014) (65)
- COMPASS: A Framework for Automated Performance Modeling and Prediction (2015) (64)
- Performance analysis of distributed applications using automatic classification of communication inefficiencies (1999) (60)
- The tradeoffs of fused memory hierarchies in heterogeneous computing architectures (2012) (60)
- Asserting Performance Expectations (2002) (59)
- Performance evaluation of the Cray X1 distributed shared memory architecture (2004) (56)
- Falcon: On‐line monitoring for steering parallel programs (1998) (56)
- Extreme Heterogeneity 2018 - Productive Computational Science in the Era of Extreme Heterogeneity: Report for DOE ASCR Workshop on Extreme Heterogeneity (2018) (55)
- From interactive applications to distributed laboratories (1998) (52)
- Dynamic statistical profiling of communication activity in distributed applications (2002) (51)
- Quantifying NUMA and contention effects in multi-GPU systems (2011) (51)
- Exploring hybrid memory for GPU energy efficiency through software-hardware co-design (2013) (51)
- OpenACC to FPGA: A Framework for Directive-Based High-Performance Reconfigurable Computing (2016) (51)
- DESTINY: A Comprehensive Tool with 3D and Multi-Level Cell Memory Modeling Capability (2017) (49)
- A Survey Of Techniques for Architecting DRAM Caches (2016) (47)
- Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach (2013) (47)
- Investigating the TLB Behavior of High-end Scientific Applications on Commodity Microprocessors (2008) (47)
- Accelerating scientific applications with the SRC-6 reconfigurable computer: methodologies and analysis (2005) (45)
- A Dynamic Tracing Mechanism for Performance Analysis of OpenMP Applications (2001) (45)
- DARPA's HPCS Program: History, Models, Tools, Languages (2008) (43)
- Falcon: On-line monitoring for steering parallel programs (1998) (41)
- AYUSH: A Technique for Extending Lifetime of SRAM-NVM Hybrid Caches (2015) (40)
- Accelerating S3D: A GPGPU Case Study (2009) (38)
- Falcon: On-line Monitoring and Steering of Parallel Programs (1995) (38)
- PapyrusKV: A High-Performance Parallel Key-Value Store for Distributed NVM Architectures (2017) (38)
- FASE: A Framework for Scalable Performance Prediction of HPC Systems and Applications (2007) (38)
- Accuracy and performance of graphics processors: A Quantum Monte Carlo application case study (2009) (37)
- FlexiWay: A cache energy saving technique using fine-grained cache reconfiguration (2013) (37)
- Algorithm-Directed Data Placement in Explicitly Managed Non-Volatile Memory (2016) (37)
- Quantitatively Modeling Application Resilience with the Data Vulnerability Factor (2014) (36)
- Managing Performance Analysis with Dynamic Statistical Projection Pursuit (2000) (36)
- ParColl: Partitioned Collective I/O on the Cray XT (2008) (35)
- Xen-Based HPC: A Parallel I/O Perspective (2008) (35)
- PCM-Based Durable Write Cache for Fast Disk I/O (2012) (34)
- Computational steering annotated bibliography (1997) (33)
- PANORAMA: An approach to performance modeling and diagnosis of extreme-scale workflows (2017) (32)
- A framework to develop symbolic performance models of parallel applications (2006) (32)
- OpenARC: Extensible OpenACC Compiler Framework for Directive-Based Accelerator Programming Study (2014) (31)
- Performance characterization of molecular dynamics techniques for biomolecular simulations (2006) (31)
- DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access (2018) (30)
- NVL-C: Static Analysis Techniques for Efficient, Correct Programming of Non-Volatile Main Memory Systems (2016) (30)
- Analysis of a Computational Biology Simulation Technique on Emerging Processing Architectures (2007) (29)
- LastingNVCache: A Technique for Improving the Lifetime of Non-volatile Caches (2014) (29)
- Evaluating Performance Portability of OpenACC (2014) (29)
- Architectures for the Post-Moore Era (2017) (28)
- Wide-area performance profiling of 10GigE and InfiniBand technologies (2008) (26)
- An Evaluation of the Oak Ridge National Laboratory Cray XT3 (2008) (26)
- An Integrated Performance Visualizer for MPI/OpenMP Programs (2001) (25)
- Models for computational steering (1996) (25)
- WriteSmoothing: improving lifetime of non-volatile caches using intra-set wear-leveling (2014) (24)
- On the Path to Exascale (2010) (23)
- ECP Software Technology Capability Assessment Report (2018) (22)
- Performance Analysis of Parallel Systems Approaches and Open Problems (2001) (22)
- Performance evaluation of the SGI Altix 3700 (2005) (22)
- GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method (2018) (21)
- Opportunities and Tools for Highly Interactive Distributed and Parallel Computing (1994) (20)
- Automated Characterization of Parallel Application Communication Patterns (2015) (19)
- Siena: Exploring the Design Space of Heterogeneous Memory Systems (2018) (19)
- Cray X1 Evaluation Status Report (2004) (19)
- Highly Efficient Compensation-Based Parallelism for Wavefront Loops on GPUs (2018) (19)
- EqualWrites: Reducing Intra-Set Write Variations for Enhancing Lifetime of Non-Volatile Caches (2016) (19)
- EqualChance: Addressing Intra-set Write Variation to Increase Lifetime of Non-volatile Caches (2014) (18)
- Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation (2017) (18)
- Efficient Quality Threshold Clustering for Parallel Architectures (2012) (17)
- Balancing productivity and performance on the cell broadband engine (2007) (17)
- Diagnosis and optimization of application prefetching performance (2013) (17)
- Toward an End-to-End Framework for Modeling, Monitoring and Anomaly Detection for Scientific Workflows (2016) (16)
- Efficiency Evaluation of Cray XT Parallel IO Stack (2007) (15)
- Empirical Analysis of a Large-Scale Hierarchical Storage System (2008) (15)
- Impact of multicores on large-scale molecular dynamics simulations (2008) (15)
- Juggler: a dependence-aware task-based execution framework for GPUs (2018) (15)
- Techniques for high-performance computational steering (1999) (15)
- Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (2015) (15)
- Application Characterization Using Oxbow Toolkit and PADS Infrastructure (2014) (14)
- Addressing Read-Disturbance Issue in STT-RAM by Data Compression and Selective Duplication (2017) (14)
- Quantifying Architectural Requirements of Contemporary Extreme-Scale Scientific Applications (2013) (14)
- Enabling a highly-scalable global address space model for petascale computing (2010) (14)
- Virtual Cluster Management with Xen (2007) (14)
- CLACC: Translating OpenACC to OpenMP in Clang (2018) (14)
- Performance of RDMA-capable storage protocols on wide-area network (2008) (14)
- Examining recent many-core architectures and programming models using SHOC (2015) (13)
- Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training (2018) (13)
- A Holistic Approach for Performance Measurement and Analysis for Petascale Applications (2009) (13)
- Experimental Analysis of InfiniBand Transport Services on WAN (2008) (13)
- TensorFlow Doing HPC (2019) (13)
- An Analysis of System Balance Requirements for Scientific Applications (2006) (12)
- A Technique for Improving Lifetime of Non-Volatile Caches Using Write-Minimization (2016) (11)
- Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs (2018) (11)
- Performance evaluation of the Cray X1 distributed shared-memory architecture (2005) (11)
- IPMI-based Efficient Notification Framework for Large Scale Cluster Computing (2006) (11)
- Reducing soft-error vulnerability of caches using data compression (2016) (11)
- Performance evaluation of high-speed interconnects using dense communication patterns (2005) (11)
- OPAL: An Open-Source MPI-IO Library over Cray XT (2007) (11)
- Characterizing the performance benefit of hybrid memory system for HPC applications (2018) (11)
- Contemporary High Performance Computing (2017) (11)
- CCAMP: An Integrated Translation and optimization Framework for OpenACC and OpenMP (2020) (10)
- Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing (2014) (10)
- Hierarchical Model Validation of Symbolic Performance Models of Scientific Kernels (2006) (10)
- Understanding Portability of a High-Level Programming Model on Contemporary Heterogeneous Architectures (2015) (10)
- IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism (2016) (9)
- Evaluation of UPC on the Cray X1 (2005) (9)
- Improving energy efficiency of embedded DRAM caches for high-end computing systems (2014) (9)
- Performance portability study for massively parallel computational fluid dynamics application on scalable heterogeneous architectures (2019) (9)
- Reliability Tradeoffs in Design of Volatile and Nonvolatile Caches (2016) (9)
- Characterizing the Impact of Prefetching on Scientific Application Performance (2013) (9)
- Automated Design Space Exploration with Aspen (2015) (9)
- HPC Interconnection Networks: The Key to Exascale Computing (2008) (8)
- An Exploration of Performance Attributes for Symbolic Modeling of Emerging Processing Devices (2007) (8)
- The Minos Computing Library: efficient parallel programming for extremely heterogeneous systems (2020) (8)
- An Evaluation of the ORNL Cray XT3 (2006) (8)
- BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis (2014) (7)
- AYUSH: Extending Lifetime of SRAM-NVM Way-Based Hybrid Caches Using Wear-Leveling (2015) (7)
- Designing Algorithms for the EMU Migrating-threads-based Architecture (2018) (7)
- Language-Based Optimizations for Persistence on Nonvolatile Main Memory Systems (2017) (7)
- Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File (2017) (7)
- Design and Implementation of Papyrus: Parallel Aggregate Persistent Storage (2017) (7)
- ORNL Cray X1 evaluation status report (2004) (7)
- Quartile and Outlier Detection on Heterogeneous Clusters Using Distributed Radix Sort (2011) (6)
- Performance Implications of Nonuniform Device Topologies in Scalable Heterogeneous Architectures (2011) (6)
- Architecting SOT-RAM Based GPU Register File (2017) (6)
- Sparse Matrix-Vector Multiplication Kernel on a Reconfigurable Computer (2005) (6)
- Synthetic Program Analysis with Aspen (2015) (6)
- Techniques for delayed binding of monitoring mechanisms to application-specific instrumentation points (1998) (6)
- Performance Technologies for Peta-Scale Systems: A White Paper Prepared by the Performance Evaluation Research Center and Collaborators (2003) (6)
- Design, implementation, and evaluation of transparent pNFS on Lustre (2009) (5)
- An OpenACC-based unified programming model for multi-accelerator systems (2015) (5)
- A Performance Measurement Infrastructure for Co-array Fortran (2005) (5)
- IRIS: A Portable Runtime System Exploiting Multiple Heterogeneous Programming Systems (2021) (5)
- GA-GPU: extending a library-based global address space programming model for scalable heterogeneous computing systems (2012) (5)
- Kernel-level single system image for petascale computing (2006) (5)
- Evaluating CUDA Portability with HIPCL and DPCT (2021) (5)
- In-Depth Optimization with the OpenACC-to-FPGA Framework on an Arria 10 FPGA (2020) (5)
- Evaluating high‐performance computers (2005) (5)
- MEPHESTO: Modeling Energy-Performance in Heterogeneous SoCs and Their Trade-Offs (2020) (4)
- BlackjackBench: portable hardware characterization (2011) (4)
- OpenACC Profiling Support for Clang and LLVM using Clacc and TAU (2020) (4)
- Intel Woodcrest: An Evaluation for Scientific Computing (2007) (4)
- RXIO: Design and implementation of high performance RDMA-capable GridFTP (2012) (4)
- Addressing Inter-set Write-Variation for Improving Lifetime of Non-Volatile Caches (2014) (4)
- Revolutionary technologies for acceleration of emerging petascale applications (2009) (4)
- Capturing Petascale Application Characteristics with the Sequoia Toolkit (2005) (4)
- Performance characteristics of biomolecular simulations on high-end systems with multi-core processors (2008) (4)
- Balancing FPGA Resource Utilities (2005) (4)
- Experiences using Computational Steering on Existing Scientific Applications (1999) (4)
- Toward Performance Portable Programming for Heterogeneous Systems on a Chip: A Case Study with Qualcomm Snapdragon SoC (2021) (4)
- Reimagining Codesign for Advanced Scientific Computing: Report for the ASCR Workshop on Reimaging Codesign (2021) (4)
- A Methodology for Developing High Fidelity Communication Models for Large-Scale Applications Targeted on Multicore Systems (2008) (4)
- Cooperative server clustering for a scalable GAS model on petascale cray XT5 systems (2010) (3)
- Scalable Tool Infrastructure for the Cray XT Using Tree-Based Overlay Networks (2009) (3)
- Glyphmaker: An Interactive, Programmerless Approach for Customizing Visual Data Representations (1993) (3)
- Improving DRAM Bandwidth Utilization with MLP-Aware OS Paging (2016) (3)
- Efficient Zero-Copy Noncontiguous I/O for Globus on InfiniBand (2010) (3)
- Implementing efficient data compression and encryption in a persistent key-value store for HPC (2019) (3)
- Performance Engineering: Understanding and Improving the Performance of Large-Scale Codes (2007) (3)
- On the road to Exascale: lessons from contemporary scalable GPU systems (2012) (3)
- EXIO: Enabling globus on rdma networks - a case study with InfiniBand (2010) (3)
- Deffe: a data-efficient framework for performance characterization in domain-specific computing (2020) (3)
- OpenMP Target Task: Tasking and Target Offloading on Heterogeneous Systems (2021) (3)
- Tuyere: enabling scalable memory workloads for system exploration (2018) (3)
- Evaluation of the Cray XT3 at ORNL: a Status Report (2006) (3)
- ASCR Report on a Quantum Computing Testbed for Science (2017) (3)
- CCAMP: OpenMP and OpenACC Interoperable Framework (2019) (3)
- Characterizing Applications on the Cray MTA-2 Multithreading Architecture (2006) (3)
- An Application Specific Memory Characterization Technique for Co-processor Accelerators (2007) (3)
- FLAME: Graph-based hardware representations for rapid and precise performance modeling (2019) (3)
- Analysis of GPU Data Access Patterns on Complex Geometries for the D3Q19 Lattice Boltzmann Algorithm (2021) (3)
- Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (open-source code) (2015) (3)
- Performance Issues in Parallel Processing Systems (2000) (3)
- DOE Advanced Scientific Advisory Committee (ASCAC): Workforce Subcommittee Letter (2014) (3)
- Understanding the Impact of Memory Access Patterns in Intel Processors (2020) (3)
- BlackjackBench: portable hardware characterization (2012) (3)
- Virtual Topologies for Scalable Resource Management and Contention Attenuation in a Global Address Space Model on the Cray XT5 (2011) (3)
- Initial characterization of parallel NFS implementations (2010) (3)
- Toward exascale computational science with heterogeneous processing (2010) (2)
- Evaluating the Performance and Portability of Contemporary SYCL Implementations (2020) (2)
- HiCOO: Hierarchical cooperation for scalable communication in Global Address Space programming models on Cray XT systems (2012) (2)
- Blue Gene/P: JUGENE (2013) (2)
- PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips (2021) (2)
- Preparing for the Future - Rethinking Proxy Apps (2022) (2)
- A framework for performance analysis of Co‐Array Fortran (2007) (2)
- LastingNVCache: Extending the Lifetime of Non-volatile Caches using Intra-set Wear-leveling (2014) (2)
- Exascale Hardware Architectures Working Group (2011) (2)
- Bridging HPC Communities through the Julia Programming Language (2022) (2)
- Evaluating the Viability of Application-Driven Cooperative CPU/GPU Fault Detection (2013) (2)
- Optimizations for language-directed computational steering (1999) (2)
- A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory (2013) (2)
- Static Graphs for Coding Productivity in OpenACC (2021) (2)
- A High Performance Programming Model for Large-Scale Molecular Dynamics Calculations on Reconfigurable Supercomputers (2005) (2)
- Modeling synthetic aperture radar computation with Aspen (2013) (2)
- Toward Evaluating High-Level Synthesis Portability and Performance between Intel and Xilinx FPGAs (2021) (2)
- Network-Friendly One-Sided Communication through Multinode Cooperation on Petascale Cray XT5 Systems (2011) (2)
- FITL: extending LLVM for the translation of fault-injection directives (2015) (2)
- Aspen-based performance and energy modeling frameworks (2017) (2)
- Performance evaluation of the cray XT3 configured with dual core opteron processors (2007) (2)
- Enhancing Monte Carlo proxy applications on GPUs (2019) (2)
- Can PCM Benefit GPU? Reconciling Hybrid Memory Design with GPU Massive Parallelism for Energy Efficiency (2013) (2)
- A Study of Power-Performance Modeling Using a Domain-Specific Language (2016) (2)
- IRIS-BLAS: Towards a Performance Portable and Heterogeneous BLAS Library (2022) (1)
- Evaluating the Performance of Integer Sum Reduction in SYCL on GPUs (2021) (1)
- Runtime Techniques to Enable a Highly-Scalable Global Address Space Model for Petascale Computing (2012) (1)
- TensorFlow Doing HPC An Evaluation of TensorFlow Performance in HPC Applications (2019) (1)
- Performance portability study of epistasis detection using SYCL on NVIDIA GPU (2022) (1)
- Computational Steering (1998) (1)
- Experiences with Computational Steering on Existing Scientific Applications (1999) (1)
- Hardware Evaluation Analytical Modeling and Node Simulation: Benefits of Tighter GPU Integration (2021) (1)
- Throughput Improvement of Molecular Dynamics Simulations Using Reconfigurable Computing (2001) (1)
- Early Evaluation of the Cray XT3 at ORNL (2005) (1)
- A Hierarchical Task Scheduler for Heterogeneous Computing (2021) (1)
- Performance Technologies for Peta-Scale Systems: A White Paper Prepared by the Performance Evaluation Research Center (2003) (1)
- Performance Portability in Extreme Scale Computing (Dagstuhl Seminar 17431) (2017) (1)
- neCODEC: nearline data compression for scientific applications (2014) (1)
- Preparing for extreme heterogeneity in high performance computing (2019) (1)
- Performance Metrics for High End Computing (2003) (1)
- Moving Heterogeneous GPU Computing into the Mainstream with Directive-Based, High-Level Programming Models (Position Paper) (2012) (1)
- Analyzing the suitability of contemporary 3D-stacked PIM architectures for HPC scientific applications (2019) (1)
- Asserting Performance Expectations (Formerly Performance Assertions: A Performance Diagnosis Tool) (2002) (1)
- SM-centric transformation: Circumventing hardware restrictions for flexible GPU scheduling (2014) (1)
- A framework for performance analysis of Co-Array Fortran: Research Articles (2007) (1)
- Juggler (2018) (1)
- KokkACC: Enhancing Kokkos with OpenACC (2022) (1)
- Memphis on an XT5: Pinpointing Memory Performance Problems on Cray Platforms (2011) (1)
- Exploring Emerging Technologies in the HPC Co-Design Space (2014) (1)
- Cash: A Single-Source Hardware-Software Codesign Framework for Rapid Prototyping (2020) (1)
- Distributed workflows for modeling experimental data (2017) (1)
- Prometheus: Coherent Exploration of Hardware and Software Optimizations Using Aspen (2018) (1)
- Virtual Neuron: A Neuromorphic Approach for Encoding Numbers (2022) (0)
- Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2015) (0)
- MTAAP Introduction (2011) (0)
- Special Issue: Selected Papers from Super Computing 2012 (2013) (0)
- Evaluating high-performance computers: Research Articles (2005) (0)
- Sensitivity Analysis of Biomolecular Simulations using Symbolic Models (2007) (0)
- Two Algorithms for Sorting On Heterogeneous Clusters (2012) (0)
- MTAAP 2010 Welcome (2010) (0)
- Integer Sum Reduction with OpenMP on an AMD MI100 GPU (2022) (0)
- A Memory Efficient Lock-Free Circular Queue (2021) (0)
- Evaluating the Performance of Integer Sum Reduction on an Intel GPU (2021) (0)
- Optimization with the OpenACC-to-FPGA framework on the Arria 10 and Stratix 10 FPGAs (2021) (0)
- Memphis on an XT5: Pinpointing Memory Performance Problems on Cray Platforms, Cray User Group 2011 Proceedings (2011) (0)
- Improving Programmer Productivity on Heterogeneous GPU Computing Systems by Broadening and Strengthening the Tools Ecosystem (2012) (0)
- Terascale to Petascale: The Past 17 Years in High Performance Computing (2017) (0)
- Towards Enhancing Coding Productivity for GPU Programming Using Static Graphs (2022) (0)
- Deffe (2020) (0)
- Productive Hardware Designs using Hybrid HLS-RTL Development (2020) (0)
- Advanced Application Support for Improved GPU Utilization on Keeneland (2014) (0)
- Position Papers for the ASCR Workshop on Reimagining Codesign (2021) (0)
- Leveraging Compiler-Based Translation to Evaluate a Diversity of Exascale Platforms (2022) (0)
- Addressing Complex Memory for Exascale Systems and Applications (2018) (0)
- Session details: Matrix product for special platforms (2008) (0)
- Topic 2: Performance Evaluation (2004) (0)
- Tuyere (2018) (0)
- Understanding Performance Portability of Bioinformatics Applications in SYCL on an NVIDIA GPU (2022) (0)
- Programming Systems on the Road to Exascale Computing (2012) (0)
- Programming the EMU Architecture: Algorithm Design Considerations for Migratory-threads-based Systems (2018) (0)
- Workshop on Modeling & Simulation of Systems and Applications (2014) (0)
- SparseLU, A Novel Algorithm and Math Library for Sparse LU Factorization (2022) (0)
- neCODEC: nearline data compression for scientific applications (2013) (0)
- Chapter 7. Keeneland: Computational Science Using Heterogeneous GPU Computing (2013) (0)
- Topic 2: Performance Prediction and Evaluation (2005) (0)
- Evaluating Nonuniform Reduction in HIP and SYCL on GPUs (2022) (0)
- 2014 First Workshop on Accelerator Programming using Directives WACCPD 2014 Table of Contents (2014) (0)
- AsHES Keynote (2014) (0)
- LaRIS: Targeting Portability and Productivity for LAPACK Codes on Extreme Heterogeneous Systems by Using IRIS (2022) (0)
- High-performance computing: Successes, failures, and future directions (2005) (0)
- Design and analysis of CXL performance models for tightly-coupled heterogeneous computing (2022) (0)
- A Portable and Heterogeneous LU Factorization on IRIS (2022) (0)
- Local discovery of system architecture - application parameter sensitivity: an empirical technique for adaptive grid applications (2002) (0)
- Computational Challenges in Nuclear Weapons Simulation (2003) (0)
- MEPHESTO (2020) (0)
- MAPredict: Static Analysis Driven Memory Access Prediction Framework for Modern CPUs (2022) (0)
- Evaluating Unified Memory Performance in HIP (2022) (0)
- Preparing for Supercomputing's Sixth Wave (2016) (0)
- Preparing for the Future - Rethinking Proxy Applications (2022) (0)
- High Performance Adaptive Physics Refinement to Enable Large-Scale Tracking of Cancer Cell Trajectory (2022) (0)
- Keeneland: Computational Science Using Heterogeneous GPU Computing (2017) (0)
- Development of a parallel spectral element code using SPMD constructs (1996) (0)
- Enabling OpenACC programming on Multi-hybrid Accelerated with GPU and FPGA (2019) (0)
- Workshop on Modeling and Simulation of Systems and Applications, August 13-14, 2014, University of Washington, Seattle (2014) (0)
- 2015 Salishan Final Program (2015) (0)
- Ultra Low Latency Machine Learning for Scientific Edge Applications (2022) (0)
- Combining Aspen with Massively Parallel Simulation for Effective Exascale Co-Design (2018) (0)
- Comparing LLC-Memory Traffic between CPU and GPU Architectures (2021) (0)
- Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes (2023) (0)
- Techniques and optimizations for high performance computational steering (1998) (0)
- Glyphmaker: An Interactive, Programmerless Approach for Customizing, Exploring, and Analyzing Visual Data Representations (1993) (0)
- A Dynamic MPI Software Correctness Checking Tool (2005) (0)
- Flacc: Towards OpenACC support for Fortran in the LLVM Ecosystem (2021) (0)
- Modeling the Office of Science Ten Year Facilities Plan: The PERI Architecture Team (2009) (0)
- A Study on Atomics-based Integer Sum Reduction in HIP on AMD GPU (2022) (0)
- POSTER: Tango: An Optimizing Compiler for Just-In-Time RTL Simulation (2019) (0)
- Encoding Integers and Rationals on Neuromorphic Computers using Virtual Neuron (2022) (0)
- Performance and Scalability Analysis of Cray X1 Vectorization and Multistreaming Optimization (2005) (0)
- A survey on processing-in-memory techniques: Advances and challenges (2022) (0)
- Towards Exascale System: An Automatic Hardware Software Co-design Framework for Current and Future Architectures in Aspen (2016) (0)
- Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, San Jose, CA, USA, June 5, 2011 (2011) (0)
- Adrastea: An Efficient FPGA Design Environment for Heterogeneous Scientific Computing and Machine Learning (2022) (0)
- Performance and Communication Modeling for Exascale Proxy Architecture in Aspen (2018) (0)
What Schools Are Affiliated With Jeffrey S. Vetter?
Jeffrey S. Vetter is affiliated with the following schools: