Richard Vuduc | Academic Influence

Richard Vuduc's AcademicInfluence.com Rankings

Richard Vuduc

Computer Science: World Rank #4721, Historical Rank #4984
Parallel Computing: World Rank #65, Historical Rank #67
Database: World Rank #8228, Historical Rank #8584

Computer Science

Richard Vuduc's Degrees

PhD Computer Science Stanford University
Masters Computer Science Stanford University
Bachelors Computer Science University of California, Berkeley

Why Is Richard Vuduc Influential?

According to Wikipedia, Richard Vuduc is a tenured professor of computer science at the Georgia Institute of Technology. His research lab, The HPC Garage, studies high-performance computing, scientific computing, parallel algorithms, modeling, and engineering. He is a member of the Association for Computing Machinery. As of 2022, Vuduc serves as Vice President of the SIAM Activity Group on Supercomputing. He has co-authored over 200 articles in peer-reviewed journals and conferences.

Richard Vuduc's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Optimization of sparse matrix-vector multiplication on emerging multicore platforms (2007) (826)
SUSTain (2018) (558)
OSKI: A Library of Automatically Tuned Sparse Matrix Kernels (2005) (556)
Model-driven autotuning of sparse matrix-vector multiply on GPUs (2010) (437)
Sparsity: Optimization Framework for Sparse Matrix Kernels (2004) (361)
Automatic performance tuning of sparse matrix kernels (2003) (288)
Self-Adapting Linear Algebra Algorithms and Software (2005) (234)
A performance analysis framework for identifying potential benefits in GPGPU applications (2012) (208)
A massively parallel adaptive fast-multipole method on heterogeneous architectures (2009) (184)
Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure (2005) (165)
Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures (2010) (165)
Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply (2002) (154)
A Roofline Model of Energy (2013) (144)
Falcon: fault localization in concurrent programs (2010) (140)
On the limits of GPU acceleration (2010) (136)
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications (2010) (133)
POET: Parameterized Optimizations for Empirical Tuning (2007) (131)
Statistical Models for Empirical Search-Based Performance Tuning (2004) (118)
When cache blocking of sparse matrix vector multiply works and why (2007) (116)
When Prefetching Works, When It Doesn’t, and Why (2012) (116)
Self-stabilizing iterative solvers (2013) (96)
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems (2009) (88)
Automated Empirical Optimization (2011) (84)
Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks (2014) (76)
Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply (2004) (71)
Performance evaluation of concurrent collections on high-performance multicore computing systems (2010) (69)
HiCOO: Hierarchical Storage of Sparse Tensors (2018) (68)
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures (2010) (65)
An input-adaptive and in-place approach to dense tensor-times-matrix multiply (2015) (63)
Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization (2009) (62)
SPARTan: Scalable PARAFAC2 for Large & Sparse Data (2017) (61)
Autotuning in High-Performance Computing Applications (2018) (60)
On the communication complexity of 3D FFTs and its implications for Exascale (2012) (59)
Communicating Software Architecture using a Unified Single-View Visualization (2007) (51)
Model-Driven Sparse CP Decomposition for Higher-Order Tensors (2017) (51)
Undifferentiated facial electromyography responses to dynamic, audio-visual emotion displays in individuals with autism spectrum disorders. (2013) (50)
A Distributed CPU-GPU Sparse Direct Solver (2014) (47)
Statistical Models for Automatic Performance Tuning (2001) (45)
A Unified Approach for Localizing Non-deadlock Concurrency Bugs (2012) (44)
SWAMI: a framework for collaborative filtering algorithm development and evaluation. (2000) (42)
Sparse Hierarchical Tucker Factorization and Its Application to Healthcare (2015) (41)
Direct N-body Kernels for Multicore Platforms (2009) (41)
CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems (2015) (41)
Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method (2010) (38)
Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU) (2012) (32)
Balance Principles for Algorithm-Architecture Co-Design (2011) (32)
Improving distributed memory applications testing by message perturbation (2006) (32)
Load-Balanced Sparse MTTKRP on GPUs (2019) (32)
Optimizing Sparse Tensor Times Matrix on Multi-core and Many-Core Architectures (2016) (31)
Image segmentation using fractal dimension (2002) (29)
SWAMI (poster session): a framework for collaborative filtering algorithm development and evaluation (2000) (29)
The Backstroke framework for source level reverse computation applied to parallel discrete event simulation (2011) (27)
A Distributed Kernel Summation Framework for General-Dimension Machine Learning (2012) (26)
A Theoretical Framework for Algorithm-Architecture Co-design (2013) (26)
Improving the energy efficiency of Big Cores (2014) (26)
Performance Modeling and Analysis of Cache Blocking in Sparse Matrix Vector Multiply (2004) (26)
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems (2018) (25)
Optimizing sparse tensor times matrix on GPUs (2019) (24)
A type theory for probability density functions (2012) (24)
Branch-Avoiding Graph Algorithms (2014) (23)
What GPU Computing Means for High-End Systems (2011) (22)
Efficient and effective sparse tensor reordering (2019) (22)
Performance Analysis and Tuning for General Purpose Graphics Processing Units (2012) (22)
A CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method (2014) (21)
The Optimized Sparse Kernel Interface (OSKI) Library User's Guide for Version 1.0.1h (2007) (20)
An Initial Characterization of the Emu Chick (2018) (19)
[Personal health]. (1969) (19)
Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW (2000) (18)
Parameterizing loop fusion for automated empirical tuning (2005) (18)
Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T Ax (2003) (17)
Annotating user-defined abstractions for optimization (2005) (17)
Performance Optimizations and Bounds for Sparse Symmetric Matrix-Multiple Vector Multiply (1985) (17)
Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T Ax (2003) (16)
Tool Support for Inspecting the Code Quality of HPC Applications (2007) (16)
A New Method for Program Inversion (2012) (15)
Modern Accelerator Technologies for Geographic Information Science (2013) (15)
Griffin: grouping suspicious memory-access patterns to improve understanding of concurrency bugs (2013) (15)
Temporal phenotyping of medically complex children via PARAFAC2 tensor factorization (2019) (15)
A Sparse Direct Solver for Distributed Memory Xeon Phi-Accelerated Systems (2015) (15)
SUSTain: Scalable Unsupervised Scoring for Tensors and its Application to Phenotyping (2018) (15)
Efficient Communications in Training Large Scale Neural Networks (2016) (15)
A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices (2018) (14)
Techniques for specifying bug patterns (2007) (13)
A supernodal all-pairs shortest path algorithm (2020) (13)
Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines (2017) (13)
Brief announcement: towards a communication optimal fast multipole method and its implications at exascale (2012) (13)
Optimizing the computation of n-point correlations on large-scale astronomical data (2012) (12)
A Brief History and Introduction to GPGPU (2013) (12)
Fast sensitivity computations for trajectory optimization (2010) (12)
GraSP: distributed streaming graph partitioning (2015) (12)
Algorithmic Skeletons (2011) (11)
Hybrid Dynamic Trees for Extreme-Resolution 3D Sparse Data Modeling (2016) (11)
Polyadic Regression and its Application to Chemogenomics (2017) (10)
How much (execution) time and energy does my algorithm cost? (2013) (10)
A GPU-parallel construction of volumetric tree (2015) (10)
Methods for High-Throughput Computation of Elementary Functions (2013) (9)
Programming Strategies for Irregular Algorithms on the Emu Chick (2018) (9)
A massively parallel adaptive fast multipole method on heterogeneous architectures (2012) (9)
CUP: Cluster Pruning for Compressing Deep Neural Networks (2019) (9)
A Microbenchmark Characterization of the Emu Chick (2018) (8)
A Graphical Approach for Freeform Surface Offsetting With GPU Acceleration for Subtractive 3D Printing (2016) (8)
Prospects for scalable 3D FFTs on heterogeneous exascale systems (2011) (8)
Statistical Modeling of Feedback Data in an Automatic Tuning System (2000) (8)
Adaptive Deep Path: Efficient Coverage of a Known Environment under Various Configurations (2019) (8)
CA-SVM: Communication-Avoiding Support Vector Machines on Clusters (2016) (7)
Applying the concurrent collections programming model to asynchronous parallel dense linear algebra (2010) (7)
A Self-Correcting Connected Components Algorithm (2016) (7)
A Wavelet Collocation Method for Solving PDEs (2001) (6)
Analyzing the Energy Efficiency of the Fast Multipole Method Using a DVFS-Aware Energy Model (2016) (6)
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming (2013) (6)
Understanding the design trade-offs among current multicore systems for numerical computations (2009) (6)
International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014, New Orleans, LA, USA, November 16-21, 2014 (2014) (6)
A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems (2019) (5)
Automatic Generation of High-Performance FFT Kernels on Arm and X86 CPUs (2020) (5)
Distributed-Memory Parallel Symmetric Nonnegative Matrix Factorization (2020) (4)
A communication-avoiding 3D sparse triangular solver (2019) (4)
Step Ring Based 3D Path Planning via GPU Simulation for Subtractive 3D Printing (2016) (4)
Synthesizing Loops for Program Inversion (2012) (4)
Spatter: A Benchmark Suite for Evaluating Sparse Access Patterns (2018) (4)
Support for Whole-Program Analysis and the Verification of the One-Definition Rule in C++ (2006) (4)
An Extensible Open-Source Compiler Infrastructure for Testing (2005) (4)
Modeling the Power Variability of Core Speed Scaling on Homogeneous Multicore Systems (2017) (4)
Intrepydd: performance, productivity, and portability for data science application kernels (2020) (3)
Atomic Operations (2011) (3)
Scalable Knowledge Graph Analytics at 136 Petaflop/s (2020) (3)
Auto-Tuning Distributed-Memory 3-Dimensional Fast Fourier Transforms on the Cray XT4 (2009) (3)
Modeling and Analysis for Performance and Power (2012) (3)
Step Ring-Based Three-Dimensional Path Planning Via Graphics Processing Unit Simulation for Subtractive Three-Dimensional Printing (2017) (2)
Analyzing and Visualizing Whole Program Architectures (2007) (2)
Toward interactive statistical modeling (2010) (2)
Spatter: A Tool for Evaluating Gather/Scatter Performance (2018) (2)
Wanted: Floating-Point Add Round-off Error instruction (2016) (2)
An interface for a self-optimizing sparse matrix kernel library (2005) (2)
Evaluating Gather and Scatter Performance on CPUs and GPUs (2020) (2)
Sustainable Software Development for Next-Gen Sequencing (NGS) Bioinformatics on Emerging Platforms (2013) (2)
A distributed kernel summation framework for general‐dimension machine learning (2014) (2)
Toward a Theory of Algorithm-Architecture Co-design (2012) (2)
Architectural Visualization of C/C++ Source Code for Program Comprehension (2006) (1)
Communication-avoiding kernel ridge regression on parallel and distributed systems (2021) (1)
Proceedings of the First International Workshop on Post Moore's Era Supercomputing (2016) (1)
An Energy-Efficient Single-Source Shortest Path Algorithm (2018) (1)
Nimble GNN Embedding with Tensor-Train Decomposition (2022) (1)
Communication-Optimal Parallel N-body Solvers (2012) (1)
POET: Parameterized Optimization for Empirical Tuning (2007) (1)
Polyadic Regression and its Application to Chemogenomics: Supplementary Material (Ioakeim Perros) (2017) (1)
CA-SVM: Communication-Avoiding Parallel Support Vector Machines on Distributed Systems (2015) (1)
Numerical Algorithms with Tunable Parallelism (2008) (1)
Courses in High-performance Computing for Scientists and Engineers (2012) (1)
Characterizing Application Runtime Behavior from System Logs and Metrics (2011) (1)
Faster parallel collision detection at high resolution for CNC milling applications (2019) (1)
A GPU-Accelerated Freeform Surface Offsetting Method for High-Resolution Subtractive 3D Printing (Machining) (2018) (1)
Scalable All-pairs Shortest Paths for Huge Graphs on Multi-GPU Clusters (2020) (1)
An interface for multidimensional arrays in Arkouda (2021) (1)
Optimizations & Bounds for Sparse Symmetric Matrix-Vector Multiply (2004) (0)
Online model swapping for architectural simulation (2020) (0)
Introduction for Special Issue on Autotuning (2013) (0)
Ab Initio Molecular Dynamics (2011) (0)
“Smarter” NICs for faster molecular dynamics: a case study (2022) (0)
Furious.js: a Model for Offloading Compute-Intensive JavaScript Applications (2015) (0)
Comprehending Software Architecture using a Single-View Visualization (2007) (0)
Superfluorescence in the presence of inhomogeneous broadening and relaxation (1997) (0)
Message from the IISWC 2015 General Co-Chairs (2015) (0)
HPPAC Workshop Introduction (2017) (0)
Self-stabilizing Connected Components (2019) (0)
Two Algorithms for Sorting On Heterogeneous Clusters (2012) (0)
A Simple Methodology for Computing Families of Algorithms (2018) (0)
Scalable Knowledge-Graph Analytics at 136 Petaflops/s – Data Readme (2020) (0)
Exaflops Biomedical Knowledge Graph Analytics (2022) (0)
A Simple Methodology for Computing Families of Algorithms: FLAME Working Note #87 (2018) (0)
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms (Lawrence Berkeley National Laboratory) (2008) (0)
ORCA: Outlier detection and Robust Clustering for Attributed graphs (2021) (0)
Automated Performance Tuning (2011) (0)
Is it Nemo or Dory? Fast and accurate object detection for IoT and edge devices (2021) (0)
Parameterization and Search-space Exploitation of Loop Fusion (2005) (0)
Max orientation coverage: efficient path planning to avoid collisions in the CNC milling of 3D objects (2020) (0)
Unconventional wisdom in multicore computing (2010) (0)
Fast Sensitivity Computations for Trajectory Optimization (AAS 09-337) (2009) (0)
ParaGraph: An application-simulator interface and toolkit for hardware-software co-design (2022) (0)
SPARTan (2017) (0)
The Sixth International Workshop on Automatic Performance Tuning (iWAPT2011) (2011) (0)
Jack, The Autotuner (2022) (0)
2 Summary of Initial Proposal; 3.1 Concurrent Collections (CnC): a New Programming Model for HPC; 3.3 Tunable "Fast-and-loose" Synchronization (2010) (0)
Critique of “MemXCT: Memory-Centric X-Ray CT Reconstruction With Massive Parallelization” by SCC Team From Georgia Tech (2022) (0)
Performance Optimizations and Bounds for Sparse Matrix Kernels (2002) (0)
Recovery of superfluorescence in inhomogeneously broadened systems through rapid relaxation (1997) (0)
Algorithms and software with tunable parallelism (2010) (0)

Other Resources About Richard Vuduc

en.wikipedia.org
