Robert van de Geijn
#39,372
Most Influential Person Now
American computer scientist
Robert van de Geijn's AcademicInfluence.com Rankings
Robert van de Geijn's Computer Science Degrees
Computer Science
#2490
World Rank
#2600
Historical Rank
Numerical Analysis
#15
World Rank
#16
Historical Rank
Database
#7839
World Rank
#8154
Historical Rank
Computer Science
Robert van de Geijn's Degrees
- PhD Computer Science University of California, Berkeley
- Masters Computer Science University of California, Berkeley
- Bachelors Computer Science University of California, Berkeley
Why Is Robert van de Geijn Influential?
According to Wikipedia, Robert A. van de Geijn is a Professor of Computer Sciences at the University of Texas at Austin. He received his B.S. in Mathematics and Computer Science from the University of Wisconsin–Madison and his Ph.D. in Applied Mathematics from the University of Maryland, College Park. His areas of interest include numerical analysis and parallel processing.
Robert van de Geijn's Published Works
- Anatomy of high-performance matrix multiplication (2008) (690)
- SUMMA: scalable universal matrix multiplication algorithm (1995) (519)
- High-performance implementation of the level-3 BLAS (2008) (352)
- BLIS: A Framework for Rapidly Instantiating BLAS Functionality (2015) (256)
- Elemental: A New Framework for Distributed Memory Dense Matrix Computations (2013) (251)
- Collective communication: theory, practice, and experience (2007) (234)
- Using PLAPACK - parallel linear algebra package (1997) (226)
- A fast solution method for three‐dimensional many‐particle problems of linear elasticity (1998) (165)
- Programming matrix algorithms-by-blocks for thread-level parallelism (2009) (158)
- Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures (2007) (142)
- Parallel out-of-core computation and updating of the QR factorization (2005) (126)
- Solving dense linear systems on platforms with multiple hardware accelerators (2009) (122)
- SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks (2008) (116)
- Parallel Solution of Integral Equation-Based EM Problems in the Frequency Domain (2009) (104)
- The libflame Library for Dense Matrix Computations (2009) (97)
- BLAS (Basic Linear Algebra Subprograms) (2011) (93)
- A Note On Parallel Matrix Inversion (2000) (92)
- Families of algorithms related to the inversion of a Symmetric Positive Definite matrix (2008) (89)
- A Parallel Eigensolver for Dense Symmetric Matrices Based on Multiple Relatively Robust Representations (2005) (87)
- Broadcasting on Meshes with Wormhole Routing (1996) (87)
- Global combine on mesh architectures with wormhole routing (1993) (85)
- Representing linear algebra algorithms in code: the FLAME application program interfaces (2005) (84)
- Two Dimensional Basic Linear Algebra Communication Subprograms (1993) (83)
- Distributed memory matrix-vector multiplication and conjugate gradient algorithms (1993) (81)
- Updating an LU Factorization with Pivoting (2008) (79)
- Scalability Issues Affecting the Design of a Dense Linear Algebra Library (1994) (77)
- Reduction to condensed form for the eigenvalue problem on distributed memory architectures (1992) (73)
- PLAPACK Parallel Linear Algebra Package Design Overview (1997) (72)
- Collective communication: theory, practice, and experience: Research Articles (2007) (70)
- Parallelizing the QR Algorithm for the Unsymmetric Algebraic Eigenvalue Problem: Myths and Reality (1996) (70)
- Building a high-performance collective communication library (1994) (69)
- Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures (2012) (64)
- A High Performance Parallel Strassen Implementation (1995) (61)
- The BLIS Framework (2016) (60)
- Collective communication on architectures that support simultaneous communication over multiple links (2006) (60)
- Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures (2008) (59)
- On optimizing collective communication (2004) (58)
- A Pipelined Broadcast for Multidimensional Meshes (1995) (56)
- A parallel multifrontal algorithm and its implementation (1997) (54)
- Fast Collective Communication Libraries, Please (1995) (53)
- Optimal Broadcasting in Mesh-Connected Architectures (1991) (52)
- Accumulating Householder transformations, revisited (2006) (50)
- Mechanical derivation and systematic analysis of correct linear algebra algorithms (2006) (48)
- Fault-tolerant high-performance matrix multiplication: theory and practice (2001) (48)
- Massively parallel computation for acoustical scattering problems using boundary element methods (1996) (46)
- Designing Linear Algebra Algorithms by Transformation: Mechanizing the Expert Developer (2012) (45)
- An API for Manipulating Matrices Stored by Blocks (2004) (45)
- Formal derivation of algorithms: The triangular sylvester equation (2003) (44)
- Satisfying your dependencies with SuperMatrix (2007) (42)
- Householder QR Factorization With Randomization for Column Pivoting (HQRRP) (2015) (42)
- LAPACK for Distributed Memory Architectures: Progress Report (1991) (40)
- The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations (2012) (39)
- Improving the performance of reduction to Hessenberg form (2006) (38)
- LAPACK Working Note 37: Two Dimensional Basic Linear Algebra Communication Subprograms (1991) (34)
- Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC (2012) (32)
- A high-performance, low-power linear algebra core (2011) (31)
- Level-3 BLAS on the TI C6678 Multi-core DSP (2012) (31)
- Deferred shifting schemes for parallel QR methods (1993) (30)
- POOCLAPACK: Parallel Out-of-Core Linear Algebra Package (1999) (30)
- Parallel out-of-core cholesky and QR factorizations with POOCLAPACK (2001) (28)
- Application of massively parallel computation to integral equation models of electromagnetic scattering (1994) (28)
- CollMark: MPI Collective Communication Benchmark (2000) (28)
- A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures (2012) (28)
- Parallel Matrix Distributions: Have we been doing it all wrong? (1995) (27)
- Toward Scalable Matrix Multiply on Multithreaded Architectures (2007) (26)
- Global Combine Algorithms for 2-D Meshes with Wormhole Routing (1995) (26)
- Introducing: The Libflame Library for Dense Matrix Computations (2009) (25)
- Rapid Development of High-Performance Out-of-Core Solvers (2004) (25)
- BLIS: A Framework for Generating BLAS-like Libraries FLAME Working (2012) (25)
- Scalable parallelization of FLAME code via the workqueuing model (2008) (24)
- Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance (2014) (24)
- Anatomy of a Parallel Out-of-Core Dense Linear Solver (1995) (23)
- Sparse direct factorizations through unassembled hyper-matrices (2010) (21)
- Level-3 BLAS on a GPU: Picking the low hanging fruit (2012) (20)
- Managing the complexity of lookahead for LU factorization with pivoting (2010) (20)
- Implementing the qr-algorithm on an array of processors (1987) (18)
- Performance and Scalability of Finite Element Analysis for Distributed Parallel Computation (1994) (18)
- A case study in mechanically deriving dense linear algebra code (2013) (18)
- Families of Algorithms for Reducing a Matrix to Condensed Form (2012) (18)
- Efficient Communication Primitives on Mesh Architectures with Hardware Routing (1993) (17)
- A Linear Algebra Core Design for Efficient Level-3 BLAS (2012) (17)
- On the Efficiency of Global Combine Algorithms for 2-D Meshes With Wormhole Routing (1993) (16)
- Code Generation and Optimization of Distributed-Memory Dense Linear Algebra Kernels (2013) (16)
- Goal-Oriented and Modular Stability Analysis (2011) (16)
- Programming Algorithms-by-Blocks for Matrix Computations on Multithreaded Architectures FLAME Working Note # 29 (2008) (15)
- Storage Schemes for Parallel Eigenvalue Algorithms (1988) (15)
- Retargeting PLAPACK to clusters with hardware accelerators (2010) (15)
- Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications (2005) (14)
- Out-of-Core Computation of the QR Factorization on Multi-core Processors (2009) (14)
- A Parallel Linear Algebra Server for Matlab-like Environments (1998) (14)
- Deriving dense linear algebra libraries (2013) (13)
- Towards mechanical derivation of Krylov solver libraries (2010) (13)
- On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators (2012) (13)
- Implementation of Out-of-Core Cholesky and QR Factorizations with POOCLAPACK (2000) (13)
- Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator (2014) (13)
- Programming many‐core architectures ‐ a case study: dense matrix computations on the Intel single‐chip cloud computer processor (2012) (13)
- Solving “large” dense matrix problems on multi-core processors (2009) (12)
- Making Programming Synonymous with Programming for Linear Algebra Libraries FLAME Working Note # 31 (2008) (11)
- High performance dense linear algebra on a spatially distributed processor (2008) (11)
- Floating Point Architecture Extensions for Optimized Matrix Factorization (2013) (11)
- Design of scalable dense linear algebra libraries for multithreaded architectures: the LU factorization (2008) (11)
- Understanding performance stairs: elucidating heuristics (2014) (10)
- LAPACK Working Note 96: Scalable Universal Matrix Multiplication Algorithm (1995) (10)
- A block Jacobi method on a mesh of processors (1997) (10)
- SuperMatrix for the Factorization of Band Matrices FLAME Working Note # 27 (2007) (8)
- Design and Scheduling of an Algorithm-by-Blocks for the LU Factorization on Multithreaded Architectures FLAME Working Note # 26 (2007) (8)
- Retargeting PLAPACK to Clusters with Hardware Accelerators FLAME Working Note # 42 (2010) (8)
- Using desktop computers to solve large-scale dense linear algebra problems (2011) (8)
- BLIS: A Framework for Rapid Instantiation of BLAS Functionality (2013) (7)
- Parallel performance and scalability for block preconditioned finite element (p) solution of viscous flow (1995) (7)
- LAPACK for Distributed Memory Architectures: The Next Generation (1993) (7)
- Scheduling algorithms‐by‐blocks on small clusters (2013) (7)
- Massively Parallel Linpack Benchmark on the Intel Touchstone Delta and iPSC/860 Systems (Progress Report) (1991) (7)
- An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization (2008) (6)
- Dense Matrix Computation on a Heterogenous Architecture: A Block Synchronous Approach (2012) (6)
- Efficient Matrix Inversion via Gauss-Jordan Elimination and Its Parallelization (1998) (6)
- Parallel Cholesky factorization of a block tridiagonal matrix (2002) (6)
- Automatic Derivation of Linear Algebra Algorithms with Application to Control Theory (2004) (6)
- Specialized Parallel Algorithms for Solving Lyapunov and Stein Equations (2001) (6)
- Parallelizing FLAME Code with OpenMP Task Queues (2004) (6)
- Using Graphics Processors to Accelerate the Solution of Out-of-Core Linear Systems (2009) (6)
- THE SCIENCE OF DERIVING STABILITY ANALYSES (2008) (6)
- Exploiting the symmetry on the Jacobi method on a mesh of processors (1996) (6)
- DxTer: An Extensible Tool for Optimal Dataflow Program Generation (2015) (5)
- Unleashing DSPs for General-Purpose HPC FLAME Working Note # 61 (2012) (5)
- Programming Many-Core Architectures - A Case Study: Dense Matrix Computations on the Intel SCC Processor FLAME Working Note # 55 (2011) (5)
- High-performance up-and-downdating via householder-like transformations (2011) (5)
- LAPACK Working Note 30: Reduction to Condensed Form for the Eigenvalue Problem on Distributed Memory Architectures (1991) (5)
- Basic Linear Algebra Communication Subprograms (1991) (5)
- LAPACK Working Note 79: Parallelizing the QR Algorithm for the Unsymmetric Algebraic Eigenvalue Problem: Myths and Reality (1994) (4)
- Transforming linear algebra libraries: From abstraction to parallelism (2010) (4)
- Supporting Mixed-domain Mixed-precision Matrix Multiplication within the BLIS Framework (2021) (4)
- Householder QR Factorization: Adding Randomization for Column Pivoting. FLAME Working Note #78 (2015) (4)
- The Science of Programming High-Performance Linear Algebra Libraries (2007) (4)
- Parallel Solution of Selected Problems in Control Theory (1998) (4)
- Theory and Practice of Fusing Loops when Optimizing Parallel Dense Linear Algebra Operations FLAME Working Note # 64 (2012) (3)
- Power-aware Dense Linear Algebra Implementations on Multi-core and Many-core Processors (2011) (3)
- Fast Parallel Kernels for Selected Problems in Control Theory (1999) (3)
- High performance computational kernels for selected segments of a p finite element code (1995) (3)
- A Novel Storage Scheme for Parallel Jacobi Methods (1988) (3)
- DSLs, DLA, DxT, and MDE in CSE (2013) (3)
- Dense linear solve on the Intel Touchstone DELTA System (1992) (3)
- Proof-Driven Derivation of Krylov Solver Libraries (2010) (3)
- Rapid Development of High-Performance Linear Algebra Libraries (2004) (3)
- Beautiful Parallel Code: Evolution vs. Intelligent Design (2008) (2)
- The Spike Factorization as Domain Decomposition Method; Equivalent and Variant Approaches (2012) (2)
- Interfaces are key (2013) (2)
- Out-of-core solution of linear systems on graphics processors (2009) (2)
- Transforming Linear Algebra Libraries: From Abstraction to High Performance (2008) (2)
- Formal Correctness and Stability of Dense Linear Algebra Algorithms (2005) (2)
- Mechanizing the expert dense linear algebra developer (2012) (2)
- Algorithms for Reducing a Matrix to Condensed Form FLAME Working Note #53 (2012) (2)
- Code Generation to Aid Parallel Code Development (2014) (1)
- Collective Communication: Theory, Practice, and Experience FLAME Working Note # 22 (2006) (1)
- Application Interface to Parallel Dense Matrix Libraries: Just let me solve my problem! (2006) (1)
- Notes on the Symmetric QR Algorithm (2014) (1)
- Automation in Dense Linear Algebra (2008) (1)
- A Run-Time System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures FLAME Working Note # 43 (2010) (1)
- Towards a High-Performance, Low-Power Linear Algebra Processor (2010) (1)
- Beautiful Parallel Code: Evolution vs. Intelligent Design FLAME Working Note # 34 (2008) (1)
- A parallel multifrontal algorithm and its implementation, Computer Methods in Applied Mechanics and Engineering (0)
- Notes on Vector and Matrix Norms (2014) (0)
- Notes on Eigenvalue Problem (2010) (0)
- Formal Derivation of LU Factorization with Pivoting (2023) (0)
- TR-07-02 Applying Formal Derivation Techniques to Krylov Subspace Methods (2009) (0)
- UPDATING AN LU FACTORIZATION AND ITS APPLICATION TO SCALABLE OUT-OF-CORE COMPUTATION (2005) (0)
- Authors’ Biographies/Index (2022) (0)
- Machine independent parallel numerical algorithms (1989) (0)
- The science of programming dense linear algebra libraries (2007) (0)
- Deriving Linear Algebra Libraries FLAME Working Note # 57 (2011) (0)
- A JACOBI METHOD BY BLOCKS ON A MESH OF PROCESSORS (1997) (0)
- Foundations of Programming Linear Algebra Algorithms on SMP and Multicore Systems (2006) (0)
- The Rigorous Calculation of the Covariance Matrix for Arbitrarily Large Inverse Problems (2007) (0)
- An asymptotically 100% efficient parallel implementation of the nonsymmetric QR algorithm (1990) (0)
- Measuring and modelling the temperature profile during biomass fixed bed combustion with special attention on the release of species from the fuel bed (2001) (0)
- HIGH-PERFORMANCE AND PARALLEL INVERSION OF A SYMMETRIC POSITIVE DEFINITE MATRIX (2005) (0)
- Notes on Numerical Stability (2014) (0)
- Supporting mixed-datatype matrix multiplication within the BLIS framework (2019) (0)
- Notes on Eigenvalues and Eigenvectors (2014) (0)
- Parallel MoM Using Higher Order Basis Functions and PLAPACK Out-of-Core Solver for a Challenging Vivaldi Array (2008) (0)
- Code Generation of Optimized Distributed-Memory Dense Linear Algebra Kernels (2012) (0)
- Attaining higher performance in collective communication (2004) (0)
- Application Driven Fast Summation Methods (1999) (0)
- Updating an LU factorization with Pivoting FLAME Working Note # 21 (2006) (0)
- Deriving Correct High-Performance Algorithms FLAME Working Note (2017) (0)
- TR-07-02 Sparse Direct Factorizations through Unassembled Hyper-Matrices (2007) (0)
- TR-1002 Proof-Driven Derivation of Krylov Solver Libraries (2010) (0)
- REPRESENTATIONS FOR THE THEORY AND PRACTICE OF HIGH-PERFORMANCE DENSE LINEAR ALGEBRA ALGORITHMS (2007) (0)
- A FAST SOLUTION METHOD FOR THREE-DIMENSIONAL MANY-PARTICLE PROBLEMS OF LINEAR ELASTICITY (1998) (0)
- GEMMFIP: Unifying GEMM in BLIS (2023) (0)
- General/Program Co-Chairs: (2008) (0)
- Notes on the Singular Value Decomposition (2014) (0)