Michael W. M. Mahoney

Michael W. M. Mahoney's AcademicInfluence.com Rankings

Computer Science: World Rank #5571, Historical Rank #5884
Numerical Analysis: World Rank #28, Historical Rank #30
Machine Learning: World Rank #1531, Historical Rank #1553

Mathematics: World Rank #6190, Historical Rank #8622
Linear Algebra: World Rank #10, Historical Rank #12
Applied Mathematics: World Rank #202, Historical Rank #222
Measure Theory: World Rank #1188, Historical Rank #1514


Michael W. M. Mahoney's Degrees

PhD Applied Mathematics California Institute of Technology
Bachelors Mathematics California Institute of Technology



Michael W. M. Mahoney's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Each entry is listed as: title (publication year) (citation count).

A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions (2000) (1879)
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters (2008) (1806)
Empirical comparison of algorithms for network community detection (2010) (1030)
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning (2005) (987)
Statistical properties of community structure in large social and information networks (2008) (962)
Randomized Algorithms for Matrices and Data (2011) (898)
CUR matrix decompositions for improved data analysis (2009) (714)
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix (2006) (534)
Fast approximation of matrix coherence and statistical leverage (2011) (460)
Relative-Error CUR Matrix Decompositions (2007) (450)
Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication (2006) (443)
Faster least squares approximation (2007) (435)
An improved approximation algorithm for the column subset selection problem (2008) (374)
Revisiting the Nystrom Method for Improved Large-scale Machine Learning (2013) (362)
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT (2019) (337)
Sampling algorithms for l2 regression and applications (2006) (333)
Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition (2006) (314)
Diffusion constant of the TIP5P model of liquid water (2001) (303)
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations (2007) (292)
A statistical perspective on algorithmic leveraging (2013) (270)
Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression (2012) (257)
Fast Randomized Kernel Ridge Regression with Statistical Guarantees (2015) (251)
Feature selection methods for text classification (2007) (207)
Randomized Dimensionality Reduction for $k$-Means Clustering (2011) (204)
RandNLA: randomized numerical linear algebra (2016) (195)
Sampling algorithms and coresets for ℓp regression (2007) (176)
A Berkeley View of Systems Challenges for AI (2017) (171)
Newton-type methods for non-convex optimization under inexact Hessian information (2017) (160)
Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels (2014) (148)
Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks (2014) (148)
PyHessian: Neural Networks Through the Lens of the Hessian (2019) (144)
Unsupervised Feature Selection for the $k$-means Clustering Problem (2009) (144)
Tensor-CUR decompositions for tensor-based data (2006) (142)
LSRN: A Parallel Iterative Solver for Strongly Over- or Underdetermined Systems (2011) (140)
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries (2018) (127)
Unsupervised feature selection for principal components analysis (2008) (123)
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study (2017) (117)
A local spectral method for graphs: with applications to improving graph partitions and exploring data graphs locally (2009) (110)
Tree-Like Structure in Large Social and Information Networks (2013) (107)
A randomized algorithm for a tensor-based generalization of the singular value decomposition (2007) (106)
Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction (2019) (105)
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning (2018) (104)
Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods (2006) (98)
Sub-sampled Newton methods (2018) (98)
Sub-sampled Newton Methods with Non-uniform Sampling (2016) (94)
The Fast Cauchy Transform and Faster Robust Linear Regression (2012) (94)
Quantum, intramolecular flexibility, and polarizability effects on the reproduction of the density anomaly of liquid water by simple potential functions (2001) (93)
Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds (2017) (93)
GIANT: Globally Improved Approximate Newton Method for Distributed Optimization (2017) (89)
Sub-Sampled Newton Methods I: Globally Convergent Algorithms (2016) (87)
Fast Randomized Kernel Methods With Statistical Guarantees (2014) (84)
Sub-Sampled Newton Methods II: Local Convergence Rates (2016) (83)
On the Hyperbolicity of Small-World and Treelike Random Graphs (2012) (83)
Traditional and Heavy-Tailed Self Regularization in Neural Network Models (2019) (82)
A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares (2014) (78)
Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging (2017) (78)
Skip-Gram − Zipf + Uniform = Vector Additivity (2017) (70)
A high accuracy microwave ranging system for industrial applications (1993) (69)
On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent (2018) (65)
Forecasting Sequential Data using Consistent Koopman Autoencoders (2020) (64)
Lectures on Randomized Numerical Linear Algebra (2017) (64)
Exact expressions for double descent and implicit regularization via surrogate random design (2019) (60)
A local perspective on community structure in multilayer networks (2015) (58)
Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments (2015) (57)
Quantile Regression for Large-Scale Applications (2013) (56)
A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent (2020) (53)
Intra- and interpopulation genotype reconstruction from tagging SNPs. (2006) (52)
Multiplicative noise and heavy tails in stochastic optimization (2020) (51)
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior (2017) (50)
Random Laplace Feature Maps for Semigroup Kernels on Histograms (2014) (49)
PowerNorm: Rethinking Batch Normalization in Transformers (2020) (45)
Matrix factorizations at scale: A comparison of scientific data analytics in spark and C+MPI using three case studies (2016) (45)
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data (2020) (44)
Trust Region Based Adversarial Attack on Neural Networks (2018) (43)
CUR from a Sparse Optimization Viewpoint (2010) (40)
Using Local Spectral Methods to Robustify Graph-Based Learning Algorithms (2015) (40)
Implementing regularization implicitly via approximate eigenvector computation (2010) (40)
Inexact Nonconvex Newton-Type Methods (2018) (39)
Future Directions in Tensor-Based Computation and Modeling (2009) (39)
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods (2006) (37)
Capacity Releasing Diffusion for Speed and Locality (2017) (34)
A Simple and Strongly-Local Flow-Based Method for Cut Improvement (2016) (34)
Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions. (2015) (34)
Approximate computation and implicit regularization for very large-scale data analysis (2012) (34)
Anti-differentiating approximation algorithms: A case study with min-cuts, spectral, and flow (2014) (33)
Theatres of Struggle and the End of Apartheid (review) (2005) (32)
Weighted SGD for ℓp Regression with Randomized Preconditioning (2016) (32)
Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data (2019) (31)
Large batch size training of neural networks with adversarial training and second-order information (2018) (31)
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks (2019) (30)
Unified Acceleration Method for Packing and Covering Problems via Diameter Reduction (2015) (29)
Localization on low-order eigenvectors of data matrices (2011) (29)
Tree decompositions and social graphs (2014) (28)
Rapid Mixing of Several Markov Chains for a Hard-Core Model (2003) (28)
Effective Resistances, Statistical Leverage, and Applications to Linear Equation Solving (2010) (28)
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms (2020) (28)
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis (2010) (27)
Group Collaborative Representation for Image Set Classification (2019) (26)
OverSketched Newton: Fast Convex Optimization for Serverless Systems (2019) (25)
Newton-MR: Newton's Method Without Smoothness or Convexity (2018) (24)
Approximating a Gram Matrix for Improved Kernel-Based Learning (Extended Abstract) (2005) (23)
Lecture Notes on Randomized Linear Algebra (2016) (23)
rCUR: an R package for CUR matrix decomposition (2012) (23)
Open Problems in Data Streams, Property Testing, and Related Topics (2011) (21)
Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap (2018) (21)
Robust Regression on MapReduce (2013) (20)
Inefficiency of K-FAC for Large Batch Size Training (2019) (20)
Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity (2013) (19)
Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression (2019) (19)
Distributed estimation of the inverse Hessian by determinantal averaging (2019) (19)
Bayesian experimental design using regularized determinantal point processes (2019) (18)
Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nystrom Method (Extended Abstract) (2021) (18)
Signal Processing for Big Data [From the Guest Editors] (2014) (18)
Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization (2020) (18)
Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method (2015) (17)
Stochastic Dimensionality Reduction for K-means Clustering (2011) (17)
Structural Properties Underlying High-Quality Randomized Numerical Linear Algebra Algorithms (2016) (17)
Algorithmic and statistical challenges in modern large-scale data analysis are the focus of MMDS 2008 (2008) (17)
Optimal Subsampling Approaches for Large Sample Linear Regression (2015) (17)
Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning (2015) (17)
Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction (2017) (17)
LASAGNE: Locality and Structure Aware Graph Node Embedding (2017) (16)
Variational perspective on local graph clustering (2016) (16)
Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares (2015) (16)
An Optimization Approach to Locally-Biased Graph Algorithms (2016) (16)
Rethinking Batch Normalization in Transformers (2020) (16)
Maturation of Cerebellar Purkinje Cell Population Activity during Postnatal Refinement of Climbing Fiber Network. (2017) (16)
GPU Accelerated Sub-Sampled Newton's Method for Convex Classification Problems (2019) (15)
Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist (2018) (13)
Approximating Higher-Order Distances Using Random Projections (2010) (13)
Hessian Eigenspectra of More Realistic Nonlinear Models (2021) (13)
Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching (2019) (13)
Out-of-sample extension of graph adjacency spectral embedding (2018) (13)
Alchemist: An Apache Spark ⇔ MPI interface (2018) (13)
Structured Block Basis Factorization for Scalable Kernel Matrix Evaluation (2015) (12)
Empirical Evaluation of Graph Partitioning Using Spectral Embeddings and Flow (2009) (12)
Semi-supervised eigenvectors for large-scale locally-biased learning (2013) (12)
Sparse Quantized Spectral Clustering (2020) (12)
A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication (2017) (12)
Avoiding communication in primal and dual block coordinate descent methods (2016) (12)
MAPPING THE SIMILARITIES OF SPECTRA: GLOBAL AND LOCALLY-BIASED APPROACHES TO SDSS GALAXIES (2016) (12)
Regularized Laplacian Estimation and Fast Eigenvector Approximation (2011) (12)
Approximating the Solution to Mixed Packing and Covering LPs in parallel $\tilde{O}(\epsilon^{-3})$ time (2016) (11)
Approximating the Solution to Mixed Packing and Covering LPs in Parallel $\tilde{O}(\epsilon^{-3})$ Time (2016) (11)
Efficient Genomewide Selection of PCA‐Correlated tSNPs for Genotype Imputation (2011) (11)
Sparse sketches with small inversion bias (2020) (11)
Rapid estimation of electronic degrees of freedom in Monte Carlo calculations for polarizable models of liquid water (2001) (11)
Exploiting Optimization for Local Graph Clustering (2016) (11)
Discrete representations of the protein Cα chain (1997) (11)
Mining Large Graphs (2016) (11)
Improved guarantees and a multiple-descent curve for the Column Subset Selection Problem and the Nyström method (2020) (10)
A Multi-Platform Evaluation of the Randomized CX Low-Rank Matrix Factorization in Spark (2016) (10)
Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update (2021) (9)
Faster Parallel Solver for Positive Linear Programs via Dynamically-Bucketed Selective Coordinate Descent (2015) (9)
Parameter Re-Initialization through Cyclical Batch Size Schedules (2018) (8)
GPU Accelerated Sub-Sampled Newton's Method (2018) (8)
Sampling Sub-problems of Heterogeneous Max-cut Problems and Approximation Algorithms (2005) (8)
DCAR: A Discriminative and Compact Audio Representation for Audio Processing (2017) (7)
A Spectral Algorithm for Improving Graph Partitions (2009) (7)
JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks (2019) (7)
Statistical guarantees for local graph clustering (2019) (7)
Semi-supervised Eigenvectors for Locally-biased Learning (2012) (6)
Feature-distributed sparse regression: a screen-and-clean approach (2016) (6)
Newton-type methods for non-convex optimization under inexact Hessian information (2019) (6)
A Short Introduction to Local Graph Clustering Methods and Software (2018) (6)
A Discriminative and Compact Audio Representation for Event Detection (2016) (6)
Asymptotic Convergence Rate and Statistical Inference for Stochastic Sequential Quadratic Programming (2022) (6)
Lecture Notes on Spectral Graph Methods (2016) (6)
The Mathematics of Data (2018) (6)
Peer-Mediated Instruction and Activity Schedules: Tools for Providing Academic Support for Students With ASD (2019) (5)
Distributed Second-order Convex Optimization (2018) (5)
GACT: Activation Compressed Training for Generic Network Architectures (2022) (5)
Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence (2022) (4)
Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization (2017) (4)
A Spectral Algorithm for Improving Graph Partitions with Applications to Exploring Data Graphs Locally (2009) (4)
The Fast Cauchy Transform: with Applications to Basis Construction, Regression, and Subspace Approximation in L1 (2012) (4)
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data (2022) (4)
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers (2021) (4)
Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information (2021) (4)
terrainr: An R package for creating immersive virtual environments (2022) (4)
Learning with Spectral Kernels and Heavy-Tailed Data (2009) (3)
A Differential Geometry Perspective on Orthogonal Recurrent Models (2021) (3)
On the Hyperbolicity of Small-World Networks and Tree-Like Graphs (2012) (3)
Social Discrete Choice Models (2017) (3)
Randomized algorithms for matrices and massive data sets (2006) (3)
Newton-MR: Inexact Newton Method with minimum residual sub-problem solver (2018) (3)
Inexact Newton-CG algorithms with complexity guarantees (2021) (3)
Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks (2019) (3)
MMDS 2008: Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis are the Focus (2009) (2)
Workshop on Algorithms for Modern Massive Datasets (2006) (2)
Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems (2022) (2)
Good Classifiers are Abundant in the Interpolating Regime (2021) (2)
Parallel and Communication Avoiding Least Angle Regression (2019) (2)
Discrete representations of the protein C alpha chain. (1997) (2)
Stochastic Normalizing Flows (2020) (2)
Good linear classifiers are abundant in the interpolating regime (2020) (2)
On Linear Convergence of Weighted Kernel Herding (2019) (2)
Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings (2019) (2)
Supplementary Material: Large-scale community structure in social and information networks (2009) (1)
The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science (2019) (1)
Stochastic continuous normalizing flows: training SDEs as ODEs (2021) (1)
THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS (2017) (1)
GPU Accelerated Sub-Sampled Newton's Method (2018) (1)
DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection (2016) (1)
MMDS 2008: Algorithmic and Statistical Challenges in Modern Large-Scale Data Analysis, Part I (2009) (1)
Sampling subproblems of heterogeneous Max-Cut problems and approximation algorithms (2008) (1)
SIGACT news algorithms column: computation in large-scale scientific and internet data applications is a focus of MMDS 2010 (2010) (1)
Cornell University Autonomous Underwater Vehicle: Design and Implementation of the Argo AUV (2011) (1)
Residual Networks as Nonlinear Systems: Stability Analysis using Linearization (2019) (1)
Geometric rates of convergence for kernel-based sampling algorithms (2019) (1)
Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications By Gene (2006) (1)
rCUR: an R package for CUR matrix decomposition (2012) (1)
Sampling subproblems of heterogeneous Max‐Cut problems and approximation algorithms (2008) (1)
AutoIP: A United Framework to Integrate Physics into Gaussian Processes (2022) (1)
A new spin on an old algorithm: technical perspective (2014) (1)
Best IVR Number Providers // 2019's Top IVR Number Software Solutions (2019) (0)
Variational perspective on local graph clustering (2017) (0)
The computational statistical mechanics of simple models of liquid water (2000) (0)
ropensci/terrainr: terrainr v 0.5.0 (2021) (0)
Check Toll Free and Local Number Portability in Asia (2019 Guide) (2018) (0)
07071 Report on Dagstuhl Seminar -- Web Information Retrieval and Linear Algebra Algorithms (2007) (0)
Recent Advances in Randomized Numerical Linear Algebra (NII Shonan Meeting 2016-10) (2016) (0)
ICML 2010 Tutorial: Geometric Tools for Identifying Structure in Large Social and Information Networks (2010) (0)
An Empirical Exploration of Gradient Correlations in Deep Learning (2018) (0)
Summary: Empirical Comparison of Algorithms for Network Community Detection. (2010) (2013) (0)
Easily Obtain Spatial Data and Make Better Maps [R package spacey version 0.1.1] (2020) (0)
Lower Extremity EMG during Stair Ascent Following TKA with Four Different Surgical Approaches (2010) (0)
The Sky Above The Clouds (2022) (0)
Unsupervised Learning Through Randomized Algorithms for High-Volume High-Velocity Data (ULTRA-HV). (2018) (0)
Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software (2023) (0)
Implant rachidien avec élément d'extension flexible (2010) (0)
Forecasting Sequential Data Using Consistent Koopman Autoencoders — Supplementary Materials — (2020) (0)
rCUR: an R package for CUR matrix (2012) (0)
Second Order Machine Learning (2017) (0)
Fast Feature Selection with Fairness Constraints (2022) (0)
Running Alchemist on Cray XC and CS Series Supercomputers: Dask and PySpark Interfaces, Deployment Options, and Data Transfer Times (2019) (0)
Scalable Matrix Algorithms for Interactive Analytics of Very Large Informatics Graphs (2017) (0)
Web Information Retrieval and Linear Algebra Algorithms, 11.02. - 16.02.2007 (2007) (0)
AVOXI Launches AVOXI Genius - A Cloud Contact Center Platform (2019) (0)
AVOXI Continues Global Expansion With New Virtual Number Coverage in Asia-Pacific (2020) (0)
Sampling Algorithms and Coresets for ℓp Regression (2008) (0)
Meeting: Algorithms for Modern Massive Data Sets (2008) (0)
Low-Rank and Temporal Smoothness Regularization on Value-Based Deep Reinforcement Learning (2022) (0)
Intra- and interpopulation genotype reconstruction from tagging SNPs (2007) (0)
A Spectral Algorithm with Applications to Exploring Data Graphs Locally (2010) (0)
Pre-surgical evaluation of mandibular third molars using computed tomography imaging and cone beam volumetric tomography imaging (2007) (0)
Dynamic R Markdown Document Generation [R package heddlr version 0.6.0] (2020) (0)
TIP5P and the reproduction on the density anomaly by simple potential functions (2000) (0)
Principles and Applications of Science of Information (2017) (0)
Clarke, Marieke, Mambo Hills: Historical and Religious Significance , with an introduction by Pathisa Nyathi, Bulawayo, ‘amaBooks, 2008, viii + 28 pp., map, bibliography, 978-0-7974-3589-6. (2010) (0)
FLAG: Fast Linearly-Coupled Adaptive Gradient Method (2016) (0)
Lecture 25: Element-wise Sampling of Graphs and Linear Equation Solving, Cont (2015) (0)
Implant pour arthrodèse (2007) (0)
Bozzoli Belinda. Theatres of Struggle and the End of Apartheid . Athens: Ohio University Press/Oxford: James Currey Publishers, 2004. xvi + 326 pp. Photographs. Bibliography. Index. $28.95. Paper. (2005) (0)
The Top 3 Toll Free Forwarding Providers // Best Toll Free Forwarding (2019) (0)
FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods (2016) (0)
Fixation stable en torsion (2007) (0)
07071 Abstracts Collection -- Web Information Retrieval and Linear Algebra Algorithms (2007) (0)
Sparse Random Structures: Analysis and Computation, January 24–29, 2010 (2010) (0)

What Schools Are Affiliated With Michael W. M. Mahoney?

Michael W. M. Mahoney is affiliated with the following schools:
