Ohad Shamir
#161,722 Most Influential Person Now
Ohad Shamir's AcademicInfluence.com Rankings
Ohad Shamir's Computer Science Degrees
Computer Science: World Rank #9362, Historical Rank #9835
Theoretical Computer Science: World Rank #109, Historical Rank #109
Machine Learning: World Rank #4075, Historical Rank #4124
Database: World Rank #6329, Historical Rank #6561

Why Is Ohad Shamir Influential?
Ohad Shamir's Published Works
[Citation charts, not shown: (1) number of citations in a given year to any of this author's works; (2) total citations to the works the author published in a given year, highlighting the author's most important publications.]
Published Works
- Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization (2011) (647)
- The Power of Depth for Feedforward Neural Networks (2015) (627)
- Optimal Distributed Online Prediction Using Mini-Batches (2010) (597)
- Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes (2012) (482)
- Communication-Efficient Distributed Optimization using an Approximate Newton-type Method (2013) (441)
- On the Computational Efficiency of Training Neural Networks (2014) (433)
- Learnability, Stability and Uniform Convergence (2010) (394)
- Size-Independent Sample Complexity of Neural Networks (2017) (387)
- Better Mini-Batch Algorithms via Accelerated Gradient Methods (2011) (285)
- Stochastic Convex Optimization (2009) (279)
- Adaptively Learning the Crowd Kernel (2011) (251)
- Spurious Local Minima are Common in Two-Layer ReLU Neural Networks (2017) (230)
- From Bandits to Experts: On the Value of Side-Observations (2011) (186)
- Learning and generalization with the information bottleneck (2008) (181)
- Communication Complexity of Distributed Convex Learning and Optimization (2015) (177)
- Proving the Lottery Ticket Hypothesis: Pruning is All You Need (2020) (171)
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback (2015) (168)
- Vox Populi: Collecting High-Quality Labels from a Crowd (2009) (164)
- On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization (2012) (162)
- Learning to classify with missing and corrupted features (2008) (161)
- Is Local SGD Better than Minibatch SGD? (2020) (155)
- Large-Scale Convex Minimization with a Low-Rank Constraint (2011) (154)
- Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks (2016) (153)
- Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression (2011) (152)
- A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate (2014) (151)
- Failures of Gradient-Based Deep Learning (2017) (150)
- On the Power and Limitations of Random Features for Understanding Neural Networks (2019) (149)
- Distributed stochastic optimization and learning (2014) (134)
- Multi-Player Bandits - a Musical Chairs Approach (2015) (130)
- Online Learning for Time Series Prediction (2013) (130)
- On the Quality of the Initial Basin in Overspecified Neural Networks (2015) (120)
- Distribution-Specific Hardness of Learning Neural Networks (2016) (105)
- Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation (2013) (103)
- Good learners for evil teachers (2009) (103)
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback (2014) (100)
- Online Learning with Switching Costs and Other Adaptive Adversaries (2013) (99)
- Without-Replacement Sampling for Stochastic Gradient Methods (2016) (98)
- Global Non-convex Optimization with Discretized Diffusions (2018) (91)
- On-demand, Spot, or Both: Dynamic Resource Allocation for Executing Batch Jobs in the Cloud (2014) (90)
- Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity (2015) (88)
- Trading Value and Information in MDPs (2012) (86)
- Efficient Learning with Partially Observed Attributes (2010) (86)
- Oracle complexity of second-order methods for smooth convex optimization (2017) (79)
- Probabilistic Label Trees for Efficient Large Scale Image Classification (2013) (78)
- Convergence of Stochastic Gradient Descent for PCA (2015) (77)
- Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity (2009) (72)
- Relax and Randomize: From Value to Algorithms (2012) (72)
- Multiclass-Multilabel Classification with More Classes than Examples (2010) (67)
- Optimal Distributed Online Prediction (2011) (67)
- Cluster Stability for Finite Samples (2007) (65)
- Learning with the weighted trace-norm under arbitrary sampling distributions (2011) (59)
- Spectral Clustering on a Budget (2011) (55)
- Unified Algorithms for Online Learning and Competitive Analysis (2012) (52)
- Learning Kernel-Based Halfspaces with the 0-1 Loss (2011) (51)
- High-resolution microbial community reconstruction by integrating short reads from multiple 16S rRNA regions (2013) (49)
- Are ResNets Provably Better than Linear Predictors? (2018) (49)
- How Good is SGD with Random Shuffling? (2019) (49)
- Online Learning of Noisy Data (2011) (44)
- The Complexity of Finding Stationary Points with Stochastic Gradient Descent (2019) (42)
- Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks (2018) (42)
- Stability and model selection in k-means clustering (2010) (41)
- Learning a Single Neuron with Gradient Methods (2020) (41)
- Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing (2011) (39)
- The Complexity of Making the Gradient Small in Stochastic Convex Optimization (2019) (38)
- Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis (2017) (37)
- On the Iteration Complexity of Oblivious First-Order Optimization Algorithms (2016) (36)
- Model Selection and Stability in k-means Clustering (2008) (36)
- Learning from Weak Teachers (2012) (35)
- Matrix completion with the trace norm: learning, bounding, and transducing (2014) (35)
- A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates (2018) (34)
- On Lower and Upper Bounds in Smooth and Strongly Convex Optimization (2016) (34)
- An Algorithm for Training Polynomial Networks (2013) (34)
- The sample complexity of learning linear predictors with the squared loss (2014) (33)
- Depth Separation in ReLU Networks for Approximating Smooth Non-Linear Functions (2016) (32)
- Using More Data to Speed-up Training Time (2011) (32)
- On the Complexity of Learning with Kernels (2014) (31)
- Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization (2016) (30)
- Detecting Correlations with Little Memory and Communication (2018) (28)
- A Variant of Azuma's Inequality for Martingales with Subgaussian Tails (2011) (28)
- Dimension-Free Iteration Complexity of Finite Sum Optimization Problems (2016) (27)
- On the Reliability of Clustering Stability in the Large Sample Regime (2008) (26)
- Implicit Regularization in ReLU Networks with the Square Loss (2020) (26)
- Depth Separations in Neural Networks: What is Actually Being Separated? (2019) (24)
- Learnability and Stability in the General Learning Setting (2009) (23)
- On Lower and Upper Bounds for Smooth and Strongly Convex Optimization Problems (2015) (23)
- Decoupling Exploration and Exploitation in Multi-Armed Bandits (2012) (23)
- Oracle Complexity of Second-Order Methods for Finite-Sum Problems (2016) (23)
- A Provably Efficient Algorithm for Training Deep Networks (2013) (22)
- The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks (2020) (22)
- Learning Kernel-Based Halfspaces with the Zero-One Loss (2010) (22)
- Efficient Online Learning via Randomized Rounding (2011) (21)
- Gradient Methods Never Overfit On Separable Data (2020) (21)
- The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication (2021) (21)
- Oracle Complexity in Nonsmooth Nonconvex Optimization (2021) (20)
- Failures of Deep Learning (2017) (20)
- Open Problem: Is Averaging Needed for Strongly Convex Stochastic Gradient Descent? (2012) (19)
- Implicit Regularization Towards Rank Minimization in ReLU Networks (2022) (19)
- Online Learning of Noisy Data with Kernels (2010) (18)
- Reconstructing Training Data from Trained Neural Networks (2022) (16)
- The Implicit Bias of Benign Overfitting (2022) (15)
- Space lower bounds for linear prediction in the streaming model (2019) (15)
- Neural Networks with Small Weights and Depth-Separation Barriers (2020) (14)
- Attribute Efficient Linear Regression with Distribution-Dependent Sampling (2015) (14)
- On the Complexity of Bandit Linear Optimization (2014) (13)
- Learning Linear and Kernel Predictors with the 0-1 Loss Function (2011) (12)
- On Margin Maximization in Linear and ReLU Networks (2021) (12)
- Weight Sharing is Crucial to Succesful Optimization (2017) (12)
- Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex Functions? (2020) (12)
- There's a Hole in My Data Space: Piecewise Predictors for Heterogeneous Learning Problems (2012) (11)
- On the Optimal Memorization Power of ReLU Neural Networks (2021) (11)
- Gradient Methods Provably Converge to Non-Robust Networks (2022) (11)
- Size and Depth Separation in Approximating Benign Functions with Neural Networks (2021) (10)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks (2021) (10)
- Online Learning with Local Permutations and Delayed Feedback (2017) (10)
- Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems (2021) (9)
- Relax and Localize: From Value to Algorithms (2012) (9)
- Bandit Regret Scaling with the Effective Loss Range (2017) (8)
- High-Order Oracle Complexity of Smooth and Strongly Convex Optimization (2020) (7)
- A Stochastic PCA Algorithm with an Exponential Convergence Rate. (2014) (7)
- Learning a Single Neuron with Bias Using Gradient Descent (2021) (7)
- On the Complexity of Finding Small Subgradients in Nonsmooth Optimization (2022) (6)
- Accurate Profiling of Microbial Communities from Massively Parallel Sequencing Using Convex Optimization (2013) (6)
- Localization and Adaptation in Online Learning (2013) (5)
- Robust Distributed Online Prediction (2010) (5)
- Efficient Transductive Online Learning via Randomized Rounding (2011) (5)
- Size and Depth Separation in Approximating Natural Functions with Neural Networks (2021) (5)
- Multiclass-Multilabel Learning when the Label Set Grows with the Number of Examples (2009) (4)
- Space lower bounds for linear prediction (2019) (4)
- The Sample Complexity of One-Hidden-Layer Neural Networks (2022) (4)
- A Stochastic Newton Algorithm for Distributed Convex Optimization (2021) (4)
- Attribute Efficient Linear Regression with Data-Dependent Sampling (2014) (3)
- Discussion of: “Nonparametric regression using deep neural networks with ReLU activation function” (2020) (3)
- Graph Approximation and Clustering on a Budget (2014) (2)
- Convergence Results For Q-Learning With Experience Replay (2021) (2)
- Width is Less Important than Depth in ReLU Neural Networks (2022) (1)
- Stochastic Optimization and Learning (2014) (1)
- The Complexity of Improperly Learning Large Margin Halfspaces (2009) (1)
- Oracle complexity of second-order methods for smooth convex optimization (2018) (0)
- Depth Separations in Neural Networks: What is Actually Being Separated? (2021) (0)
- Preface: Conference on Learning Theory (COLT), 2017 (2017) (0)
- Replay For Safety (2021) (0)
- Performance estimation of stochastic first-order methods (2019) (0)
- On stability in statistical machine learning (with a Hebrew abstract and an additional Hebrew title page: "On stability in statistical machine learning") (2010) (0)
- Quantity Makes Quality: Learning with Partial Views (2011) (0)
- Elephant in the Room: Non-Smooth Non-Convex Optimization (2022) (0)
- Learning with Local Permutations and Delayed Feedback (2017) (0)
- Cluster Stability for Finite Samples Supplementary Material (2007) (0)
- Online learning and Blackwell approachability in quitting games (2016) (0)
- Deterministic Nonsmooth Nonconvex Optimization (2023) (0)
- Information constraints in distributed and online learning (2014) (0)