Ali Ghodsi | Academic Influence

Ali Ghodsi's AcademicInfluence.com Rankings

Ali Ghodsi

Computer Science: World Rank #935, Historical Rank #970
Big Data: World Rank #5, Historical Rank #5
Algorithms: World Rank #424, Historical Rank #429
Machine Learning: World Rank #620, Historical Rank #627

Ali Ghodsi's Degrees

Bachelors, Computer Science, Sharif University of Technology

Why Is Ali Ghodsi Influential?

According to Wikipedia, Ali Ghodsi is an Iranian-Swedish AI leader, computer scientist and entrepreneur specializing in distributed systems and big data. He is a co-founder and CEO of Databricks and an adjunct professor at UC Berkeley. Ideas from his academic research on resource management, scheduling, and data caching have been applied in popular open source projects such as Apache Mesos, Apache Spark, and Apache Hadoop.

Ali Ghodsi's Published Works

[Chart: number of citations in a given year to any of this author's works]

[Chart: total number of citations to the works the author published in a given year, highlighting the author's most important publications]

Published Works

Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center (2011) (1828)
Spark SQL: Relational Data Processing in Spark (2015) (1249)
Apache Spark (2016) (1246)
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types (2011) (1185)
Apache Spark: a unified engine for big data processing (2016) (740)
Effective Straggler Mitigation: Attack of the Clones (2013) (519)
Information-centric networking: seeing the forest for the trees (2011) (387)
Less pain, most of the gain: incrementally deployable ICN (2013) (371)
Automatic dimensionality selection from the scree plot via the use of profile likelihood (2006) (358)
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks (2014) (354)
PACMan: Coordinated Memory Caching for Parallel Jobs (2012) (326)
Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds (2011) (283)
Bolt-on causal consistency (2013) (228)
Highly Available Transactions: Virtues and Limitations (2013) (217)
Naming in content-oriented architectures (2011) (217)
Multi-resource fair queueing for packet processing (2012) (210)
Accelerating the Machine Learning Lifecycle with MLflow (2018) (204)
HTTP as the narrow waist of the future internet (2010) (202)
Coordination Avoidance in Database Systems (2014) (194)
Choosy: max-min fair sharing for datacenter jobs with constraints (2013) (193)
Sentiment analysis based on improved pre-trained word embeddings (2019) (188)
Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry (2018) (181)
Software-defined internet architecture: decoupling architecture from infrastructure (2012) (180)
Dimensionality Reduction: A Short Tutorial (2006) (178)
A Berkeley View of Systems Challenges for AI (2017) (171)
Drizzle: Fast and Adaptable Stream Processing at Scale (2017) (171)
Eventual consistency today: limitations, extensions, and beyond (2013) (169)
Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark (2018) (145)
Disk-Locality in Datacenter Computing Considered Irrelevant (2011) (143)
The essence of P2P: a reference architecture for overlay networks (2005) (142)
Scalable Atomic Visibility with RAMP Transactions (2014) (136)
Symmetric Replication for Structured Peer-to-Peer Systems (2005) (130)
HUG: Multi-Resource Fairness for Correlated and Elastic Demands (2016) (122)
Distributed k-ary System: Algorithms for Distributed Hash Tables (2006) (122)
The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox (2014) (112)
Scaling Spark in the Real World: Performance and Usability (2015) (110)
Hierarchical scheduling for diverse datacenter workloads (2013) (110)
Architecting for innovation (2011) (95)
Efficient greedy feature selection for unsupervised learning (2013) (89)
The potential dangers of causal consistency and an explicit solution (2012) (86)
An Efficient Greedy Method for Unsupervised Feature Selection (2011) (84)
CAP for networks (2013) (83)
Eventual Consistency Today: Limitations, Extensions, and Beyond (2013) (76)
Intelligent design enables architectural evolution (2011) (72)
Kernelized Supervised Dictionary Learning (2012) (72)
Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity (2015) (71)
Robust Locally-Linear Controllable Embedding (2017) (70)
SparkR: Scaling R Programs with Spark (2016) (67)
Action respecting embedding (2005) (65)
FairRide: Near-Optimal, Fair Cache Sharing (2016) (61)
Distance metric learning vs. Fisher discriminant analysis (2008) (61)
Greedy column subset selection for large-scale data sets (2013) (59)
Learning the Structure of Sum-Product Networks via an SVD-based Algorithm (2015) (58)
Automatic basis selection techniques for RBF networks (2003) (57)
Why Let Resources Idle? Aggressive Cloning of Jobs with Dolly (2012) (56)
Common Object Request Broker Architecture (2009) (55)
Multiview Supervised Dictionary Learning in Speech Emotion Recognition (2014) (54)
Strategyproof allocation of discrete jobs on multiple machines (2014) (54)
Fine-Tuning and training of densenet for histopathology image representation using TCGA diagnostic slides (2021) (53)
Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis (2017) (51)
Supervised Dictionary Learning and Sparse Representation-A Review (2015) (49)
A Framework for Structured Peer-to-Peer Overlay Networks (2004) (48)
Determining Protein Structures from NOESY Distance Constraints by Semidefinite Programming (2013) (46)
Distributed Column Subset Selection on MapReduce (2013) (44)
Self-Correcting Broadcast in Distributed Hash Tables (2003) (43)
Sparse supervised principal component analysis (SSPCA) for dimension reduction and variable selection (2017) (40)
The Datacenter Needs an Operating System (2011) (39)
Developments in MLflow: A System to Accelerate the Machine Learning Lifecycle (2020) (38)
Subjective Localization with Action Respecting Embedding (2005) (38)
Rare Class Classification by Support Vector Machine (2010) (36)
A Practical Approach to Network Size Estimation for Structured Overlays (2008) (36)
Nonnegative matrix factorization via rank-one downdate (2008) (36)
Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics (2021) (35)
A novel greedy algorithm for Nyström approximation (2011) (35)
Annealing Knowledge Distillation (2021) (35)
HAT, Not CAP: Towards Highly Available Transactions (2013) (32)
Discriminative functional analysis of human movements (2013) (27)
Fast and Scalable Feature Selection for Gene Expression Data Using Hilbert-Schmidt Independence Criterion (2017) (27)
Multicast in DKS(N, k, f) overlay networks (2003) (27)
Tachyon: Memory Throughput I/O for Cluster Computing Frameworks (2013) (27)
Low-Bandwidth Topology Maintenance for Robustness in Structured Overlay Networks (2005) (26)
Handling Network Partitions and Mergers in Structured Overlay Networks (2007) (26)
Key-based consistency and availability in structured overlay networks (2008) (26)
Dictionary Learning in Texture Classification (2011) (25)
SymbolicGPT: A Generative Transformer Model for Symbolic Regression (2021) (25)
Dealing with network partitions in structured overlay networks (2009) (23)
Reliable, Memory Speed Storage for Cluster Computing Frameworks (2014) (23)
Improving Embeddings by Flexible Exploitation of Side Information (2007) (22)
Multidimensional Scaling, Sammon Mapping, and Isomap: Tutorial and Survey (2020) (19)
Guided Locally Linear Embedding (2011) (19)
Dominant Resource Fairness: Fair Allocation of Heterogeneous Resources in Datacenters (2010) (19)
Asynchronous Complex Analytics in a Distributed Dataflow Architecture (2015) (18)
Detecting Change-Points in Time Series by Maximum Mean Discrepancy of Ordinal Pattern Distributions (2012) (18)
Nexus: A Common Substrate for Cluster Computing (2009) (18)
Advances in projection of climate change impacts using supervised nonlinear dimensionality reduction techniques (2017) (18)
DOH: A Content Delivery Peer-to-Peer Network (2006) (18)
Computationally instrument-resolution-independent de novo peptide sequencing for high-resolution devices (2021) (17)
Locally Linear Embedding and its Variants: Tutorial and Survey (2020) (17)
JADE: Joint Autoencoders for Dis-Entanglement (2017) (15)
Coordination-Avoiding Database Systems (2014) (15)
Adapting Component Analysis (2012) (15)
Reproducing Kernel Hilbert Space, Mercer's Theorem, Eigenfunctions, Nyström Method, and Use of Kernels in Machine Learning: Tutorial and Survey (2021) (14)
Fast Spectral Clustering Using Autoencoders and Landmarks (2017) (14)
Simple weakly supervised deep learning pipeline for detecting individual red-attacked trees in VHR remote sensing images (2020) (14)
On Consistency Of Data In Structured Overlay Networks (2008) (14)
Towards robust peer counting (2009) (14)
Personalized workflow to identify optimal T-cell epitopes for peptide-based vaccines against COVID-19 (2020) (13)
A new approach to the numerical solution of Fredholm integral equations using least squares-support vector regression (2021) (13)
DeepNovoV2: Better de novo peptide sequencing with deep learning (2019) (13)
A Fast Greedy Algorithm for Generalized Column Subset Selection (2013) (12)
KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation (2021) (12)
Gossiping over storage systems is practical (2007) (12)
Highly Available Transactions: Virtues and Limitations (Extended Version) (2013) (12)
Generative mixture of networks (2017) (11)
Tangent-corrected embedding (2005) (11)
Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher (2021) (11)
Estimation of the Memory Parameters of the Fractionally Integrated Separable Spatial Autoregressive (FISSAR(1, 1)) Model: A Simulation Study (2009) (11)
Localization and classification of cell nuclei in post-neoadjuvant breast cancer surgical specimen using fully convolutional networks (2018) (11)
Implementing Dynamic Querying Search in k-ary DHT-based Overlays (2008) (10)
Universal-KD: Attention-based Output-Grounded Intermediate Layer Knowledge Distillation (2021) (10)
Factor Analysis, Probabilistic Principal Component Analysis, Variational Inference, and Variational Autoencoder: Tutorial and Survey (2021) (10)
KKT Conditions, First-Order and Second-Order Optimization, and Distributed Optimization: Tutorial and Survey (2021) (10)
Scalable atomic visibility with RAMP transactions (2014) (10)
Distance Metric Learning Versus Fisher Discriminant Analysis (2008) (10)
Strategyproofness, Leontief Economies and the Kalai-Smorodinsky Solution (2011) (10)
A First-Order Spatial Integer-Valued Autoregressive SINAR(1, 1) Model (2012) (9)
Self Management of Large-Scale Distributed Systems by Combining Peer-to-Peer Networks and Components (2005) (9)
Subjective Mapping (2006) (9)
Attention Mechanism, Transformers, BERT, and GPT: Tutorial and Survey (2020) (9)
Discriminant component analysis via distance correlation maximization (2020) (8)
Fast Freenet: improving Freenet performance by preferential partition routing and file mesh propagation (2006) (8)
Parallel LS-SVM for the numerical simulation of fractional Volterra’s population model (2021) (7)
Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey (2021) (7)
Regularized Greedy Importance Sampling (2002) (7)
Information-centric networking - Ready for the real world? (Dagstuhl Seminar 12361) (2012) (7)
Exploiting the synergy between gossiping and structured overlays (2007) (7)
Laplacian-Based Dimensionality Reduction Including Spectral Clustering, Laplacian Eigenmap, Locality Preserving Projection, Graph Embedding, and Diffusion Map: Tutorial and Survey (2021) (7)
Stochastic Neighbor Embedding with Gaussian and Student-t Distributions: Tutorial and Survey (2020) (7)
The Survival of the Unfittest (2006) (7)
Discovery Radiomics via a Mixture of Deep ConvNet Sequencers for Multi-parametric MRI Prostate Cancer Classification (2017) (7)
Position Paper: “Self-” properties in Distributed K-ary Structured Overlay Networks (2004) (6)
Robust locally linear embedding using penalty functions (2011) (6)
A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis (2015) (6)
Supervised discriminative dimensionality reduction by learning multiple transformation operators (2021) (6)
Restricted Boltzmann Machine and Deep Belief Network: Tutorial and Survey (2021) (5)
Protein Structure by Semidefinite Facial Reduction (2012) (5)
Semi-supervised Dictionary Learning Based on Hilbert-Schmidt Independence Criterion (2016) (5)
Making the Internet More Evolvable (2012) (5)
Coordination Avoidance in Database Systems (Extended Version) (2014) (5)
Deep Variational Sufficient Dimensionality Reduction (2018) (5)
Distance Correlation Autoencoder (2018) (5)
Parameter selection for smoothing splines using Stein's Unbiased Risk Estimator (2011) (5)
How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding (2021) (5)
Text Classification based on Multiple Block Convolutional Highways (2018) (5)
HTTP: An Evolvable Narrow Waist for the Future Internet (2012) (4)
Asymptotic properties of GPH estimators of the memory parameters of the fractionally integrated separable spatial ARMA (FISSARMA) models (2016) (4)
Spectral, Probabilistic, and Deep Metric Learning: Tutorial and Survey (2022) (4)
Conditional Maximum Likelihood Estimation of the First-Order Spatial Integer-Valued Autoregressive (SINAR(1,1)) Model (2015) (4)
Mesos: Flexible Resource Sharing for the Cloud (2011) (4)
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation (2022) (4)
Semi-Supervised Representation Learning based on Probabilistic Labeling (2016) (4)
Johnson-Lindenstrauss Lemma, Linear and Nonlinear Random Projections, Random Fourier Features, and Random Kitchen Sinks: Tutorial and Survey (2021) (4)
Learning Subjective Representations for Planning (2005) (4)
A Self-stabilizing Network Size Estimation Gossip Algorithm for Peer-to-Peer Systems (2005) (4)
Dealing with Bootstrapping, Maintenance, and Network Partitions and Mergers in Structured Overlay Networks (2012) (4)
How to Select One Among All? An Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding (2021) (4)
SPROS: An SDP-based protein structure determination from NMR data (2011) (3)
SRP: Efficient class-aware embedding learning for large-scale data via supervised random projections (2018) (3)
Managing Network Partitions in Structured P2P Networks (2010) (3)
DyLoRA: Parameter-Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation (2022) (3)
Multicast and Bulk Lookup in Structured Overlay Networks (2010) (3)
Some Properties of the Normalized Periodogram of a Fractionally Integrated Separable Spatial ARMA (FISSARMA) Model (2013) (3)
RW-KD: Sample-wise Loss Terms Re-Weighting for Knowledge Distillation (2021) (3)
DKS: Distributed K-Ary System. A Middleware for Building Large Scale Dynamic Distributed Applications (2008) (3)
Minimizing the Discrepancy Between Source and Target Domains by Learning Adapting Components (2014) (3)
Generative Adversarial Networks and Adversarial Autoencoders: Tutorial and Survey (2021) (3)
Unified Framework for Spectral Dimensionality Reduction, Maximum Variance Unfolding, and Kernel Learning By Semidefinite Programming: Tutorial and Survey (2021) (3)
Transformation-Invariant Embedding for Image Analysis (2004) (3)
Self Management of Large-Scale Distributed Systems by Combining Structured Overlay Networks and Components (2005) (3)
Symbolically Solving Partial Differential Equations using Deep Learning (2020) (3)
On the Invariance of Dictionary Learning and Sparse Representation to Projecting Data to a Discriminative Space (2015) (2)
Legendre Deep Neural Network (LDNN) and its application for approximation of nonlinear Volterra Fredholm Hammerstein integral equations (2021) (2)
Learning an Affine Transformation for Non-linear Dimensionality Reduction (2010) (2)
Deep Structure for end-to-end inverse rendering (2017) (2)
Nonnegative Matrix Factorization Using Autoencoders And Exponentiated Gradient Descent (2018) (2)
Low Dimensional Localized Clustering (LDLC) (2012) (2)
Scalable Action Respecting Embedding (2008) (2)
Elements of Dimensionality Reduction and Manifold Learning (2023) (2)
Disentangling Dynamics and Content for Control and Planning (2017) (2)
Synthesizing Deep Neural Network Architectures using Biological Synaptic Strength Distributions (2017) (2)
Generative Locally Linear Embedding (2021) (2)
GODS: Global Observatory for Distributed Systems (2007) (2)
Ensembles of Random Projections for Nonlinear Dimensionality Reduction (2017) (2)
Conformal Mapping by Computationally Efficient Methods (2010) (2)
Sufficient Dimension Reduction for High-Dimensional Regression and Low-Dimensional Embedding: Tutorial and Survey (2021) (1)
Efficient greedy feature selection for unsupervised learning (2012) (1)
Towards Understanding Label Regularization for Fine-tuning Pre-trained Language Models (2022) (1)
Supervised Texture Classification Using a Novel Compression-Based Similarity Measure (2012) (1)
Efficient parameter selection for system identification (2004) (1)
Theoretical Connection between Locally Linear Embedding, Factor Analysis, and Probabilistic PCA (2022) (1)
Autocovariance Function of the Fractionally Integrated Separable Spatial ARMA (FISSARMA) Models (2015) (1)
Tachyon (2019) (1)
A Symmetric Replication Scheme for Increased Security and Performance in Structured Overlay Networks (2004) (1)
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging (2022) (1)
KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation (2022) (1)
Atomic Ring Maintenance for Distributed Hash Tables (2007) (1)
Greedy Nyström Approximation (2010) (1)
A Neuro-Symbolic Method for Solving Differential and Functional Equations (2020) (1)
GOSSIP: Gossip Over Storage Systems Is Practical (2007) (1)
First-Order Fractionally Integrated Non-Separable Spatial Autoregressive (FINSSAR(1,1)) Model and Some of its Properties (2013) (1)
Improved knowledge distillation by utilizing backward pass knowledge in neural networks (2023) (1)
Minimizing the Discrepancy Between Source and Target Domains by Learning Adapting Components (2014) (0)
Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry (2018) (0)
Probability Models for the Distribution of Copepods in Different Coastal Ecosystems Along the Straits of Malacca (2012) (0)
Mesos Resource Sharing for the Cloud (2011) (0)
Segmentation Approach for Coreference Resolution Task (2020) (0)
Knowledge Distillation with Noisy Labels for Natural Language Understanding (2021) (0)
Memento: Coordinated Caching for Data-Intensive Clusters (2011) (0)
Algorithms, Reliability (2007) (0)
Nash Bargaining without Scale Invariance (2011) (0)
Kolmogorov complexity vector: A novel data representation (2015) (0)
ForestCast: a central solution to heuristically constructing trees (2007) (0)
ISTC-CC Update ISTC-CC Research Overview Pillar 1: Specialization (0)
source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications (2016) (0)
Do we need Label Regularization to Fine-tune Pre-trained Language Models? (2022) (0)
Spark SQL Resilient Distributed Datasets Spark JDBC Console User Programs (Java, Scala, Python) Catalyst Optimizer DataFrame API (2015) (0)
Demonstration of MLflow: A System to Accelerate the Machine Learning Lifecycle (2019) (0)
Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey (2023) (0)
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization (2022) (0)
Properties and Estimation Fractionally Integrated Spatial Models and Non-Negative Integer-Valued Autoregressive Spatial Models (2011) (0)
Visualization in Low Dimensional Space by Tessellation of Linear Subspaces (2010) (0)
The Sky Above The Clouds (2022) (0)
Generative locally linear embedding: A module for manifold unfolding and visualization (2021) (0)
Automatic basis selection for RBF networks using Stein's unbiased risk estimator (2003) (0)
Nonlinear dimensionality reduction with side information (2006) (0)
MyriadStore: Technical Report (2006) (0)
Causal Consistency (2019) (0)
Greedy column subset selection for large-scale data sets (2014) (0)
