Johannes Gehrke
#10,970 Most Influential Person Now
German computer scientist
Johannes Gehrke's AcademicInfluence.com Rankings
Johannes Gehrke's Computer Science Degrees
- Computer Science: World Rank #573, Historical Rank #593
- Database: World Rank #48, Historical Rank #50
Johannes Gehrke's Degrees
- Bachelor's, Computer Science, Karlsruhe Institute of Technology
Why Is Johannes Gehrke Influential?
According to Wikipedia, Johannes Gehrke is a German computer scientist, the director of Microsoft Research in Redmond, and CTO and Head of Machine Learning for the Microsoft Teams Backend. He is an ACM Fellow, an IEEE Fellow, and the recipient of the 2011 IEEE Computer Society Technical Achievement Award. From 1999 to 2015, he was a faculty member in the Department of Computer Science at Cornell University, where, at the time of his departure, he was the Tisch University Professor of Computer Science.
Johannes Gehrke's Published Works
- L-diversity: privacy beyond k-anonymity (2006) (5054)
- Automatic subspace clustering of high dimensional data for data mining applications (1998) (2787)
- The cougar approach to in-network query processing in sensor networks (2002) (1583)
- Gossip-based computation of aggregate information (2003) (1553)
- Sequential PAttern mining using a bitmap representation (2002) (1223)
- Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission (2015) (1175)
- Privacy preserving mining of association rules (2002) (1080)
- Query processing in sensor networks (2004) (1054)
- Big data and its technical challenges (2014) (946)
- Towards Sensor Database Systems (2001) (914)
- Detecting Change in Data Streams (2004) (874)
- Limiting privacy breaches in privacy preserving data mining (2003) (874)
- MAFIA: a maximal frequent itemset algorithm for transactional databases (2001) (844)
- CACTUS—clustering categorical data using summaries (1999) (583)
- Privacy: Theory meets Practice on the Map (2008) (560)
- Querying the physical world (2000) (558)
- Differential privacy via wavelet transforms (2009) (495)
- Cayuga: A General Purpose Event Monitoring System (2007) (418)
- RainForest—A Framework for Fast Decision Tree Construction of Large Datasets (1998) (406)
- A proportional share resource allocation algorithm for real-time, time-shared systems (1996) (403)
- Automatic Subspace Clustering of High Dimensional Data (2005) (387)
- Intelligible models for classification and regression (2012) (371)
- Processing complex aggregate queries over data streams (2002) (354)
- BOAT—optimistic decision tree construction (1999) (348)
- On computing correlated aggregates over continual data streams (2001) (342)
- Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary (2012) (334)
- Accurate intelligible models with pairwise interactions (2013) (333)
- Injecting utility into anonymized datasets (2006) (332)
- Worst-Case Background Knowledge for Privacy-Preserving Data Publishing (2007) (315)
- Approximate join processing over data streams (2003) (309)
- Towards Expressive Publish/Subscribe Systems (2006) (305)
- MAFIA: a maximal frequent itemset algorithm (2005) (304)
- Querying and mining data streams: you only get one look a tutorial (2002) (299)
- Overview of the 2003 KDD Cup (2003) (287)
- Fast scheduling of periodic tasks on multiple resources (1995) (246)
- The Claremont report on database research (2008) (238)
- Cayuga: a high-performance event processing engine (2007) (237)
- DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints (2002) (211)
- Towards a streaming SQL standard (2008) (192)
- Clustering large datasets in arbitrary metric spaces (1999) (187)
- DEMON: mining and monitoring evolving data (2000) (186)
- Data Mining with Decision Trees (2000) (182)
- Querying peer-to-peer networks using P-trees (2004) (178)
- Asynchronous Large-Scale Graph Processing Made Easy (2013) (176)
- Mining Very Large Databases (1999) (168)
- iReduct: differential privacy with reduced relative errors (2011) (166)
- Multi-query Optimization for Sensor Networks (2005) (166)
- Mining data streams under block evolution (2002) (157)
- ALEX: An Updatable Adaptive Learned Index (2019) (155)
- The Cougar Project: a work-in-progress report (2003) (150)
- A framework for measuring changes in data characteristics (1999) (149)
- Data Stream Management: Processing High-Speed Data Streams (Data-Centric Systems and Applications) (2019) (141)
- Query optimization in compressed database systems (2001) (141)
- Learning State Representations for Query Optimization with Deep Reinforcement Learning (2018) (136)
- The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results (2020) (130)
- The Beckman Report on Database Research (2014) (129)
- Online scheduling to minimize average stretch (1999) (127)
- Towards Statistical Queries over Distributed Private User Data (2012) (118)
- Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy (2011) (118)
- MaskIt: privately releasing user context streams for personalized mobile applications (2012) (117)
- Publishing Search Logs—A Comparative Study of Privacy Guarantees (2012) (104)
- The BUCKY object-relational benchmark (1997) (92)
- Crowd-Blending Privacy (2012) (91)
- GADT: a probability space ADT for representing and querying the physical world (2002) (90)
- SECRET: a scalable linear regression tree algorithm (2002) (89)
- Plagiarism Detection in arXiv (2006) (84)
- P-ring: an efficient and robust P2P range index structure (2007) (83)
- Database management systems (3. ed.) (2003) (80)
- Distributed event stream processing with non-deterministic finite automata (2009) (80)
- Non-intrusive Speech Quality Assessment Using Neural Networks (2019) (77)
- COUGAR: the network is the database (2002) (77)
- Hilda: A High-Level Language for Data-Driven Web Applications (2006) (77)
- A scalable noisy speech dataset and online subjective test framework (2019) (76)
- The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework (2020) (76)
- A theoretical framework for learning from a pool of disparate data sources (2002) (74)
- Behavioral simulations in MapReduce (2010) (72)
- Data Publishing against Realistic Adversaries (2009) (69)
- WaveScheduling: energy-efficient data dissemination for sensor networks (2004) (67)
- Fast Iterative Graph Computation with Block Updates (2013) (67)
- Scaling games to epic proportions (2007) (67)
- On the efficiency of checking perfect privacy (2006) (65)
- The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis (2014) (64)
- Fast checkpoint recovery algorithms for frequently consistent applications (2011) (64)
- Non-tracking web analytics (2012) (63)
- Scaling mining algorithms to large databases (2002) (58)
- A unified platform for data driven web applications with automatic client-server partitioning (2007) (57)
- Rule-based multi-query optimization (2009) (56)
- What is "next" in event processing? (2007) (55)
- Least expected cost query optimization: what can we expect? (2002) (55)
- Worst-Case Background Knowledge in Privacy (2006) (55)
- The Claremont report on database research (2009) (54)
- Massively multi-query join processing in publish/subscribe systems (2007) (53)
- Identifying Temporal Patterns and Key Players in Document Collections (1995) (52)
- Bias Correction in Classification Tree Construction (2001) (51)
- Hybrid Push-Pull Query Processing for Sensor Networks (2004) (50)
- Database research opportunities in computer games (2007) (49)
- Sketch-Based Multi-Query Processing over Data Streams (2004) (49)
- Sparse Partially Linear Additive Models (2014) (48)
- An Experimental Analysis of Iterated Spatial Joins in Main Memory (2013) (47)
- A Framework for Measuring Differences in Data Characteristics (2002) (47)
- Challenges and Opportunities with Big Data 2011-1 (2011) (47)
- Interactive anonymization of sensitive data (2009) (46)
- Guest Editors' Introduction: Sensor-Network Applications (2006) (46)
- Scalability for Virtual Worlds (2009) (46)
- P-tree: a p2p index for resource discovery applications (2004) (45)
- Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering (2018) (45)
- Index Structures for Matching XML Twigs Using Relational Query Processors (2005) (45)
- MatchMiner: Efficient Spanning Structure Mining in Large Image Collections (2012) (45)
- ATLAS: a probabilistic algorithm for high dimensional similarity search (2011) (43)
- Qd-tree: Learning Data Layouts for Big Data Analytics (2020) (43)
- Data Stream Management (2016) (42)
- Bias in OLAP Queries: Detection, Explanation, and Removal (2018) (42)
- The Beckman report on database research (2016) (41)
- An Empirical Analysis of Deep Learning for Cardinality Estimation (2019) (40)
- MAFIA: A Performance Study of Mining Maximal Frequent Itemsets (2003) (39)
- Data Modeling (2008) (39)
- Intrusive and Non-Intrusive Perceptual Speech Quality Assessment Using a Convolutional Neural Network (2019) (38)
- Edge-Weighted Personalized PageRank: Breaking A Decade-Old Performance Barrier (2015) (37)
- Veritas: Shared Verifiable Databases and Tables in the Cloud (2019) (36)
- Efficient Approximation of Correlated Sums on Data Streams (2003) (35)
- Explainable security for relational databases (2014) (35)
- ClouDiA: A Deployment Advisor for Public Clouds (2012) (34)
- Privacy in Search Logs (2009) (34)
- Guaranteeing correctness and availability in P2P range indices (2005) (33)
- Query Processing in a Device Database System (1999) (33)
- How to quickly find a witness (2003) (32)
- Guardat: enforcing data policies at the storage layer (2015) (31)
- Fair On-Line Scheduling of a Dynamic Set of Tasks on a Single Resource (1997) (30)
- Query Workloads for Data Series Indexes (2015) (30)
- Centiman: elastic, high performance optimistic concurrency control by watermarking (2015) (29)
- Approximation techniques for spatial data (2004) (29)
- Making time-stepped applications tick in the cloud (2011) (29)
- A General Algebra and Implementation for Monitoring Event Streams (2005) (28)
- Blotter: Low Latency Transactions for Geo-Replicated Storage (2017) (26)
- Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (2004) (25)
- Semantic approximation of data stream joins (2005) (25)
- Better Scripts, Better Games (2008) (24)
- ECRYPT Stream Cipher Project (2011) (24)
- Cuttlefish: A Lightweight Primitive for Adaptive Query Processing (2018) (23)
- An Evaluation of Checkpoint Recovery for Massively Multiplayer Online Games (2009) (23)
- Generating data series query workloads (2018) (22)
- Challenges and Opportunities with Big Data 2012-2 (2011) (22)
- Models and Methods for Privacy-Preserving Data Analysis and Publishing (2006) (22)
- Connectionist Model (2009) (22)
- Wave scheduling and routing in sensor networks (2007) (21)
- The BUCKY Object-Relational Benchmark (Experience Paper) (1997) (21)
- P-Ring: An Index Structure for Peer-to-Peer Systems (2004) (21)
- SEMMO: a scalable engine for massively multiplayer online games (2008) (19)
- Declarative processing for computer games (2008) (18)
- Search in social networks with access control (2010) (18)
- Database Management Systems, -3/E. (2014) (18)
- Entangled queries: Enabling declarative data-driven coordination (2012) (18)
- Correlation Clustering (2009) (17)
- Geo-Replication: Fast If Possible, Consistent If Necessary (2016) (17)
- Workload-aware indexing for keyword search in social networks (2011) (17)
- SLAOrchestrator: Reducing the Cost of Performance SLAs for Cloud Data Analytics (2018) (16)
- Load Balancing and Range Queries in P2P Systems Using P-Ring (2011) (16)
- Inverted indexes vs. bitmap indexes in decision support systems (2009) (16)
- Better scripts, better games (2008) (15)
- An indexing framework for peer-to-peer systems (2004) (15)
- Beyond isolation: research opportunities in declarative data-driven coordination (2010) (14)
- HypDB: A Demonstration of Detecting, Explaining and Resolving Bias in OLAP queries (2018) (14)
- SAFE extensibility of data-driven web applications (2012) (14)
- Conditional Routing (2009) (14)
- Letter from the Special Issue Editor (2003) (13)
- Time management for new faculty (2003) (13)
- Rapid Convergence of a Local Load Balancing Algorithm for Asynchronous Rings (1997) (13)
- Quo vadis, data privacy? (2012) (13)
- Fine-grained disclosure control for app ecosystems (2013) (13)
- Instance-Optimized Data Layouts for Cloud Analytics Workloads (2021) (12)
- Beyond myopic inference in big data pipelines (2013) (11)
- SGL: a scalable language for data-driven games (2008) (11)
- Toward Expressive and Scalable Sponsored Search Auctions (2008) (11)
- Reinforcement learning for bandwidth estimation and congestion control in real-time communications (2019) (10)
- Multi-query optimization for sketch-based estimation (2009) (10)
- Database research in computer games (2009) (10)
- Writes that Fall in the Forest and Make no Sound: Semantics-Based Adaptive Data Consistency (2014) (10)
- Network scheduling for data archiving applications in sensor networks (2006) (10)
- iBox: Internet in a Box (2020) (10)
- A Vision for PetaByte Data Management and Analysis Services for the Arecibo Telescope (2004) (9)
- Automatic client-server partitioning of data-driven web applications (2006) (9)
- From Declarative Languages to Declarative Processing in Computer Games (2009) (9)
- Privacy in data publishing (2010) (9)
- Classification and regression: money *can* grow on trees (1999) (8)
- Rich Media (2009) (8)
- Data Stream Management: A Brave New World (2016) (8)
- Dynamic Quality of Service Resource Management for Multimedia Applications on General Purpose Operating Systems (1997) (8)
- Meeting Effectiveness and Inclusiveness in Remote Collaboration (2021) (8)
- BOAT-Optimistic Decision Tree Construction (1999) (8)
- Programming with differential privacy (2010) (8)
- Cayuga: A High-Performance Event Processing Engine [Demonstration Paper] (2007) (7)
- Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB), University of Vienna, Austria, September 23-27, 2007 (2007) (7)
- Report on the SIGKDD 2001 conference panel "New Research Directions in KDD" (2002) (7)
- Indexing for function approximation (2006) (7)
- Models and methods for privacy-preserving data publishing and analysis: invited tutorial (2005) (7)
- Efficient, Consistent Distributed Computation with Predictive Treaties (2019) (7)
- Proceedings of the ACM Symposium on Cloud Computing (2014) (7)
- Directions in Multi-Query Optimization for Sensor Networks (2005) (7)
- Exploring the inherent technical challenges in realizing the potential of Big Data (2014) (6)
- Three Case Studies of Large-Scale Data Flows (2006) (6)
- Large-scale collaborative analysis and extraction of web data (2008) (6)
- High-Speed Function Approximation (2007) (5)
- Content-based Retrieval (2009) (5)
- Leveraging non-uniform resources for parallel query processing (2003) (5)
- Enabling Lightweight Transactions with Precision Time (2017) (5)
- BRRL: a recovery library for main-memory applications in the cloud (2011) (5)
- The Architecture of the Cornell Knowledge Broker (2004) (5)
- Conclusions and Looking Forward (2016) (5)
- Guest Editorial to the special issue on data stream processing (2004) (4)
- The Complexity of Social Coordination (2012) (4)
- Lumos: A Library for Diagnosing Metric Regressions in Web-Scale Applications (2020) (4)
- Coordination through querying in the youtopia system (2011) (4)
- Technical perspective: Data stream processing: when you only get one look (2009) (4)
- PARAMETER EXPLORATION FOR SYNTHETIC DATA WITH PRIVACY GUARANTEES FOR OnTheMap (2009) (4)
- DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems (2021) (4)
- Supervised Classifiers for Audio Impairments with Noisy Labels (2019) (4)
- Pricing Queries Approximately Optimally (2015) (4)
- User-centric personalized extensibility for data-driven web applications (2007) (4)
- Conceptual Data Model (2009) (4)
- Hashtag Recommendation for Enterprise Applications (2016) (3)
- The Claremont Report on Database (3)
- Scalable Decision Tree Construction (2009) (3)
- A storage and indexing framework for p2p systems (2004) (3)
- Energy-Efficient Data Management For Sensor Networks: A Work-In-Progress Report (2003) (3)
- Building Compressed Database Systems (2002) (3)
- Query Processing with Heterogeneous Resources (2000) (3)
- READY: Completeness is in the Eye of the Beholder (2017) (3)
- Entangled transactions (2011) (3)
- Entangled queries: enabling declarative data-driven coordination (2011) (3)
- Declarative, Domain-Specific Languages - Elegant Simplicity or a Hammer in Search of a Nail? (2008) (2)
- FastVer: Making Data Integrity a Commodity (2021) (2)
- VLDB Panel Summary (2021) (2)
- Computer Human Interaction (CHI) (2009) (2)
- Report on the workshop on research issues in data mining and knowledge discovery workshop (DMKD 2001) (2001) (2)
- Classification and Regression Trees (2009) (2)
- Trusted CVS (2006) (2)
- DSB (2021) (2)
- HypDB: Detect, Explain And Resolve Bias in OLAP (2018) (2)
- HypDB (2018) (2)
- Programming by Rewards (2020) (2)
- Raster Data Management (2009) (1)
- Technical Perspective: Naiad (2016) (1)
- Big Data Pipelines (2013) (1)
- Technical Perspective Programming with Differential Privacy (2010) (1)
- A Confluence of Column Stores and Search Engines: Opportunities and Challenges (2009) (1)
- DEMO: Secure and customizable web development in the safe activation framework (2013) (1)
- Secret-Key Encryption (2009) (1)
- Performance Evaluation on State of the Art Sequential Pattern Mining Algorithms (2015) (1)
- Advances in decision tree construction (2001) (1)
- Database Systems 2.0 (2019) (1)
- Resonance: Replacing Software Constants with Context-Aware Models in Real-time Communication (2020) (1)
- Reverse Nearest Neighbor Search (2009) (1)
- Multi-version Indexing in Flash-based Key-Value Stores (2019) (1)
- Cache Performance (2009) (1)
- Nerio: Leader Election and Edict Ordering (2011) (1)
- Event and Pattern Detection over Streams (2009) (1)
- Cooperative Content Distribution (2009) (0)
- Achieving Low Latency Transactions for Geo-replicated Storage with Blotter (2019) (0)
- Proceedings of the Sixth SIAM International Conference on Data Mining: Preface (2006) (0)
- Scalable Technology for a New Generation of Collaborative Applications (2007) (0)
- Edge-Weighted Personalized PageRank (2015) (0)
- Jaguar: Extending the Predator Database System with JAVA (2001) (0)
- Explainable Security for Relational Databases (Extended Experimental Evaluation) (2014) (0)
- Current Time (2009) (0)
- Lightweight Inter-transaction Caching with Precise Clocks and Dynamic Self-invalidation (2020) (0)
- Processing High-Speed Intelligence Feeds in Real-Time (2005) (0)
- Recovery Manager (2009) (0)
- Reliable, Rapid, Accurate Banking Transactions using e-Bank (2017) (0)
- Decentralized Data Integration System (2009) (0)
- Special issue: best papers of VLDB 2007 (2009) (0)
- Continuous Queries in Sensor Networks (2009) (0)
- Bias in OLAP Queries: Detection, Explanation, and Removal (Or Think Twice About Your AVG-Query) (2018) (0)
- Service Choreography (2009) (0)
- Event Connection (2009) (0)
- A method for evaluation of XML path queries based on index structures and relational query processors. (2005) (0)
- Message from the ICDE 2015 Program Committee and general chairs (2015) (0)
- Declarative data-driven coordination (2011) (0)
- Session details: Keynote address 3 (2011) (0)
- Scalable Winner Determination in Advertising Auctions (2007) (0)
- Degrees of Consistency (2009) (0)
- Scalable Simulations of Dynamics of Relationships (2009) (0)
- a deployment advisor for public clouds (2015) (0)
- Guest Editorial: Special Section on the International Conference on Data Engineering (2014) (0)
- Cross-media Information Retrieval (2009) (0)
- Stateful Publish-Subscribe for XML Data Streams: Final Performance Report 2006-2009 (2009) (0)
- Database Techniques to Improve Scientific Simulations (2009) (0)
- iBox (2020) (0)
- SENSOR: Data-Driven Sensor Networks Final Report (September 2003 to August 2006): Activities and Findings (2007) (0)
- DBMS Interface (2009) (0)
- Storage Array (2009) (0)
- Flexible Decision Support in Device-Saturated Environments (2003) (0)
- An Intuition of the Necessitate of Column-Oriented Database Systems (2017) (0)
- Technical Perspective (2020) (0)
- Analyzing Data Streams in Scientific Applications (2009) (0)
- Playing games with databases (2011) (0)
- Special Section on the International Conference on Data Engineering 2015 (2017) (0)
- Letter from the TCDE Awards Committee (2019) (0)
- ClouDiA: a deployment advisor for public clouds (2015) (0)
- ETL Process (2009) (0)
- Parallelizing Data-Centric Programs (2013) (0)
- Latency-Optimized Checkpoint Recovery Algorithms for Massively Multiplayer Online Games (2010) (0)
- Stateful Publish-Subscribe for XML Data Streams (2009) (0)
- Lumos (2020) (0)
- Smarter, more powerful scripting languages will improve game performance while making gameplay development more efficient (0)
- Further study on the development of improved interconnection techniques for silicon solar-cell arrays. Quarterly report, 1 May--31 July 1969 (1970) (0)
- ER Model (2009) (0)
- Coordination through Querying in the Youtopia System [Demonstration paper] (2011) (0)
- Conflict Serializability (2009) (0)
- Balancing Isolation and Sharing of Data in Third-Party Extensible App Ecosystems (2014) (0)
- Query Processing with Heterogeneous Resources (Technical Report) (2000) (0)
- Statistical Disclosure Control (SDC) (2009) (0)
- SQL: Queries, Programming, Triggers (2006) (0)
- An Indexing Framework for Structured P2P Systems (2005) (0)
- Developing, Optimizing and Hosting Data-driven Web Applications (2008) (0)
- DBMS Component (2009) (0)
- Randomization Methods to Ensure Data Privacy (2009) (0)
- Service Front-End PSLAManager System Model PerfEnforce Query Scheduling Cluster Provisioning Data Ingest (2018) (0)
- A Quantitative Evaluation Framework for Missing Value Imputation Algorithms (2013) (0)
What Schools Are Affiliated With Johannes Gehrke?
Johannes Gehrke is affiliated with the following schools: