Surajit Chaudhuri
#116,271
Most Influential Person Now
Surajit Chaudhuri's AcademicInfluence.com Rankings
Surajit Chaudhuri's Computer Science Degrees
Computer Science
#4546
World Rank
#4796
Historical Rank
Information Technology
#23
World Rank
#24
Historical Rank
Database
#1761
World Rank
#1846
Historical Rank

Computer Science
Surajit Chaudhuri's Degrees
- PhD Computer Science Stanford University
Why Is Surajit Chaudhuri Influential?
Surajit Chaudhuri's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- An overview of data warehousing and OLAP technology (1997) (2729)
- Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals (1996) (2529)
- DBXplorer: a system for keyword-based search over relational databases (2002) (875)
- An overview of business intelligence technology (2011) (769)
- Automated Selection of Materialized Views and Indexes in SQL Databases (2000) (680)
- An overview of query optimization in relational systems (1998) (647)
- A Primitive Operator for Similarity Joins in Data Cleaning (2006) (615)
- Robust and efficient fuzzy match for online data cleaning (2003) (554)
- Optimizing queries with materialized views (1995) (481)
- Eliminating Fuzzy Duplicates in Data Warehouses (2002) (464)
- An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server (1997) (426)
- On random sampling over joins (1999) (366)
- Maintenance of Materialized Views: Problems, Techniques, and Applications. (1995) (352)
- STHoles: a multidimensional workload-aware histogram (2001) (343)
- Evaluating Top-k Selection Queries (1999) (338)
- Self-Tuning Database Systems: A Decade of Progress (2007) (333)
- Automated Ranking of Database Query Results (2003) (307)
- Self-tuning histograms: building histograms without looking at data (1999) (299)
- AutoAdmin “what-if” index analysis utility (1998) (295)
- Top-k selection queries over relational databases: Mapping strategies and performance evaluation (2002) (287)
- Database tuning advisor for Microsoft SQL Server 2005: demo (2005) (280)
- Dynamic sample selection for approximate query processing (2003) (280)
- Random sampling for histogram construction: how much is enough? (1998) (280)
- Including Group-By in Query Optimization (1994) (261)
- Towards estimation error guarantees for distinct values (2000) (246)
- InfoGather: entity augmentation and attribute discovery by holistic matching with web tables (2012) (242)
- The Claremont report on database research (2008) (238)
- Optimized stratified sampling for approximate query processing (2007) (235)
- Robust identification of fuzzy duplicates (2005) (231)
- Overview of Data Exploration Techniques (2015) (225)
- Rethinking Database System Architecture: Towards a Self-Tuning RISC-Style Database System (2000) (220)
- Optimization of queries with user-defined predicates (1996) (215)
- Optimization of real conjunctive queries (1993) (199)
- An Online Approach to Physical Design Tuning (2007) (191)
- Probabilistic Ranking of Database Query Results (2004) (180)
- Overcoming limitations of sampling for aggregation queries (2001) (166)
- Extending autocompletion to tolerate errors (2009) (164)
- What next?: a half-dozen data management research goals for big data and the cloud (2012) (162)
- Optimizing top-k selection queries over multimedia repositories (2004) (156)
- Robust Cardinality and Cost Estimation for Skyline Operator (2006) (155)
- Database Technology for Decision Support Systems (2001) (155)
- Probabilistic information retrieval approach for ranking of database query results (2006) (152)
- Automatic physical database tuning: a relaxation-based approach (2005) (150)
- DBXplorer: enabling keyword search over relational databases (2002) (150)
- On the equivalence of recursive and nonrecursive datalog programs (1992) (149)
- Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters (2016) (144)
- Flexible Database Generators (2005) (142)
- Integrating DB and IR Technologies: What is the Sound of One Hand Clapping? (2005) (140)
- Towards a robust query optimizer: a principled and practical approach (2005) (138)
- Estimating progress of execution for SQL queries (2004) (136)
- A robust, optimization-based approach for approximate answering of aggregate queries (2001) (130)
- The Beckman Report on Database Research (2014) (129)
- Approximate Query Processing: No Silver Bullet (2017) (128)
- Exploiting statistics on query expressions for optimization (2002) (126)
- On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases (1998) (125)
- Data Mining and Database Systems: Where is the Intersection? (1998) (123)
- Index selection for databases: a hardness study and a principled heuristic solution (2004) (118)
- Query Optimization in the Presence of Foreign Functions (1993) (116)
- Effective use of block-level sampling in statistics estimation (2004) (110)
- Automatic categorization of query results (2004) (109)
- Robust Estimation of Resource Consumption for SQL Queries using Statistical Techniques (2012) (109)
- Selectivity Estimation for Range Predicates using Lightweight Models (2019) (107)
- An efficient filter for approximate membership checking (2008) (105)
- Example-driven design of efficient record matching queries (2007) (102)
- Discovering queries based on example tuples (2014) (100)
- To tune or not to tune?: a lightweight physical design alerter (2006) (96)
- Transformation-based Framework for Record Matching (2008) (92)
- Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee (2016) (92)
- Automating statistics management for query optimizers (2000) (92)
- ClusterJoin: A Similarity Joins Framework using Map-Reduce (2014) (88)
- Integrating data mining with SQL databases: OLE DB for data mining (2001) (85)
- Fine Grained Authorization Through Predicated Grants (2007) (85)
- Generalization and a framework for query modification (1990) (85)
- Learning String Transformations From Examples (2009) (84)
- Data warehousing and OLAP for decision support (1997) (82)
- Compressing SQL workloads (2002) (82)
- Generating Queries with Cardinality Constraints for DBMS Testing (2006) (79)
- A demonstration of SQLVM: performance isolation in multi-tenant relational database-as-a-service (2013) (75)
- Accelerating Machine Learning Inference with Probabilistic Predicates (2018) (75)
- AI Meets AI: Leveraging Query Executions to Improve Index Recommendations (2019) (73)
- Proceedings : KDD-99 : the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 15-18, 1999, San Diego, California, USA (1999) (72)
- When can we trust progress estimators for SQL queries? (2005) (71)
- Leveraging aggregate constraints for deduplication (2007) (70)
- Optimizing Queries with Aggregate Views (1996) (70)
- Selectivity estimation for string predicates: overcoming the underestimation problem (2004) (69)
- Exploiting web search to generate synonyms for entities (2009) (67)
- S4: Top-k Spreadsheet-Style Search for Query Discovery (2015) (62)
- Interval-based pruning for top-k processing over compressed lists (2011) (62)
- Index merging (1999) (62)
- AutoAdmin: Self-Tuning Database Systems Technology (2006) (57)
- Targeted disambiguation of ad-hoc, homogeneous sets of named entities (2012) (55)
- Scalable classification over SQL databases (1999) (54)
- The Claremont report on database research (2009) (54)
- The Seattle Report on Database Research (2020) (54)
- Join queries with external text sources: execution and optimization techniques (1995) (53)
- A framework for robust discovery of entity synonyms (2012) (53)
- Exploiting web search engines to search structured databases (2009) (52)
- Materialized view and index selection tool for Microsoft SQL server 2000 (2001) (52)
- Foundations of Automated Database Tuning (2005) (51)
- Scalable ad-hoc entity extraction from text collections (2008) (51)
- Keyword querying and Ranking in Databases (2009) (51)
- Proceedings of the 11th ACM Symposium on Cloud Computing (2010) (49)
- Rethinking Query Processing for Energy Efficiency: Slowing Down to Win the Race. (2011) (49)
- Data cleaning in Microsoft SQL Server 2005 (2005) (47)
- Query optimizers: time to rethink the contract? (2009) (47)
- Physical design refinement: The ‘merge-reduce’ approach (2007) (46)
- Mining Document Collections to Facilitate Accurate Approximate Entity Matching (2009) (45)
- Ranking objects based on relationships and fixed associations (2009) (45)
- Automatically Indexing Millions of Databases in Microsoft Azure SQL Database (2019) (44)
- Integration of Data Mining and Relational Databases (2000) (44)
- Constrained physical design tuning (2008) (43)
- Estimating Progress of Long Running SQL Queries (2004) (42)
- Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers (2014) (42)
- Auto-Join: Joining Tables by Leveraging Transformations (2017) (42)
- Database Access Control and Privacy: Is there a common ground? (2011) (42)
- Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations (2018) (41)
- Heavy-tailed distributions and multi-keyword queries (2007) (41)
- The Beckman report on database research (2016) (41)
- Sharing Buffer Pool Memory in Multi-Tenant Relational Database-as-a-Service (2015) (40)
- A pay-as-you-go framework for query execution feedback (2008) (40)
- Efficient evaluation of queries with mining predicates (2002) (40)
- Automating layout of relational databases (2003) (40)
- Power Hints for Query Optimization (2009) (38)
- Integration of Data Mining with Database Technology (2000) (36)
- On relational support for XML publishing: beyond sorting and tagging (2003) (36)
- Optimizing queries over multimedia repositories (1996) (35)
- On the complexity of equivalence between recursive and nonrecursive Datalog programs (1994) (35)
- Microsoft index tuning wizard for SQL Server 7.0 (1998) (34)
- An Overview of Cost-based Optimization of Queries with Aggregates (1995) (33)
- Proceedings of the 2006 ACM SIGMOD international conference on Management of data (2006) (32)
- Self-Tuning Technology in Microsoft SQL Server (1999) (32)
- SQLCM: a continuous monitoring framework for relational database engines (2004) (31)
- Variance aware optimization of parameterized queries (2010) (31)
- Incorporating string transformations in record matching (2008) (29)
- Fast Foreign-Key Detection in Microsoft SQL Server PowerPivot for Excel (2014) (29)
- Primitives for Workload Summarization and Implications for SQL (2003) (28)
- Stop-and-Restart Style Execution for Long Running Decision Support Queries (2007) (28)
- Avoiding Retrieval Contention for Composite Multimedia Objects (1998) (27)
- Can Datalog be approximated? (1994) (27)
- Storing XML (with XSD) in SQL databases: interplay of logical and physical designs (2004) (27)
- Efficiently approximating selectivity functions using low overhead regression models (2020) (25)
- Exact Cardinality Query Optimization for Optimizer Testing (2009) (24)
- Operator and Query Progress Estimation in Microsoft SQL Server Live Query Statistics (2016) (23)
- Privacy preservation of aggregates in hidden databases: why and how? (2009) (23)
- A Database Striptease or How to Manage Your Personal Databases (2003) (22)
- Conditional selectivity for statistics on query expressions (2004) (22)
- Data services leveraging Bing's data assets (2016) (21)
- A Statistical Approach Towards Robust Progress Estimation (2011) (20)
- Efficient creation of statistics over query expressions (2003) (20)
- Temporal Relationships in Databases (1988) (20)
- Data Debugger: An Operator-Centric Approach for Data Quality Solutions (2006) (20)
- Microsoft Index Tuning Wizard for SQL Server 7.0 (1998) (20)
- Self-Managing Technology in Database Management Systems (2004) (19)
- Retrieval of Composite Multimedia Objects (1995) (18)
- Bridging the Application and DBMS Profiling Divide for Database Application Developers (2007) (18)
- Diagnosing Estimation Errors in Page Counts Using Execution Feedback (2008) (18)
- Data Warehousing and OLAP for Decision Support (Tutorial) (1997) (18)
- COMPARE (2019) (18)
- How Different is Big Data? (2012) (16)
- Self-Management Technology in Databases (2009) (16)
- AutoAdmin Project at Microsoft Research: Lessons Learned (2011) (16)
- Interactive physical design tuning (2010) (16)
- Factorizing complex predicates in queries to exploit indexes (2003) (16)
- Physical Design Refinement: The "Merge-Reduce" Approach (2006) (15)
- Storage and retrieval of XML data using relational databases (2001) (15)
- On Sampling and Relational Operators (1999) (15)
- Transform-Data-by-Example (TDE): Extensible Data Transformation in Excel (2018) (15)
- Pushing Data-Induced Predicates Through Joins in Big-Data Clusters (2019) (14)
- Finding nonrecursive envelopes for Datalog predicate (1993) (13)
- Towards a Domain Independent Platform for Data Cleaning (2011) (13)
- Distill (2022) (12)
- Experiences with Approximating Queries in Microsoft's Production Big-Data Clusters (2019) (12)
- Plan Stitch: Harnessing the Best of Many Plans (2018) (12)
- Managing Objects in a Relational Framework (1989) (12)
- On Scheduling Atomic and Composite Continuous Media Objects (2002) (11)
- Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches (2018) (11)
- Data Confidentiality (2009) (11)
- Finding Nonrecursive Envelopes for Datalog Predicates. (1992) (11)
- Experiences with using Data Cleaning Technology for Bing Services (2012) (10)
- AutoAdmin 'What-if' Index Analysis Utility (1998) (10)
- Efficient Identification of Approximate Best Configuration of Training in Large Datasets (2018) (10)
- Data services for E-tailers leveraging web search engine assets (2013) (10)
- Online autoadmin: (physical design tuning) (2007) (9)
- Database Tuning using Online Algorithms (2009) (9)
- EntityTagger: automatically tagging entities with descriptive phrases (2011) (9)
- Interactive plan hints for query optimization (2009) (9)
- New frontiers in business intelligence (2011) (9)
- Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples (2021) (9)
- Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 15-18, 1999 (1999) (9)
- TableQnA: Answering List Intent Queries With Web Tables (2020) (8)
- Bridging the application and DBMS divide using static analysis and dynamic profiling (2009) (8)
- Extracting predicates from mining models for efficient query evaluation (2004) (7)
- Bitvector-aware Query Optimization for Decision Support Queries (2020) (7)
- Data Mining and its Role in Database Systems (2000) (7)
- Auto-Transform: Learning-to-Transform by Patterns (2020) (7)
- The Next 5 Years: What Opportunities Should the Database Community Seize to Maximize its Impact? (2020) (7)
- Customizable and Scalable Fuzzy Join for Big Data (2019) (6)
- Cloud databases (2010) (6)
- Technical perspective: Relational query optimization: data management meets statistical estimation (2009) (6)
- Performance of Multiattribute Top-K Queries on Relational Systems (2000) (6)
- Leveraging Re-costing for Online Optimization of Parameterized Queries with Guarantees (2017) (6)
- Cloud Data Services: Workloads, Architectures and Multi-Tenancy (2021) (6)
- Issues in Network Management in the Next Millennium (1999) (5)
- Database Tuning using Combinatorial Search (2009) (5)
- Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC 2010, Indianapolis, Indiana, USA, June 10-11, 2010 (2010) (5)
- KDD-99: the fifth ACM SIGKDD international conference on knowledge discovery and data mining (2000) (5)
- AutoTag 'n Search My Photos: Leveraging the Social Graph for Photo Tagging (2015) (5)
- Data Mining and Knowledge Discovery (1997) (4)
- Service Bus (2009) (4)
- DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems (2021) (4)
- Auto-Pipeline: Synthesize Data Pipelines By-Target Using Reinforcement Learning and Search (2021) (4)
- Reminiscences on influential papers (2005) (4)
- Leveraging Query Logs and Machine Learning for Parametric Query Optimization (2021) (3)
- Bulletin of the Technical Committee on (1999) (3)
- The Claremont Report on Database (3)
- COMPARE: Accelerating Groupwise Comparison in Relational Databases for Data Analytics (2021) (3)
- Petabytes to science (2019) (3)
- Rule profiling for query optimizers and their implications (2010) (3)
- Pre-training Summarization Models of Structured Datasets for Cardinality Estimation (2021) (3)
- Database Application Developer Tools Using Static Analysis and Dynamic Profiling (2014) (3)
- Budget-aware Index Tuning with Reinforcement Learning (2022) (3)
- Correction to 'Automating Statistics Management for Query Optimizers' (2001) (2)
- DataSlicer: Task-Based Data Selection for Visual Data Exploration (2017) (2)
- Database Protection (2009) (2)
- Letter from the Special Issue Editor (1998) (2)
- DSB (2021) (2)
- Optimizing Queries over Multimedia Repositories Bulletin of the IEEE Computer Society Technical Committee on Data Engineering (1996) (2)
- Databases and IR: Perspectives of a SQL Guy (2003) (2)
- Hyperspace: The Indexing Subsystem of Azure Synapse (2021) (2)
- Self-Tuning Database Systems (2002) (2)
- Review - Integrating Mining with Relational Database Systems: Alternatives and Implications (2000) (2)
- Information at your Fingertips: Only a dream for enterprises? (2015) (2)
- The Seattle report on database research (2022) (2)
- Interactive Demonstration of Probabilistic Predicates (2018) (2)
- Database Research: Lead, Follow, or Get Out of the Way? - Panel Abstract (1996) (2)
- Physical Design Refinement: The 'Merge-Reduce' Approach (2007) (2)
- Data Manipulation Language (2009) (2)
- Data management technology for decision support systems (2004) (2)
- Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities (2022) (1)
- Secret-Key Encryption (2009) (1)
- Query portals: dynamically generating portals for entity-oriented web queries (2010) (1)
- Review - Of Nests and Trees: A Unified Approach to Processing Queries That Contain Nested Subqueries, Aggregates, and Quantifiers (2000) (1)
- Document Identifier (2020) (1)
- Proceedings of the ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, June 27-29, 2006 (2006) (1)
- PACk: An Efficient Partition-based Distributed Agglomerative Hierarchical Clustering Algorithm for Deduplication (2022) (1)
- Detecting redundant tuples during query evaluation (1991) (1)
- Query Portals: Dynamically Generating Portals for Web (2009) (1)
- Architectural Concepts for Large Knowledge Bases (1987) (1)
- Introduction to ACM SIGMOD 2006 conference papers (2007) (1)
- To do or not to do: extending SQL with integer linear programming?: technical perspective (2019) (1)
- The Microsoft database research group (1998) (1)
- Database Tuning using Trade-off Elimination (2009) (1)
- PACk (2022) (1)
- Report on the Second International Workshop on Self-Managing Database Systems (SMDB 2007) (2007) (1)
- Where is Business Intelligence taking today's Database Systems? (2004) (1)
- Statistical Decision Techniques (2009) (1)
- Foundations of Automated Database Tuning (Tutorial) (2006) (1)
- Query Optimization at the Crossroads (Panel) (1997) (1)
- Service Choreography (2009) (0)
- DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning (2022) (0)
- Call for Participation First ACM Symposium on Cloud Computing (SOCC) (2010) (0)
- Data warehousing (2003) (0)
- Digital Surface Model (2009) (0)
- VLDB Endowment Board of Trustees (2007) (0)
- Cloud data systems (2022) (0)
- Service Item (2020) (0)
- Review - Physical Database Design for Relational Databases (2000) (0)
- Special issue on best papers of VLDB 2016 (2018) (0)
- Distance-Preserving Mapping (2009) (0)
- Decentralized Data Integration System (2009) (0)
- Data Perturbation (2009) (0)
- Call for Papers First ACM Symposium on Cloud Computing (SOCC) (2010) (0)
- Structured data and web documents: better together? (2009) (0)
- High-Performance Row Pattern Recognition Using Joins (2023) (0)
- Data Types for Moving Objects (2009) (0)
- Query optimization at the crossroads (1997) (0)
- Degrees of Consistency (2009) (0)
- Storage Array (2009) (0)
- New And Forgotten Dreams In Database Research (1997) (0)
- Session details: Keynote 2 (2010) (0)
- New and Forgotten Dreams in Database Research (Panel) (1997) (0)
- Issues in Decision Support over SQL Databases (1997) (0)
- Automatically Tagging Entities with Descriptive Phrases (2011) (0)
- Method and apparatus for query optimization in a relational database system with external functions (1994) (0)
- Cloud Data Systems: What are the Opportunities for the Database Research Community? (2022) (0)
- Technical perspective: To do or not to do: extending SQL with integer linear programming? (2019) (0)
- Hyperspace (2021) (0)
- Self-tuning Database Systems: Past, Present and Future (2008) (0)
- Guest Editors Introduction: Special Section on Keyword Search on Structured Data (2011) (0)
- ABC: Efficient Selection of Machine Learning Configuration on Large Dataset (2018) (0)
- An Axiomatic Framework for Discovering Entity Synonyms (2011) (0)
- Guest Editorial (2004) (0)
- Spatial Analysis (2018) (0)
- Inquiry optimization method and device in relational data base system provided with external function (1994) (0)
- Data-induced predicates for sideways information passing in query optimizers (2021) (0)
- Technical Perspective: Reflections on Extending SQL using Constraints (2017) (0)
- Keyword Search on Structured Data (2011) (0)
- Future Directions in Database Research (Panel) (1998) (0)
- Data Cache (2009) (0)
- Diagnosing Estimation Errors in Page Counts Using Execution Feedback (2008) (0)
- Table of Contents (pdf) (2007) (0)
- Statistical Disclosure Control (SDC) (2009) (0)
- Accurate Query Optimization by Sub-plan Memoization (1999) (0)
- Foundations of Automated Database Tuning (2006) (0)
- Big Data and Enterprise Analytics (2013) (0)
- Database Types: A Plea for Simplicity (A Naive Semantics of Subtyping) (1990) (0)
- BI technologies are essential to running today's businesses and this technology is going through sea changes (0)
- Data Errors (2022) (0)
- Runtime redundancy properties of datalog programs (1994) (0)
- Data and knowledge in database systems: multidimensional databases and online analytical processing (2002) (0)
- Decision Support Queries: A solved problem? (2008) (0)
- Review - Querying Multiple Features of Groups in Relational Databases (2000) (0)
- International Workshop on Human-In-the-Loop Data Analytics (HILDA) (2019) (0)
- Session details: Applications (2010) (0)
- Exploiting Web Search Engines to Search Structured Information (2009) (0)
- A Case Study on A Miner Dataset: Identifying leading research through various Models (2019) (0)
- Auto-Tag: Tagging-Data-By-Example in Data Lakes (2021) (0)
- Data Tracking (2009) (0)