Sunita Sarawagi
#157,426
Most Influential Person Now
Sunita Sarawagi's AcademicInfluence.com Rankings
Sunita Sarawagi computer-science Degrees
- Computer Science: World Rank #8747, Historical Rank #9197
- Information Technology: World Rank #61, Historical Rank #62
- Information Management: World Rank #104, Historical Rank #106
- Machine Learning: World Rank #3665, Historical Rank #3711

Computer Science
Sunita Sarawagi's Degrees
- PhD Computer Science Stanford University
Sunita Sarawagi's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- Interactive deduplication using active learning (2002) (769)
- Discriminative Methods for Multi-labeled Classification (2004) (763)
- Learning with Graphical Models (2008) (757)
- Semi-Markov Conditional Random Fields for Information Extraction (2004) (734)
- On the Computation of Multidimensional Aggregates (1996) (643)
- Modeling multidimensional databases (1997) (553)
- Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications (1998) (489)
- Annotating and searching web tables using entities, types and relationships (2010) (421)
- Information Extraction (2008) (411)
- Discovery-Driven Exploration of OLAP Data Cubes (1998) (393)
- Efficient set joins on similarity predicates (2004) (375)
- Record linkage: similarity measures and algorithms (2006) (348)
- Generalizing Across Domains via Cross-Gradient Training (2018) (338)
- Automatically Extracting Structure from Free Text Addresses (2000) (315)
- Efficient organization of large multidimensional arrays (1994) (304)
- Automatic segmentation of text into structured records (2001) (265)
- Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods (2004) (262)
- The Claremont report on database research (2008) (238)
- Mining Surprising Patterns Using Temporal Description Length (1998) (163)
- Creating probabilistic databases from information extraction models (2006) (163)
- Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings (2018) (153)
- User-Adaptive Exploration of Multidimensional Data (2000) (151)
- Explaining Differences in Multidimensional Aggregates (1999) (146)
- Answering Table Queries on the Web using Column Keywords (2012) (140)
- Domain Adaptation of Conditional Probability Models Via Feature Subsetting (2007) (130)
- Integrating Unstructured Data into Relational Databases (2006) (125)
- Intelligent Rollups in Multidimensional OLAP Data (2001) (124)
- Answering Table Augmentation Queries from Unstructured Lists on the Web (2009) (119)
- Functional sites in protein families uncovered via an objective and automated graph theoretic approach. (2003) (111)
- Efficient Domain Generalization via Common-Specific Low-Rank Decomposition (2020) (111)
- Indexing OLAP Data (1997) (111)
- Integrating Mining with Relational Database Systems: Alternatives and Implications (1998) (109)
- Sequence Data Mining (2005) (108)
- Parallel Iterative Edit Models for Local Sequence Transduction (2019) (84)
- On computing the data cube (1996) (82)
- Mining Generalized Association Rules and Sequential Patterns Using SQL Queries (1998) (76)
- Document Classification Through Interactive Supervision of Document and Term Labels (2004) (69)
- Scaling multi-class support vector machines using inter-class confusion (2002) (67)
- Efficient Batch Top-k Search for Dictionary-based Entity Recognition (2006) (67)
- Calibration of Encoder Decoder Models for Neural Machine Translation (2019) (63)
- Cross-training: learning probabilistic mappings between topics (2003) (62)
- Efficient inference with cardinality-based clique potentials (2007) (62)
- Query Processing in Tertiary Memory Databases (1995) (60)
- Maximum Mean Discrepancy for Class Ratio Estimation: Convergence Bounds and Kernel Selection (2014) (58)
- Learning from Rules Generalizing Labeled Exemplars (2020) (55)
- The Claremont report on database research (2009) (54)
- Data mining models as services on the internet (2000) (51)
- Open-domain quantity queries on web tables: annotation, response, and consensus models (2014) (50)
- Numerical Relation Extraction with Minimal Supervision (2016) (45)
- Accurate max-margin training for structured output spaces (2008) (42)
- Efficient evaluation of queries with mining predicates (2002) (40)
- Surprisingly Easy Hard-Attention for Sequence to Sequence Learning (2018) (37)
- User-cognizant multidimensional analysis (2001) (35)
- Reordering Query Execution in Tertiary Memory Databases (1996) (35)
- Length bias in Encoder Decoder Models and a Case for Global Conditioning (2016) (35)
- i3: Intelligent, Interactive Investigation of OLAP data cubes (2000) (34)
- i3: intelligent, interactive investigation of OLAP data cubes (2000) (32)
- Automation in Information Extraction and Data Integration (2002) (31)
- Posterior Attention Models for Sequence to Sequence Learning (2018) (28)
- Database systems for efficient access to tertiary memory (1995) (26)
- ARMDN: Associative and Recurrent Mixture Density Networks for eRetail Demand Forecasting (2018) (25)
- Scalable Information Extraction and Integration. (2006) (24)
- Domain adaptation of information extraction models (2009) (23)
- Active Evaluation of Classifiers on Large Datasets (2012) (23)
- iDiff: Informative Summarization of Differences in Multidimensional Aggregates (2001) (23)
- Efficient inference on sequence segmentation models (2006) (23)
- Connectionist Model (2009) (22)
- Joint training for open-domain extraction on the web: exploiting overlap when supervision is limited (2011) (21)
- Probabilistic Graphical Models and their Role in Databases (2007) (21)
- Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008 (2008) (21)
- Discovering Structure in the Universe of Attribute Names (2016) (21)
- Enhancing Search with Structure (2010) (20)
- Higher-order Graphical Models for Classification in Social and Affiliation Networks (2010) (19)
- Biography and Position Statement. (2010) (18)
- Continual Learning with Neural Networks: A Review (2019) (17)
- Correlation Clustering (2009) (17)
- Mining Subjective Properties on the Web (2015) (17)
- Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study (2021) (16)
- What’s in a Name? Are BERT Named Entity Representations just as Good for any other Name? (2020) (16)
- Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration (2021) (16)
- Factorizing complex predicates in queries to exploit indexes (2003) (16)
- ALIAS: An Active Learning led Interactive Deduplication System (2002) (15)
- Conditional Routing (2009) (14)
- Single Query Optimization for Tertiary Memory (1993) (13)
- Robust Data Programming with Precision-guided Labeling Functions (2020) (13)
- Missing Value Imputation on Multidimensional Time Series (2021) (13)
- Training Data Augmentation for Code-Mixed Translation (2021) (13)
- A few good predictions: selective node labeling in a social network (2014) (12)
- Inter-class relationships in text classification (2006) (10)
- Learning to extract information from large websites using sequential models (2005) (10)
- Letter from the Special Issue Editor (2000) (9)
- Focus on the Common Good: Group Distributional Robustness Follows (2021) (9)
- Streaming Adaptation of Deep Forecasting Models using Adaptive Recurrent Units (2019) (9)
- Data Programming using Continuous and Quality-Guided Labeling Functions (2019) (8)
- Long Horizon Forecasting with Temporal Point Processes (2021) (8)
- Black-box Adaptation of ASR for Accented Speech (2020) (8)
- Resolving citations in a paper repository (2003) (8)
- Collective Inference for Extraction MRFs Coupled with Symmetric Clique Potentials (2010) (8)
- Execution Reordering for Tertiary Memory Access (1997) (7)
- Privacy-preserving Class Ratio Estimation (2016) (7)
- Error-Driven Fixed-Budget ASR Personalization for Accented Speakers (2021) (7)
- Deep Indexed Active Learning for Matching Heterogeneous Entity Representations (2021) (7)
- Efficient Organization of Large Multidimensional (1993) (7)
- Extracting predicates from mining models for efficient query evaluation (2004) (7)
- Answering Web Questions Using Structured Data - Dream or Reality? (2009) (6)
- Efficient top-k count queries over imprecise duplicates (2009) (6)
- Joint Structured Models for Extraction from Overlapping Sources (2010) (6)
- Efficient set joins on similarity predicates (2004) (5)
- Models and Indices for Integrating Unstructured Data with a Relational Database (2004) (5)
- Content-based Retrieval (2009) (5)
- Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time (2021) (5)
- Labeled Memory Networks for Online Model Adaptation (2017) (5)
- Generalized Collective Inference with Symmetric Clique Potentials (2009) (4)
- Conceptual Data Model (2009) (4)
- Queries over Unstructured Data: Probabilistic Methods to the Rescue - (Keynote) (2009) (4)
- Text classification with evolving label-sets (2005) (4)
- Sequence data mining techniques and applications (2003) (3)
- Learning to extract information from large domain-specific websites using sequential models (2004) (3)
- Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages (2022) (3)
- The Claremont Report on Database (3)
- Scaling up the ALIAS duplicate elimination system: a demonstration (2003) (3)
- Label Organized Memory Augmented Neural Network (2017) (3)
- MAP estimation in Binary MRFs via Bipartite Multi-cuts (2010) (3)
- Computer Human Interaction (CHI) (2009) (2)
- Building Classifiers With Unrepresentative Training Instances: Experiences From The KDD Cup 2001 Competition (2002) (2)
- HIClass: Hyper-interactive Text Classification by Interactive Supervision of Document and Term Labels (2004) (2)
- Scaling up the ALIAS Duplicate Elimination System. (2003) (2)
- Special issue on best papers of VLDB 2011 (2013) (1)
- Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding (2022) (1)
- Bootstrapping Multilingual Semantic Parsers using Large Language Models (2022) (1)
- Indexing OLAP Data, Bulletin of the IEEE Computer Society Technical Committee on Data Engineering (1996) (1)
- Adaptive Discounting of Implicit Language Models in RNN-Transducers (2022) (1)
- Cache Performance (2009) (1)
- Data-based research at IIT Bombay (2013) (1)
- Letter from the Research Track Co-Chair (2011) (0)
- Clinical Data Management Systems (2009) (0)
- Semi-Markov Models for Named Entity Recognition (2006) (0)
- Querying for relations from the semi-structured Web (2009) (0)
- Database Mining Integration Seminar Report (2007) (0)
- Structured Case-based Reasoning for Inference-time Adaptation of Text-to-SQL parsers (2023) (0)
- Quality Scoring of Source Words in Neural Translation Models (2022) (0)
- Learning Recourse on Instance Environment to Enhance Prediction Accuracy (2022) (0)
- Computer-based Provider Order Entry (2009) (0)
- Letter from the VLDB 2011 Research Track Co-Chair (2011) (0)
- NLP Service APIs and Models for Efficient Registration of New Clients (2020) (0)
- SIGMOD Officers, Committees, and Awardees (continued) (2012) (0)
- VLDB Endowment Board of Trustees (2007) (0)
- Sequence Segmentation Using Semi-Markov Conditional Random Fields (2019) (0)
- NEURAL MACHINE TRANSLATION (2019) (0)
- Occurrence Statistics of Entities, Relations and Types on the Web (2016) (0)
- CROSS-GRADIENT TRAINING (2018) (0)
- Cross-media Information Retrieval (2009) (0)
- Editorial (1985) (0)
- Cooperative Content Distribution (2009) (0)
- Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers (2022) (0)
- Column Segmentation (2009) (0)
- SIGMOD Officers, Committees, and Awardees Chair Vice-Chair Secretary / (2009) (0)
- Long Range Probabilistic Forecasting in Time-Series using High Order Statistics (2021) (0)
- Special issue on best papers of VLDB 2011 (2012) (0)
- Adapting Multilingual Models for Code-Mixed Translation (2022) (0)
- Conceptual Modeling (1999) (0)
- Classification with Evolving Label-sets (2005) (0)
- Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings (2019) (0)
- Coherent Probabilistic Aggregate Queries on Long-horizon Forecasts (2021) (0)
- Efficient graphical models for sequence segmentation (2005) (0)
- Practical methods of Active Learning (2012) (0)
- Cross-lingual Text Mining (2009) (0)
- Reminiscences on Influential Papers (2001) (0)
- AI and data science centers in top Indian academic institutions (2022) (0)
- Classification Tree (2009) (0)
- Deep Learning Methods for Classification with Limited Training Data Seminar Report: Spring 2017 (2017) (0)
- Shared Task Organizing Committee -transliteration Mining: Whitepaper of News 2010 Shared Task on Transliteration Generation Transliteration Generation and Mining with Limited Training Resources Transliteration Using a Phrase-based Statistical Machine Translation System to Re-score the Output of a Jo (2010) (0)
- Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations (2021) (0)
- Current Time (2009) (0)
- Editorial (1998) (0)
- Computational Ontology (2009) (0)
- Conflict Serializability (2009) (0)