Haixun Wang
#147,118
Most Influential Person Now
Haixun Wang's AcademicInfluence.com Rankings
Haixun Wang's Computer Science Degrees
Computer Science
#7467
World Rank
#7864
Historical Rank
Data Mining
#178
World Rank
#179
Historical Rank
Machine Learning
#2758
World Rank
#2793
Historical Rank
Database
#4516
World Rank
#4695
Historical Rank

Computer Science
Haixun Wang's Degrees
- PhD Computer Science Stanford University
- Masters Computer Science Stanford University
- Bachelors Computer Science Tsinghua University
Why Is Haixun Wang Influential?
Haixun Wang's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- Mining concept-drifting data streams using ensemble classifiers (2003) (1468)
- Probase: a probabilistic taxonomy for text understanding (2012) (778)
- BLINKS: ranked keyword searches on graphs (2007) (627)
- Clustering by pattern similarity in large data sets (2002) (571)
- Managing and Mining Graph Data (2010) (546)
- Trinity: a distributed graph engine on a memory cloud (2013) (463)
- Efficient Subgraph Matching on Billion Node Graphs (2012) (375)
- δ-clusters: capturing subspace correlation in a large data set (2002) (364)
- Moment: maintaining closed frequent itemsets over a stream sliding window (2004) (354)
- Enhanced biclustering on expression data (2003) (346)
- ViST: a dynamic index method for querying XML data by tree structures (2003) (343)
- A Distributed Graph Engine for Web Scale RDF Data (2013) (343)
- Landmarks: a new model for similarity-based pattern querying in time series databases (2000) (325)
- Dual Labeling: Answering Graph Reachability Queries in Constant Time (2006) (258)
- Natural language question answering over RDF: a graph data driven approach (2014) (250)
- Short Text Conceptualization Using a Probabilistic Knowledgebase (2011) (248)
- Integrity Auditing of Outsourced Data (2007) (231)
- Local search of communities in large graphs (2014) (225)
- KBQA: Learning Question Answering over QA Corpora and Knowledge Bases (2019) (202)
- Catch the moment: maintaining closed frequent itemsets over a data stream sliding window (2006) (190)
- Distance-Constraint Reachability Computation in Uncertain Graphs (2011) (180)
- Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (2018) (176)
- Understanding Tables on the Web (2012) (173)
- Fast Graph Pattern Matching (2008) (173)
- Efficiently answering reachability queries on very large directed graphs (2008) (172)
- Online search of overlapping communities (2013) (165)
- Query Languages and Data Models for Database Sequences and Data Streams (2004) (163)
- Automatic taxonomy construction from keywords (2012) (158)
- Is random model better? On its accuracy and efficiency (2003) (156)
- GString: A Novel Approach for Efficient Search in Graph Databases (2007) (153)
- Knowledge-Based Approaches to Concept-Level Sentiment Analysis (2013) (135)
- Short text understanding through lexical-semantic analysis (2015) (134)
- Online Mining of Changes from Data Streams: Research Problems and Preliminary Results (2003) (130)
- MaPle: a fast algorithm for maximal pattern-based clustering (2003) (128)
- Active Mining of Data Streams (2004) (126)
- Semantic Multidimensional Scaling for Open-Domain Sentiment Analysis (2014) (114)
- Computing label-constraint reachability in graph databases (2010) (107)
- On the sequencing of tree structures for XML indexing (2005) (106)
- How to partition a billion-node graph (2014) (106)
- ATLAS: A Small but Complete SQL Extension for Data Mining and Data Streams (2003) (104)
- A data stream language and system designed for power and extensibility (2006) (104)
- Leveraging spatio-temporal redundancy for RFID data cleansing (2010) (98)
- Fast Computation of Reachability Labeling for Large Graphs (2006) (98)
- Online Anomaly Prediction for Robust Cluster Systems (2009) (97)
- Guest Editorial: Big Social Data Analysis (2014) (95)
- Fast computing reachability labelings for large graphs with high compression rate (2008) (94)
- Discovering Frequent Closed Partial Orders from Strings (2006) (92)
- Challenges and Experience in Prototyping a Multi-Modal Stream Analytic and Monitoring Application on System S (2007) (91)
- A Survey of Clustering Algorithms for Graph Data (2010) (90)
- Efficient subgraph search over large uncertain graphs (2011) (87)
- Statistical Approaches to Concept-Level Sentiment Analysis (2013) (83)
- Graph Data Management and Mining: A Survey of Algorithms and Applications (2010) (83)
- Learning Term Embeddings for Hypernymy Identification (2015) (81)
- Beyond ten blue links: enabling user click modeling in federated web search (2012) (80)
- Understand Short Texts by Harvesting and Analyzing Semantic Knowledge (2017) (79)
- Adaptive system anomaly prediction for large-scale hosting infrastructures (2010) (78)
- Computing Compressed Multidimensional Skyline Cubes Efficiently (2007) (78)
- Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases (2012) (78)
- An Improved Biclustering Method for Analyzing Gene Expression Profiles (2005) (78)
- Path-tree: An efficient reachability indexing scheme for large directed graphs (2011) (78)
- The Trinity Graph Engine (2012) (78)
- Finding semantics in time series (2011) (72)
- Query Understanding through Knowledge-Based Conceptualization (2015) (71)
- Pruning and dynamic scheduling of cost-sensitive ensembles (2002) (70)
- Computing term similarity by large probabilistic isA knowledge (2013) (69)
- Using SQL to Build New Aggregates and Extenders for Object-Relational Systems (2000) (68)
- Attribute extraction and scoring: A probabilistic approach (2013) (68)
- A Survey of Algorithms for Keyword Search on Graph Data (2010) (67)
- ATLaS: A Native Extension of SQL for Data Mining (2003) (67)
- The deductive database system LDL++ (2002) (66)
- Finding global icebergs over distributed data sets (2006) (65)
- Managing and mining large graphs: systems and implementations (2012) (65)
- Providing freshness guarantees for outsourced databases (2008) (64)
- Text Mining in Social Networks (2011) (63)
- Loadstar: A Load Shedding Scheme for Classifying Data Streams (2005) (63)
- Dual encryption for query integrity assurance (2008) (60)
- Mining Data Streams (2005) (59)
- Context-Dependent Conceptualization (2013) (59)
- Location-Based Spatial Query Processing with Data Sharing in Wireless Broadcast Environments (2008) (58)
- Suppressing model overfitting in mining concept-drifting data streams (2006) (58)
- On Anomalous Hotspot Discovery in Graph Streams (2013) (58)
- Loadstar: Load Shedding in Data Stream Mining (2005) (57)
- MapDupReducer: detecting near duplicates over massive datasets (2010) (57)
- An online cost sensitive decision-making method in crowdsourcing systems (2013) (55)
- An Inference Approach to Basic Level of Categorization (2015) (54)
- Stop Chasing Trends: Discovering High Order Models in Evolving Data (2008) (53)
- K-Reach: Who is in Your Small World (2012) (50)
- Efficient Keyword Search on Uncertain Graph Data (2013) (50)
- Supporting ranking and clustering as generalized order-by and group-by (2007) (49)
- Compact reachability labeling for graph-structured data (2005) (49)
- An algorithmic approach to event summarization (2010) (46)
- Toward a Distance Oracle for Billion-Node Graphs (2013) (44)
- A fast algorithm for subspace clustering by pattern similarity (2004) (44)
- Query Integrity Assurance of Location-Based Services Accessing Outsourced Spatial Databases (2009) (44)
- A native extension of SQL for mining data streams (2005) (41)
- Adaptive Load Diffusion for Multiway Windowed Stream Joins (2007) (41)
- Advances in Web and Network Technologies, and Information Management, APWeb/WAIM 2007 International Workshops: DBMAN 2007, WebETrends 2007, PAIS 2007 and ASWAN 2007, Huang Shan, China, June 16-18, 2007, Proceedings (2007) (41)
- A Low-Granularity Classifier for Data Streams with Concept Drifts and Biased Class Distribution (2007) (41)
- Tracking and Connecting Topics via Incremental Hierarchical Dirichlet Processes (2011) (41)
- Relational languages and data models for continuous queries on sequences and data streams (2011) (40)
- Web scale taxonomy cleansing (2011) (39)
- Indexing weighted-sequences in large databases (2003) (39)
- Identifying users' topical tasks in web search (2013) (38)
- A Sampling-Based Approach to Information Recovery (2008) (38)
- Event summarization for system management (2007) (37)
- Understanding Short Texts through Semantic Enrichment and Hashing (2016) (37)
- Understanding Short Texts (2013) (34)
- User defined aggregates in object-relational systems (2000) (34)
- Incorporating post-click behaviors into a click model (2010) (33)
- Towards a Probabilistic Taxonomy of Many Concepts (2011) (32)
- CMP: a fast decision tree classifier using multivariate predictions (2000) (31)
- A Bayesian Inference-Based Framework for RFID Data Cleansing (2013) (31)
- Logic-Based User-Defined Aggregates for the Next Generation of Database Systems (1999) (31)
- A Large Probabilistic Semantic Network Based Approach to Compute Term Similarity (2015) (31)
- A query integrity assurance scheme for accessing outsourced spatial databases (2012) (31)
- Graph similarity search on large uncertain graph databases (2015) (30)
- Mining Concept-Drifting Data Streams (2010) (30)
- Clustering by Pattern Similarity (2008) (29)
- Open Domain Short Text Conceptualization: A Generative + Descriptive Modeling Approach (2015) (29)
- Efficiently Monitoring Top-k Pairs over Sliding Windows (2012) (29)
- Graph-Based Wrong IsA Relation Detection in a Large-Scale Lexical Taxonomy (2017) (28)
- Automatic Taxonomy Construction from Keywords via Scalable Bayesian Rose Trees (2015) (26)
- Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model (2021) (26)
- Wikification via link co-occurrence (2013) (26)
- Isanette: A Common and Common Sense Knowledge Base for Opinion Mining (2011) (25)
- Semantic queries by example (2013) (24)
- Nonmonotonic reasoning in LDL (2000) (24)
- Automatic extraction of top-k lists from the web (2013) (24)
- A Balanced Ensemble Approach to Weighting Classifiers for Text Classification (2006) (24)
- Head, modifier, and constraint detection in short texts (2014) (23)
- On Conceptual Labeling of a Bag of Words (2015) (22)
- Discovery in multi-attribute data with user-defined constraints (2002) (22)
- Data-Driven Metaphor Recognition and Explanation (2013) (22)
- Empirical comparison of various reinforcement learning strategies for sequential targeted marketing (2002) (22)
- On reducing classifier granularity in mining concept-drifting data streams (2005) (21)
- The S2-Tree: An Index Structure for Subsequence Matching of Spatial Objects (2001) (21)
- Understanding short texts through semantic enrichment and hashing (2016) (21)
- Mining Extremely Skewed Trading Anomalies (2004) (21)
- Proceedings of the 21st ACM international conference on Information and knowledge management (2012) (20)
- Efficiently mining frequent closed partial orders (2005) (20)
- Location-based Spatial Queries with Data Sharing in Wireless Broadcast Environments (2007) (20)
- User-Defined Aggregates in Database Languages (1999) (20)
- An Integrated Data-Driven Framework for Computing System Management (2010) (18)
- Asymmetric signature schemes for efficient exact edit similarity query processing (2013) (18)
- Hub-Accelerator: Fast and Exact Shortest Path Computation in Large Social Networks (2013) (17)
- Shallow Information Extraction for the knowledge Web (2013) (17)
- Semantic queries in databases: problems and challenges (2009) (17)
- XSeq: an indexing infrastructure for tree pattern queries (2004) (17)
- Entity Disambiguation based on a Probabilistic Taxonomy (2011) (17)
- Proxies for Shortest Path and Distance Queries (2016) (17)
- A Transfer-Learnable Natural Language Interface for Databases (2018) (17)
- On the Transitivity of Hypernym-Hyponym Relations in Data-Driven Lexical Taxonomies (2017) (17)
- Optimizing index for taxonomy keyword search (2012) (16)
- Lock-free consistency control for web 2.0 applications (2008) (16)
- Optimizing Timestamp Management in Data Stream Management Systems (2007) (16)
- Efficient processing of k-hop reachability queries (2014) (16)
- A unified approach for computing top-k pairs in multidimensional space (2011) (16)
- G-SQL: Fast Query Processing via Graph Exploration (2016) (15)
- A system for extracting top-K lists from the web (2012) (15)
- Demand-driven frequent itemset mining using pattern structures (2005) (15)
- LinkProbe: Probabilistic inference on large-scale social networks (2013) (15)
- Toward Topic Search on the Web (2011) (15)
- Semantic Data Management: Towards Querying Data with their Meaning (2007) (14)
- Probase+: Inferring Missing Links in Conceptual Taxonomies (2017) (14)
- Transfer Understanding from Head Queries to Tail Queries (2014) (14)
- A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows (2012) (13)
- FARM: a framework for exploring mining spaces with multiple attributes (2001) (13)
- Preference-Based Frequent Pattern Mining (2005) (12)
- A fully distributed framework for cost-sensitive data mining (2002) (12)
- Querying uncertain data with aggregate constraints (2011) (12)
- A Framework for Scalable Cost-sensitive Learning Based on Combining Probabilities and Benefits (2002) (12)
- Efficient Computation of Range Aggregates against Uncertain Location-Based Queries (2012) (12)
- Learning to rank with a novel kernel perceptron method (2009) (12)
- Unifying Data and Domain Knowledge Using Virtual Views (2007) (12)
- Employing Semantic Context for Sparse Information Extraction Assessment (2018) (12)
- Probase: a Universal Knowledge Base for Semantic Search (2010) (11)
- Optimizing content freshness of relations extracted from the web using keyword search (2010) (11)
- Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (Extended Abstract) (2018) (11)
- Online mining of data streams: applications, techniques and progress (2005) (10)
- User Defined Aggregates for Logical Data Languages (1998) (10)
- From Intrinsic to Counterfactual: On the Explainability of Contextualized Recommender Systems (2021) (10)
- ATLaS: a Turing-Complete Extension of SQL for Data Mining Applications and Streams (2002) (9)
- A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows (2014) (9)
- Concept-Based Web Search (2012) (9)
- Learning Defining Features for Categories (2016) (9)
- Scaling Up Markov Logic Probabilistic Inference for Social Graphs (2017) (8)
- User-Defined Aggregates for Datamining (1999) (8)
- On dimensionality reduction of massive graphs for indexing and retrieval (2011) (8)
- Concept Clustering of Evolving Data (2009) (8)
- Database System Extensions for Decision Support: the AXL Approach (2000) (8)
- Extending SQL for Decision Support Applications (2002) (8)
- Load Shedding in Classifying Multi-Source Streaming Data: A Bayes Risk Approach (2007) (7)
- Adversarial Robustness through Bias Variance Decomposition: A New Perspective for Federated Learning (2020) (7)
- Recent progress on selected topics in database research — A report by nine young Chinese researchers working in the United States (2003) (7)
- Verb Pattern: A Probabilistic Semantic Representation on Verbs (2016) (7)
- Web Scale Entity Resolution using Relational Evidence (2011) (7)
- SSDT: a scalable subspace-splitting classifier for biased data (2001) (7)
- Syntactic Parsing of Web Queries (2016) (7)
- SpatialNLI: A Spatial Domain Natural Language Interface to Databases Using Spatial Comprehension (2019) (7)
- An Introduction to Graph Data (2010) (7)
- A Random Method for Quantifying Changing Distributions in Data Streams (2005) (7)
- Fine-Grained Semantic Conceptualization of FrameNet (2016) (6)
- Distance Landmarks Revisited for Road Graphs (2014) (6)
- RASIM: a rank-aware separate index method for answering top-k spatial keyword queries (2013) (6)
- Overcoming Semantic Drift in Information Extraction (2014) (6)
- A Natural Language Interface for Database: Achieving Transfer-learnability Using Adversarial Method for Question Understanding (2020) (6)
- Entity Suggestion with Conceptual Explanation (2017) (6)
- Weighted Proximity Best-Joins for Information Retrieval (2009) (6)
- Pattern-based similarity search for microarray data (2005) (6)
- Estimating the Selectivity of XML Path Expression with Predicates by Histograms (2004) (5)
- Inverse Time Dependency in Convex Regularized Learning (2009) (5)
- Mining associations by pattern structure in large relational tables (2002) (5)
- Toward Extensible Spatio-Temporal Databases: An Approach Based on User-Defined Aggregates (2004) (5)
- Time-Stamp Management and Query Execution in Data Stream Management Systems (2008) (5)
- Link-based hidden attribute discovery for objects on Web (2011) (5)
- Fast Relevance Discovery in Time Series (2006) (5)
- A Flexible Query Graph Based Model for the Efficient Execution of Continuous Queries (2007) (5)
- Proceedings of the first international workshop on Cloud data management (2009) (5)
- Reachability Computation in Uncertain Graphs (2011) (4)
- Assessing sparse information extraction using semantic contexts (2013) (4)
- A System Framework for Web Service Semantic and Automatic Orchestration (2007) (4)
- Proxies for Shortest Path and Distance Queries (2017) (4)
- Diagnosing and Minimizing Semantic Drift in Iterative Bootstrapping Extraction (2018) (4)
- WiiCluster: a Platform for Wikipedia Infobox Generation (2014) (4)
- Cleansing uncertain databases leveraging aggregate constraints (2010) (4)
- Knowledge Graph and Semantic Computing: Semantic, Knowledge, and Linked Big Data (2016) (4)
- Progressive modeling (2002) (4)
- Unsupervised Head-Modifier Detection in Search Queries (2016) (3)
- G-Index Model: A generic model of index schemes for top-k spatial-keyword queries (2015) (3)
- A Unified Framework for Answering k Closest Pairs Queries and Variants (2014) (3)
- How to Make a Semantic Network Probabilistic (2014) (3)
- Distance Oracle on Billion Node Graphs (2014) (3)
- Incompleteness of Database Languages for Data Streams and Data Mining: the Problem and the Cure (2003) (3)
- Inferencing in information extraction: Techniques and applications (2015) (3)
- Computing Label-Constraint Reachability in Graph Databases (2010) (3)
- User-directed exploration of mining space with multiple attributes (2002) (3)
- An index structure for pattern similarity searching in DNA microarray data (2002) (3)
- Trinity Graph Engine and its Applications (2017) (3)
- Stay Current and Relevant in Data Mining Research (2005) (2)
- The Links Have It: Infobox Generation by Summarization over Linked Entities (2014) (2)
- Learning Knowledge Bases for Text and Multimedia (2014) (2)
- Learning Knowledge Bases for Multimedia in 2015 (2015) (2)
- Semantic Bootstrapping: A Theoretical Perspective (2017) (2)
- Entity Suggestion by Example using a Conceptual Taxonomy (2015) (2)
- Tensor-based Complementary Product Recommendation (2021) (2)
- Finding information nebula over large networks (2011) (2)
- User-defined Aggregates for Datamining (1999) (2)
- ESL: a Very Powerful SQL-Compliant Data Stream Language (2005) (2)
- Extending Relational Query Languages for Data Streams (2016) (2)
- Modeling and Querying E-Commerce Data in Hybrid Relational-XML DBMSs (2008) (1)
- ESL: a Data Stream Query Language and System Designed for Power and Extensibility (2004) (1)
- Inductive Learning in Less Than One Sequential Data Scan (2003) (1)
- Semantic Bootstrapping: A Theoretical Perspective (2017) (1)
- Near-Neighbor Search in Pattern Distance Spaces (2005) (1)
- Challenges in Managing and Mining Large, Heterogeneous Data (2011) (1)
- SpatialNLI (2019) (1)
- Report on the first international workshop on cloud data management (CloudDB 2009) (2010) (1)
- Automatic Navbox Generation by Interpretable Clustering over Linked Entities (2017) (1)
- Community Search in Dynamic Social Networks (2013) (0)
- Fast Graph Pattern Matching (2008) (0)
- Extraction of Reliable and Actionable Information from Social Media During Emergencies (2022) (0)
- Mining Robust Overlapping Co-Clustering in the Presence of Noise (2020) (0)
- Proceedings of the APWeb/WAIM 2007 DBMAN, WebETrends, PAIS and ASWAN international workshops on Advances in Web and Network Technologies, and Information Management (2007) (0)
- Program committee chairs' welcome (2012) (0)
- Compression of Weighted Graphs (0)
- Syntactic models for parsing search queries on online social networks (2016) (0)
- Network Compression by Node and Edge Mergers (2012) (0)
- Proceedings of the First International CIKM Workshop on Cloud Data Management, CloudDB 2009, Hong Kong, China, November 2, 2009 (2009) (0)
- Special Issue on Cross-Layer Support for Database Management (2019) (0)
- A Short Survey on the User Cold Start Problem in Recommender Systems: Metadata and Meta-Learning Methods (2022) (0)
- RASIM: a rank-aware separate index method for answering top-k spatial keyword queries (2012) (0)
- Guest Editorial: Special Issue on Managing and Mining Massive Graphs (2015) (0)
- Efficient processing of k-hop reachability queries (2013) (0)
- User-Defined Aggregates for Advanced Database Applications (2000) (0)
- CloudDB workshop summary (2009) (0)
- Query Suggestion by Concept Instantiation (2013) (0)
- Letter from the Editor-in-Chief (2001) (0)
- A conversation with MSRA researchers (2012) (0)
- LOCI: Load Shedding through Class-Preserving Data Acquisition (2006) (0)
- Summarization of Weighted Networks (2017) (0)
- A Monte Carlo Sampling Framework for Information Recovery (2007) (0)
- Dynamic Embedding-based Retrieval for Personalized Item Recommendations at Instacart (2023) (0)
- Letter from the Special Issue Editor (2021) (0)
- Session details: Special issue on big data analytics workshop (2014) (0)
- Rethink e-Commerce Search (2022) (0)
- Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey (2016) (0)
- Computing Term Similarity by Knowledge from Big Data (2013) (0)
- Graph similarity search on large uncertain graph databases (2014) (0)
- Toward Query-centric Web Modeling and Crawling (2011) (0)
- Efficiently Mining Frequent Closed Partial Orders (Extended Abstract) (2005) (0)
- Research on the improvement of the harmony search algorithm (2015) (0)
- The ATLaS system and its powerful database language based on simple extensions of SQL (2002) (0)
- Guest Editorial Special Issue on Concept-Level Opinion and Sentiment Analysis (2012) (0)
- On Mining Maximal Pattern-Based Clusters (2009) (0)
- Theory and Practice of Temporal Data Mining (TPTDM 2006) (2006) (0)
- Distributed Big Graph Caching (2013) (0)
- G-Index Model: A generic model of index schemes for top-k spatial-keyword queries (2014) (0)
- Multilingual spatial domain natural language interface to databases (2023) (0)