Haixun Wang

Haixun Wang's AcademicInfluence.com Rankings

Computer Science

#7467

World Rank

#7864

Historical Rank

Data Mining

#178

World Rank

#179

Historical Rank

Machine Learning

#2758

World Rank

#2793

Historical Rank

Database

#4516

World Rank

#4695

Historical Rank

Haixun Wang's Degrees

PhD Computer Science Stanford University
Masters Computer Science Stanford University
Bachelors Computer Science Tsinghua University

Why Is Haixun Wang Influential?

Haixun Wang's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Mining concept-drifting data streams using ensemble classifiers (2003) (1468)
Probase: a probabilistic taxonomy for text understanding (2012) (778)
BLINKS: ranked keyword searches on graphs (2007) (627)
Clustering by pattern similarity in large data sets (2002) (571)
Managing and Mining Graph Data (2010) (546)
Trinity: a distributed graph engine on a memory cloud (2013) (463)
Efficient Subgraph Matching on Billion Node Graphs (2012) (375)
δ-clusters: capturing subspace correlation in a large data set (2002) (364)
Moment: maintaining closed frequent itemsets over a stream sliding window (2004) (354)
Enhanced biclustering on expression data (2003) (346)
ViST: a dynamic index method for querying XML data by tree structures (2003) (343)
A Distributed Graph Engine for Web Scale RDF Data (2013) (343)
Landmarks: a new model for similarity-based pattern querying in time series databases (2000) (325)
Dual Labeling: Answering Graph Reachability Queries in Constant Time (2006) (258)
Natural language question answering over RDF: a graph data driven approach (2014) (250)
Short Text Conceptualization Using a Probabilistic Knowledgebase (2011) (248)
Integrity Auditing of Outsourced Data (2007) (231)
Local search of communities in large graphs (2014) (225)
KBQA: Learning Question Answering over QA Corpora and Knowledge Bases (2019) (202)
Catch the moment: maintaining closed frequent itemsets over a data stream sliding window (2006) (190)
Distance-Constraint Reachability Computation in Uncertain Graphs (2011) (180)
Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (2018) (176)
Understanding Tables on the Web (2012) (173)
Fast Graph Pattern Matching (2008) (173)
Efficiently answering reachability queries on very large directed graphs (2008) (172)
Online search of overlapping communities (2013) (165)
Query Languages and Data Models for Database Sequences and Data Streams (2004) (163)
Automatic taxonomy construction from keywords (2012) (158)
Is random model better? On its accuracy and efficiency (2003) (156)
GString: A Novel Approach for Efficient Search in Graph Databases (2007) (153)
Knowledge-Based Approaches to Concept-Level Sentiment Analysis (2013) (135)
Short text understanding through lexical-semantic analysis (2015) (134)
Online Mining of Changes from Data Streams: Research Problems and Preliminary Results (2003) (130)
MaPle: a fast algorithm for maximal pattern-based clustering (2003) (128)
Active Mining of Data Streams (2004) (126)
Semantic Multidimensional Scaling for Open-Domain Sentiment Analysis (2014) (114)
Computing label-constraint reachability in graph databases (2010) (107)
On the sequencing of tree structures for XML indexing (2005) (106)
How to partition a billion-node graph (2014) (106)
ATLAS: A Small but Complete SQL Extension for Data Mining and Data Streams (2003) (104)
A data stream language and system designed for power and extensibility (2006) (104)
Leveraging spatio-temporal redundancy for RFID data cleansing (2010) (98)
Fast Computation of Reachability Labeling for Large Graphs (2006) (98)
Online Anomaly Prediction for Robust Cluster Systems (2009) (97)
Guest Editorial: Big Social Data Analysis (2014) (95)
Fast computing reachability labelings for large graphs with high compression rate (2008) (94)
Discovering Frequent Closed Partial Orders from Strings (2006) (92)
Challenges and Experience in Prototyping a Multi-Modal Stream Analytic and Monitoring Application on System S (2007) (91)
A Survey of Clustering Algorithms for Graph Data (2010) (90)
Efficient subgraph search over large uncertain graphs (2011) (87)
Statistical Approaches to Concept-Level Sentiment Analysis (2013) (83)
Graph Data Management and Mining: A Survey of Algorithms and Applications (2010) (83)
Learning Term Embeddings for Hypernymy Identification (2015) (81)
Beyond ten blue links: enabling user click modeling in federated web search (2012) (80)
Understand Short Texts by Harvesting and Analyzing Semantic Knowledge (2017) (79)
Adaptive system anomaly prediction for large-scale hosting infrastructures (2010) (78)
Computing Compressed Multidimensional Skyline Cubes Efficiently (2007) (78)
Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases (2012) (78)
An Improved Biclustering Method for Analyzing Gene Expression Profiles (2005) (78)
Path-tree: An efficient reachability indexing scheme for large directed graphs (2011) (78)
The Trinity Graph Engine (2012) (78)
Finding semantics in time series (2011) (72)
Query Understanding through Knowledge-Based Conceptualization (2015) (71)
Pruning and dynamic scheduling of cost-sensitive ensembles (2002) (70)
Computing term similarity by large probabilistic isA knowledge (2013) (69)
Using SQL to Build New Aggregates and Extenders for Object-Relational Systems (2000) (68)
Attribute extraction and scoring: A probabilistic approach (2013) (68)
A Survey of Algorithms for Keyword Search on Graph Data (2010) (67)
ATLaS: A Native Extension of SQL for Data Mining (2003) (67)
The deductive database system LDL++ (2002) (66)
Finding global icebergs over distributed data sets (2006) (65)
Managing and mining large graphs: systems and implementations (2012) (65)
Providing freshness guarantees for outsourced databases (2008) (64)
Text Mining in Social Networks (2011) (63)
Loadstar: A Load Shedding Scheme for Classifying Data Streams (2005) (63)
Dual encryption for query integrity assurance (2008) (60)
Mining Data Streams (2005) (59)
Context-Dependent Conceptualization (2013) (59)
Location-Based Spatial Query Processing with Data Sharing in Wireless Broadcast Environments (2008) (58)
Suppressing model overfitting in mining concept-drifting data streams (2006) (58)
On Anomalous Hotspot Discovery in Graph Streams (2013) (58)
Loadstar: Load Shedding in Data Stream Mining (2005) (57)
MapDupReducer: detecting near duplicates over massive datasets (2010) (57)
An online cost sensitive decision-making method in crowdsourcing systems (2013) (55)
An Inference Approach to Basic Level of Categorization (2015) (54)
Stop Chasing Trends: Discovering High Order Models in Evolving Data (2008) (53)
K-Reach: Who is in Your Small World (2012) (50)
Efficient Keyword Search on Uncertain Graph Data (2013) (50)
Supporting ranking and clustering as generalized order-by and group-by (2007) (49)
Compact reachability labeling for graph-structured data (2005) (49)
An algorithmic approach to event summarization (2010) (46)
Toward a Distance Oracle for Billion-Node Graphs (2013) (44)
A fast algorithm for subspace clustering by pattern similarity (2004) (44)
Query Integrity Assurance of Location-Based Services Accessing Outsourced Spatial Databases (2009) (44)
A native extension of SQL for mining data streams (2005) (41)
Adaptive Load Diffusion for Multiway Windowed Stream Joins (2007) (41)
Advances in Web and Network Technologies, and Information Management, APWeb/WAIM 2007 International Workshops: DBMAN 2007, WebETrends 2007, PAIS 2007 and ASWAN 2007, Huang Shan, China, June 16-18, 2007, Proceedings (2007) (41)
A Low-Granularity Classifier for Data Streams with Concept Drifts and Biased Class Distribution (2007) (41)
Tracking and Connecting Topics via Incremental Hierarchical Dirichlet Processes (2011) (41)
Relational languages and data models for continuous queries on sequences and data streams (2011) (40)
Web scale taxonomy cleansing (2011) (39)
Indexing weighted-sequences in large databases (2003) (39)
Identifying users' topical tasks in web search (2013) (38)
A Sampling-Based Approach to Information Recovery (2008) (38)
Event summarization for system management (2007) (37)
Understanding Short Texts through Semantic Enrichment and Hashing (2016) (37)
Understanding Short Texts (2013) (34)
User defined aggregates in object-relational systems (2000) (34)
Incorporating post-click behaviors into a click model (2010) (33)
Towards a Probabilistic Taxonomy of Many Concepts (2011) (32)
CMP: a fast decision tree classifier using multivariate predictions (2000) (31)
A Bayesian Inference-Based Framework for RFID Data Cleansing (2013) (31)
Logic-Based User-Defined Aggregates for the Next Generation of Database Systems (1999) (31)
A Large Probabilistic Semantic Network Based Approach to Compute Term Similarity (2015) (31)
A query integrity assurance scheme for accessing outsourced spatial databases (2012) (31)
Graph similarity search on large uncertain graph databases (2015) (30)
Mining Concept-Drifting Data Streams (2010) (30)
Clustering by Pattern Similarity (2008) (29)
Open Domain Short Text Conceptualization: A Generative + Descriptive Modeling Approach (2015) (29)
Efficiently Monitoring Top-k Pairs over Sliding Windows (2012) (29)
Graph-Based Wrong IsA Relation Detection in a Large-Scale Lexical Taxonomy (2017) (28)
Automatic Taxonomy Construction from Keywords via Scalable Bayesian Rose Trees (2015) (26)
Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model (2021) (26)
Wikification via link co-occurrence (2013) (26)
Isanette: A Common and Common Sense Knowledge Base for Opinion Mining (2011) (25)
Semantic queries by example (2013) (24)
Nonmonotonic reasoning in LDL (2000) (24)
Automatic extraction of top-k lists from the web (2013) (24)
A Balanced Ensemble Approach to Weighting Classifiers for Text Classification (2006) (24)
Head, modifier, and constraint detection in short texts (2014) (23)
On Conceptual Labeling of a Bag of Words (2015) (22)
Discovery in multi-attribute data with user-defined constraints (2002) (22)
Data-Driven Metaphor Recognition and Explanation (2013) (22)
Empirical comparison of various reinforcement learning strategies for sequential targeted marketing (2002) (22)
On reducing classifier granularity in mining concept-drifting data streams (2005) (21)
The S2-Tree: An Index Structure for Subsequence Matching of Spatial Objects (2001) (21)
Understanding short texts through semantic enrichment and hashing (2016) (21)
Mining Extremely Skewed Trading Anomalies (2004) (21)
Proceedings of the 21st ACM international conference on Information and knowledge management (2012) (20)
Efficiently mining frequent closed partial orders (2005) (20)
Location-based Spatial Queries with Data Sharing in Wireless Broadcast Environments (2007) (20)
User-Defined Aggregates in Database Languages (1999) (20)
An Integrated Data-Driven Framework for Computing System Management (2010) (18)
Asymmetric signature schemes for efficient exact edit similarity query processing (2013) (18)
Hub-Accelerator: Fast and Exact Shortest Path Computation in Large Social Networks (2013) (17)
Shallow Information Extraction for the knowledge Web (2013) (17)
Semantic queries in databases: problems and challenges (2009) (17)
XSeq: an indexing infrastructure for tree pattern queries (2004) (17)
Entity Disambiguation based on a Probabilistic Taxonomy (2011) (17)
Proxies for Shortest Path and Distance Queries (2016) (17)
A Transfer-Learnable Natural Language Interface for Databases (2018) (17)
On the Transitivity of Hypernym-Hyponym Relations in Data-Driven Lexical Taxonomies (2017) (17)
Optimizing index for taxonomy keyword search (2012) (16)
Lock-free consistency control for web 2.0 applications (2008) (16)
Optimizing Timestamp Management in Data Stream Management Systems (2007) (16)
Efficient processing of k-hop reachability queries (2014) (16)
A unified approach for computing top-k pairs in multidimensional space (2011) (16)
G-SQL: Fast Query Processing via Graph Exploration (2016) (15)
A system for extracting top-K lists from the web (2012) (15)
Demand-driven frequent itemset mining using pattern structures (2005) (15)
LinkProbe: Probabilistic inference on large-scale social networks (2013) (15)
Toward Topic Search on the Web (2011) (15)
Semantic Data Management: Towards Querying Data with their Meaning (2007) (14)
Probase+: Inferring Missing Links in Conceptual Taxonomies (2017) (14)
Transfer Understanding from Head Queries to Tail Queries (2014) (14)
A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows (2012) (13)
FARM: a framework for exploring mining spaces with multiple attributes (2001) (13)
Preference-Based Frequent Pattern Mining (2005) (12)
A fully distributed framework for cost-sensitive data mining (2002) (12)
Querying uncertain data with aggregate constraints (2011) (12)
A Framework for Scalable Cost-sensitive Learning Based on Combining Probabilities and Benefits (2002) (12)
Efficient Computation of Range Aggregates against Uncertain Location-Based Queries (2012) (12)
Learning to rank with a novel kernel perceptron method (2009) (12)
Unifying Data and Domain Knowledge Using Virtual Views (2007) (12)
Employing Semantic Context for Sparse Information Extraction Assessment (2018) (12)
Probase: a Universal Knowledge Base for Semantic Search (2010) (11)
Optimizing content freshness of relations extracted from the web using keyword search (2010) (11)
Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (Extended Abstract) (2018) (11)
Online mining of data streams: applications, techniques and progress (2005) (10)
User Defined Aggregates for Logical Data Languages (1998) (10)
From Intrinsic to Counterfactual: On the Explainability of Contextualized Recommender Systems (2021) (10)
ATLaS: a Turing-Complete Extension of SQL for Data Mining Applications and Streams (2002) (9)
A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows (2014) (9)
Concept-Based Web Search (2012) (9)
Learning Defining Features for Categories (2016) (9)
Scaling Up Markov Logic Probabilistic Inference for Social Graphs (2017) (8)
User-Defined Aggregates for Datamining (1999) (8)
On dimensionality reduction of massive graphs for indexing and retrieval (2011) (8)
Concept Clustering of Evolving Data (2009) (8)
Database System Extensions for Decision Support: the AXL Approach (2000) (8)
Extending SQL for Decision Support Applications (2002) (8)
Load Shedding in Classifying Multi-Source Streaming Data: A Bayes Risk Approach (2007) (7)
Adversarial Robustness through Bias Variance Decomposition: A New Perspective for Federated Learning (2020) (7)
Recent progress on selected topics in database research — A report by nine young Chinese researchers working in the United States (2003) (7)
Verb Pattern: A Probabilistic Semantic Representation on Verbs (2016) (7)
Web Scale Entity Resolution using Relational Evidence (2011) (7)
SSDT: a scalable subspace-splitting classifier for biased data (2001) (7)
Syntactic Parsing of Web Queries (2016) (7)
SpatialNLI: A Spatial Domain Natural Language Interface to Databases Using Spatial Comprehension (2019) (7)
An Introduction to Graph Data (2010) (7)
A Random Method for Quantifying Changing Distributions in Data Streams (2005) (7)
Fine-Grained Semantic Conceptualization of FrameNet (2016) (6)
Distance Landmarks Revisited for Road Graphs (2014) (6)
RASIM: a rank-aware separate index method for answering top-k spatial keyword queries (2013) (6)
Overcoming Semantic Drift in Information Extraction (2014) (6)
A Natural Language Interface for Database: Achieving Transfer-learnability Using Adversarial Method for Question Understanding (2020) (6)
Entity Suggestion with Conceptual Explanation (2017) (6)
Weighted Proximity Best-Joins for Information Retrieval (2009) (6)
Pattern-based similarity search for microarray data (2005) (6)
Estimating the Selectivity of XML Path Expression with Predicates by Histograms (2004) (5)
Inverse Time Dependency in Convex Regularized Learning (2009) (5)
Mining associations by pattern structure in large relational tables (2002) (5)
Toward Extensible Spatio-Temporal Databases: An Approach Based on User-Defined Aggregates (2004) (5)
Time-Stamp Management and Query Execution in Data Stream Management Systems (2008) (5)
Link-based hidden attribute discovery for objects on Web (2011) (5)
Fast Relevance Discovery in Time Series (2006) (5)
A Flexible Query Graph Based Model for the Efficient Execution of Continuous Queries (2007) (5)
Proceedings of the first international workshop on Cloud data management (2009) (5)
Reachability Computation in Uncertain Graphs (2011) (4)
Assessing sparse information extraction using semantic contexts (2013) (4)
A System Framework for Web Service Semantic and Automatic Orchestration (2007) (4)
Proxies for Shortest Path and Distance Queries (2017) (4)
Diagnosing and Minimizing Semantic Drift in Iterative Bootstrapping Extraction (2018) (4)
WiiCluster: a Platform for Wikipedia Infobox Generation (2014) (4)
Cleansing uncertain databases leveraging aggregate constraints (2010) (4)
Knowledge Graph and Semantic Computing: Semantic, Knowledge, and Linked Big Data (2016) (4)
Progressive modeling (2002) (4)
Unsupervised Head-Modifier Detection in Search Queries (2016) (3)
G-Index Model: A generic model of index schemes for top-k spatial-keyword queries (2015) (3)
A Unified Framework for Answering k Closest Pairs Queries and Variants (2014) (3)
How to Make a Semantic Network Probabilistic (2014) (3)
Distance Oracle on Billion Node Graphs (2014) (3)
Incompleteness of Database Languages for Data Streams and Data Mining: the Problem and the Cure (2003) (3)
Inferencing in information extraction: Techniques and applications (2015) (3)
Computing Label-Constraint Reachability in Graph Databases (2010) (3)
User-directed exploration of mining space with multiple attributes (2002) (3)
An index structure for pattern similarity searching in DNA microarray data (2002) (3)
Trinity Graph Engine and its Applications (2017) (3)
Stay Current and Relevant in Data Mining Research (2005) (2)
The Links Have It: Infobox Generation by Summarization over Linked Entities (2014) (2)
Learning Knowledge Bases for Text and Multimedia (2014) (2)
Learning Knowledge Bases for Multimedia in 2015 (2015) (2)
Semantic Bootstrapping: A Theoretical Perspective (2017) (2)
Entity Suggestion by Example using a Conceptual Taxonomy (2015) (2)
Tensor-based Complementary Product Recommendation (2021) (2)
Finding information nebula over large networks (2011) (2)
User-defined Aggregates for Datamining (1999) (2)
ESL: a Very Powerful SQL-Compliant Data Stream Language (2005) (2)
Extending Relational Query Languages for Data Streams (2016) (2)
Modeling and Querying E-Commerce Data in Hybrid Relational-XML DBMSs (2008) (1)
ESL: a Data Stream Query Language and System Designed for Power and Extensibility (2004) (1)
Inductive Learning in Less Than One Sequential Data Scan (2003) (1)
Semantic Bootstrapping: A Theoretical Perspective (2017) (1)
Near-Neighbor Search in Pattern Distance Spaces (2005) (1)
Challenges in Managing and Mining Large, Heterogeneous Data (2011) (1)
SpatialNLI (2019) (1)
Report on the first international workshop on cloud data management (CloudDB 2009) (2010) (1)
Automatic Navbox Generation by Interpretable Clustering over Linked Entities (2017) (1)
Community Search in Dynamic Social Networks (2013) (0)
Fast Graph Pattern Matching (2008) (0)
Extraction of Reliable and Actionable Information from Social Media During Emergencies (2022) (0)
Mining Robust Overlapping Co-Clustering in the Presence of Noise (2020) (0)
Proceedings of the APWeb/WAIM 2007 DBMAN, WebETrends, PAIS and ASWAN international workshops on Advances in Web and Network Technologies, and Information Management (2007) (0)
Program committee chairs' welcome (2012) (0)
Compression of Weighted Graphs (0)
Syntactic models for parsing search queries on online social networks (2016) (0)
Network Compression by Node and Edge Mergers (2012) (0)
Proceedings of the First International CIKM Workshop on Cloud Data Management, CloudDB 2009, Hong Kong, China, November 2, 2009 (2009) (0)
Special Issue on Cross-Layer Support for Database Management (2019) (0)
A Short Survey on the User Cold Start Problem in Recommender Systems: Metadata and Meta-Learning Methods (2022) (0)
RASIM: a rank-aware separate index method for answering top-k spatial keyword queries (2012) (0)
Guest Editorial: Special Issue on Managing and Mining Massive Graphs (2015) (0)
Efficient processing of k-hop reachability queries (2013) (0)
User-Defined Aggregates for Advanced Database Applications (2000) (0)
CloudDB workshop summary (2009) (0)
Query Suggestion by Concept Instantiation (2013) (0)
Letter from the Editor-in-Chief (2001) (0)
A conversation with MSRA researchers (2012) (0)
LOCI: Load Shedding through Class-Preserving Data Acquisition (2006) (0)
Summarization of Weighted Networks (2017) (0)
A Monte Carlo Sampling Framework for Information Recovery (2007) (0)
Dynamic Embedding-based Retrieval for Personalized Item Recommendations at Instacart (2023) (0)
Letter from the Special Issue Editor (2021) (0)
Session details: Special issue on big data analytics workshop (2014) (0)
Rethink e-Commerce Search (2022) (0)
Efficient Tabular Dataset Preparations by the Aggregations in SQL: A Survey (2016) (0)
Computing Term Similarity by Knowledge from Big Data (2013) (0)
Graph similarity search on large uncertain graph databases (2014) (0)
Toward Query-centric Web Modeling and Crawling (2011) (0)
Efficiently Mining Frequent Closed Partial Orders (Extended Abstract) (2005) (0)
Research on the improvement of the harmony search algorithm (2015) (0)
The ATLaS system and its powerful database language based on simple extensions of SQL (2002) (0)
Guest Editorial Special Issue on Concept-Level Opinion and Sentiment Analysis (2012) (0)
On Mining Maximal Pattern-Based Clusters (2009) (0)
Theory and Practice of Temporal Data Mining (TPTDM 2006) (2006) (0)
Distributed Big Graph Caching (2013) (0)
G-Index Model: A generic model of index schemes for top-k spatial-keyword queries (2014) (0)
Multilingual spatial domain natural language interface to databases (2023) (0)
