Foster Provost
#60,823
Most Influential Person Now
American academic
Foster Provost's AcademicInfluence.com Rankings
Foster Provost's Computer Science Degrees
Computer Science
#2493
World Rank
#2603
Historical Rank
Information Technology
#36
World Rank
#37
Historical Rank
Data Mining
#55
World Rank
#55
Historical Rank
Machine Learning
#604
World Rank
#611
Historical Rank
Computer Science Communications
Foster Provost's Degrees
- PhD Computer Science University of Pittsburgh
- Masters Computer Science University of Pittsburgh
- Bachelors Physics and Mathematics Duquesne University
Why Is Foster Provost Influential?
According to Wikipedia, Foster Provost is an American computer scientist, information systems researcher, and Professor of Data Science and Information Systems and Ira Rennert Professor of Entrepreneurship at New York University's Stern School of Business. He is also the Director for the Data Science and AI Initiative at Stern's Fubon Center for Technology, Business and Innovation. Professor Provost has a Bachelor of Science from Duquesne University in physics and mathematics and a Master of Science and Ph.D. in computer science from the University of Pittsburgh.
Foster Provost's Published Works
- Robust Classification for Imprecise Environments (2000) (1323)
- The Case against Accuracy Estimation for Comparing Induction Algorithms (1998) (1224)
- Get another label? improving data quality and data mining using multiple, noisy labelers (2008) (1188)
- Quality management on Amazon Mechanical Turk (2010) (1054)
- Data Science and its Relationship to Big Data and Data-Driven Decision Making (2013) (1032)
- Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction (2003) (949)
- Adaptive Fraud Detection (1997) (905)
- Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions (1997) (888)
- Network-Based Marketing: Identifying Likely Adopters Via Consumer Networks (2006) (595)
- Tree Induction for Probability-Based Ranking (2003) (589)
- Classification in Networked Data: a Toolkit and a Univariate Case Study (2007) (586)
- Activity monitoring: noticing interesting changes in behavior (1999) (493)
- Machine Learning from Imbalanced Data Sets 101 (2008) (447)
- Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking (2013) (421)
- The effect of class distribution on classifier learning: an empirical study (2001) (389)
- Efficient progressive sampling (1999) (387)
- Tree Induction Vs Logistic Regression: A Learning Curve Analysis (2001) (374)
- Handling Missing Values when Applying Classification Models (2007) (356)
- A Simple Relational Classifier (2003) (297)
- Guest Editors' Introduction: On Applied Research in Machine Learning (1998) (294)
- A Survey of Methods for Scaling Up Inductive Algorithms (1999) (276)
- Explaining Data-Driven Document Classifications (2013) (242)
- Combining Data Mining and Machine Learning for Effective User Profiling (1996) (223)
- Active Sampling for Class Probability Estimation and Ranking (2004) (219)
- Toward intelligent assistance for a data mining process: an ontology-based approach for cost-sensitive classification (2005) (214)
- Repeated labeling using multiple noisy labelers (2012) (176)
- Data science for business (2013) (163)
- Machine learning for targeted display advertising: transfer learning in action (2013) (163)
- Robust Classification Systems for Imprecise Environments (1998) (159)
- Bid optimizing and inventory scoring in targeted online advertising (2012) (154)
- Predictive Modeling With Big Data: Is Bigger Really Better? (2013) (152)
- Audience selection for on-line brand advertising: privacy-friendly social network targeting (2009) (151)
- Scaling Up: Distributed Machine Learning with Cooperation (1996) (145)
- Applications of Data Mining to Electronic Commerce (2000) (132)
- Aggregation-based feature invention and relational concept classes (2003) (131)
- Active Feature-Value Acquisition (2009) (120)
- The myth of the double-blind review?: author identification using only citations (2003) (120)
- Distribution-based aggregation for relational learning with identifier attributes (2006) (116)
- The effect of class distribution on classifier learning (2001) (116)
- RL4: a tool for knowledge-based induction (1990) (113)
- On Applied Research in Machine Learning (1998) (108)
- Mining Massive Fine-Grained Behavior Data to Improve Predictive Analytics (2016) (104)
- Active feature-value acquisition for classifier induction (2004) (102)
- Causally motivated attribution for online advertising (2012) (96)
- Inactive learning?: difficulties employing active learning in practice (2011) (96)
- Why label when you can search?: alternatives to active learning for applying human resources to build classification models under extreme class imbalance (2010) (95)
- Research Commentary - Information in Digital, Economic, and Social Networks (2013) (92)
- Confidence Bands for ROC Curves: Methods and an Empirical Study (2004) (92)
- Discovering Interesting Patterns for Investment Decision Making with GLOWER: A Genetic Learner Overlaid with Entropy Reduction (2000) (92)
- In Pursuit of Enhanced Customer Retention Management: Review, Key Issues, and Future Directions (2017) (91)
- Beat the Machine: Challenging Humans to Find a Predictive Model's “Unknown Unknowns” (2015) (88)
- Small Disjuncts in Action: Learning to Diagnose Errors in the Local Loop of the Telephone Network (1993) (87)
- Fraud detection (2002) (84)
- Beat the Machine: Challenging Workers to Find the Unknown Unknowns (2011) (80)
- An expected utility approach to active feature-value acquisition (2005) (78)
- Inductive policy: The pragmatics of bias selection (1995) (78)
- ROC confidence bands: an empirical evaluation (2005) (76)
- KDD-2001: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: August 26-29, 2001, San Francisco, CA, USA (2002) (75)
- An Intelligent Assistant for the Knowledge Discovery Process (2001) (73)
- Applications of Data Mining to Electronic Commerce (2000) (68)
- Active Learning for Class Probability Estimation and Ranking (2001) (65)
- AI Approaches to Fraud Detection and Risk Management (1998) (60)
- A Unified Approach to Active Dual Supervision for Labeling Features and Examples (2010) (59)
- Decision-Centric Active Learning of Binary-Outcome Models (2007) (59)
- Distributed Data Mining: Scaling up and beyond (2000) (47)
- Quality-Based Pricing for Crowdsourced Workers (2013) (46)
- Information in Digital, Economic and Social Networks (2012) (45)
- Scalable hands-free transfer learning for online advertising (2014) (44)
- Confidence Bands for ROC Curves (2003) (44)
- Intelligent information triage (2001) (43)
- Economical active feature-value acquisition through Expected Utility estimation (2005) (42)
- Evaluating and Optimizing Online Advertising: Forget the Click, but There Are Good Proxies (2015) (41)
- Suspicion scoring based on guilt-by-association, collective inference, and focused data access (2005) (39)
- Distributed Machine Learning: Scaling Up with Coarse-grained Parallelism (1994) (39)
- Exploiting Background Knowledge in Automated Discovery (1996) (39)
- Design principles of massive, robust prediction systems (2012) (39)
- A Survey of Methods for Scaling Up Inductive Learning Algorithms (1997) (39)
- Machine Learning from Imbalanced Data Sets 101 Extended (39)
- Finding Similar Mobile Consumers with a Privacy-Friendly Geosocial Design (2015) (38)
- Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach (2020) (38)
- Inductive Policy (1992) (37)
- Social Network Collaborative Filtering (2008) (37)
- Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals (2016) (36)
- Suspicion scoring of networked entities based on guilt-by-association, collective inference, and focused data access (2005) (35)
- Counterfactual Explanation Algorithms for Behavioral and Textual Data (2019) (35)
- Towards Intelligent Assistance for a Data Mining Process (2005) (34)
- Discovering Knowledge from Relational Data Extracted from Business News (2002) (33)
- Pseudo-Social Network Targeting from Consumer Transaction Data (2011) (33)
- Intelligent Assistance for the Data Mining Process: an Ontology-Based Approach (2002) (32)
- Corporate residence fraud detection (2014) (31)
- Scaling Up Inductive Algorithms: An Overview (1997) (30)
- Cost-Effective Quality Assurance in Crowd Labeling (2016) (30)
- Handling Missing Values when Applying Classification Models (2007) (29)
- A Brief Survey of Machine Learning Methods for Classification in Networked Data and an Application to Suspicion Scoring (2006) (29)
- Increasing the Efficiency of Data Mining Algorithms with Breadth-First Marker Propagation (1997) (28)
- Acora: Distribution-Based Aggregation for Relational Learning from Identifier Attributes (2005) (27)
- Learning and Inference in Massive Social Networks (2007) (26)
- Distributed Fault Tolerant Embedding of Binary Trees and Rings in Hypercubes (1989) (26)
- Measuring Causal Impact of Online Actions via Natural Experiments: Application to Display Advertising (2015) (26)
- A comparison of instance-level counterfactual explanation algorithms for behavioral and textual data: SEDC, LIME-C and SHAP-C (2020) (26)
- Data acquisition and cost-effective predictive modeling: targeting offers for electronic commerce (2007) (25)
- Online active inference and learning (2011) (25)
- Scaling Up Inductive Learning with Massive Parallelism (1996) (24)
- Using co-visitation networks for detecting large scale online display advertising exchange fraud (2013) (24)
- Goal-Directed Inductive Learning: Trading off Accuracy for Reduced Error Cost (1994) (24)
- Efficiently Constructing Relational Features from Background Knowledge for Inductive Machine Learning (1994) (23)
- Authors' Response to Gong's, "Comment on Data Science and its Relationship to Big Data and Data-Driven Decision Making" (2014) (22)
- Combining Data Mining and Machine Learning for Effective Fraud Detection (1997) (21)
- Analysis and Visualization of Classifier Performance with Nonuniform Class and Cost Distributions (1997) (20)
- A benchmarking study of classification techniques for behavioral data (2019) (20)
- Pointwise ROC Confidence Bounds: An Empirical Evaluation (2005) (19)
- Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, San Francisco, CA, USA, August 26-29, 2001 (2001) (18)
- Big Data, Data Science, and Civil Rights (2017) (18)
- Social Network Collaborative Filtering: Preliminary Results (2007) (18)
- Iterative Weakening: Optimal and Near-Optimal Policies for the Selection of Search Bias (1993) (18)
- Causal Decision Making and Causal Effect Estimation Are Not the Same... and Why It Matters (2021) (17)
- Relational Learning Problems and Simple Models (2003) (16)
- Special Issue on Applications of data mining to electronic commerce (2001) (14)
- Explaining Classification Models Built on High-Dimensional Sparse Data (2016) (14)
- NetKit-SRL: A Toolkit for Network Learning and Inference (2005) (14)
- Proceedings of the First Workshop on Social Media Analytics (2010) (14)
- Suspicion scoring based on guilt-by-association, collective inference, and focused (2005) (13)
- Learning when training data are costly (2003) (13)
- A Distributed Algorithm for Embedding Trees in Hypercubes with Modifications for Run-Time Fault Tolerance (1992) (12)
- Causal Classification: Treatment Effect Estimation vs. Outcome Prediction (2019) (11)
- Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data (2019) (11)
- Guided Feature Labeling for Budget-Sensitive Learning Under Extreme Class Imbalance (2010) (11)
- Toward economic machine learning and utility-based data mining (2005) (11)
- Simple Models and Classification in Networked Data (2004) (11)
- The Relational Vector-Space Model (2003) (11)
- Information Triage using Prospective Criteria (2001) (10)
- Selective Data Acquisition for Machine Learning (2011) (10)
- Scaling up inductive learning with massive parallelism (2004) (9)
- Inductive Strengthening: the Effects of a Simple Heuristic for Restricting Hypothesis Space Search (1992) (9)
- Scalable supervised dimensionality reduction using clustering (2013) (9)
- Geo-Social Targeting for Privacy-Friendly Mobile Advertising: Position Paper (2011) (8)
- Counterfactual Explanations for Data-Driven Decisions (2019) (7)
- Explaining Documents' Classifications (2011) (7)
- A Comparison of Methods for Treatment Assignment with an Application to Playlist Generation (2020) (7)
- Wallenius Bayes (2018) (7)
- Telecommunications Network Diagnosis (2008) (6)
- Matrix-Factorization-Based Dimensionality Reduction in the Predictive Modeling Process: A Design Science Perspective (2016) (6)
- Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science (2018) (6)
- Observational vs Experimental Data When Making Automated Decisions Using Machine Learning (2019) (6)
- Combining Observational and Experimental Data to Improve Large-Scale Decision-Making (2020) (5)
- Problem Definition, Data Cleaning, and Evaluation: A Classifier Learning Case Study (1999) (5)
- Tree Induction vs. Logistic Regression for Learning Rankings based on Likelihood of Class Membership (2002) (5)
- Predicting citation rates for physics papers: constructing features for an ordered probit model (2003) (5)
- Viral Marketing: Identifying Likely Adopters Via Consumer Networks (2005) (5)
- Hyperlocal: inferring location of IP addresses in real-time bid requests for mobile ads (2013) (5)
- Proceedings of the ACM SIGKDD Workshop on Human Computation, Paris, France, June 28, 2009 (2009) (4)
- Active Learning for Decision Making (2004) (4)
- Deep Learning on Big, Sparse, Behavioral Data (2019) (4)
- Active Inference and Learning for Classifying Streams (2010) (4)
- ClimBS: searching the bias space (1992) (4)
- Node classification over bipartite graphs through projection (2020) (4)
- A real-time expert system for trigger-logic monitoring (1990) (4)
- Columbus: An Annotated Guide to the Scholarship on His Life and Writings, 1750 to 1988 (1991) (4)
- An exploratory study towards applying and demystifying deep learning classification on behavioral big data (2018) (3)
- Information Triage using Prospective Criteria (appears in User Modeling 2001 Workshop: Machine Learning, Information Retrieval and User Modeling) (2001) (3)
- Methods for Individual Treatment Assignment: An Application and Comparison for Playlist Generation (2020) (3)
- Dimensionality Reduction Via Matrix Factorization for Predictive Modeling from Large, Sparse Behavioral Data (2015) (3)
- Pleasing the advertising oracle: Probabilistic prediction from sampled, aggregated ground truth (2014) (3)
- Aggregation for Predictive Modeling with Relational Data (2005) (3)
- In Pursuit of Enhanced Customer Retention Management: Review, Key Issues, and Future Directions (2017) (3)
- Measuring overlap of data bases in water supply and sanitation using sampling and the binomial probability distribution (1992) (3)
- Finding Mobile Consumers with a Privacy-Friendly Geo-Similarity Network (2015) (3)
- Machine learning for targeted display advertising: transfer learning in action (2013) (2)
- BINARY TREES AND RINGS IN HYPERCUBES (1989) (2)
- Toward optimal allocation of human resources for active learning with application to safe advertising (2009) (2)
- Variance-Based Active Learning (2000) (2)
- Modeling complex networks for electronic commerce (2007) (2)
- ROC Confidence Bands: An Empirical Study (2005) (2)
- Wallenius Naive Bayes (2013) (2)
- Distributed Fault Tolerant Embeddings of Binary Trees in Hypercubes. (1988) (1)
- Data Science for the Real Estate Industry (2020) (1)
- Knowledge Discovery Using Concept-Class Taxonomies (2004) (1)
- The Gift of Gab: Evidence TelE-Commerce Firms Can Profit from Viral Marketing (2005) (1)
- What Managers Need to Know About Big Data (2017) (1)
- Efficiently Constructing Relational Features from Background Knowledge for Inductive Machine Learning (also appears in Proceedings AAAI-94 Workshop on Knowledge Discovery in Databases) (2007) (1)
- Probability estimation in multi-relational domains (2005) (1)
- Annotated Bibliography of Edmund Spenser, 1937-1960 (1964) (1)
- Societal Impact of Data Science and Artificial Intelligence (2018) (1)
- Iteratively refining SVMs using priors (2015) (1)
- Scaling Up: Distributed Learning (1996) (0)
- AAAI-98 Workshops: Reports of the Workshops Held at the Fifteenth National Conference on Artificial Intelligence in Madison, Wisconsin (1999) (0)
- Data-Driven Investment Strategies for Peer-to-Peer Lending (2018) (0)
- In memory of Tom Fawcett (2020) (0)
- ACM SIGKDD 2014 to be Held August 24–27 in Manhattan (2014) (0)
- Data mining tasks and methods: scalability (2002) (0)
- Wallenius Bayes (2018) (0)
- The Phases of Columbus Study (2016) (0)
- Panel: a data scientist's guide to making money from start-ups (2013) (0)
- Node classification over bipartite graphs through projection (2020) (0)
- Modeling Complex Networks For (Electronic) Commerce (2007) (0)
- A MODULAR APPROACH TO RELATIONAL DATA MINING (2002) (0)
- Privacy-sensitive methods, systems, and media for geo-social targeting (2012) (0)
- A benchmarking study of classification techniques for behavioral data (2019) (0)
- Aggregation-Based Feature Invention and Relational (2003) (0)
- ROC Confidence Bands: An Empirical Study (2005) (0)
- The Effects of Confounding When Making Automatic Intervention Decisions Using Machine Learning (2019) (0)
- On Shakespeare's Sonnet 116 (1956) (0)
- Active Learning: Prior Work (2005) (0)
- Columbus Bibliographies: Past, Present, Future (1993) (0)
- William D. Phillips, Jr. and Carla Rahn Phillips. The Worlds of Christopher Columbus. New York: Cambridge University Press. 1992. Pp. 322. $27.95 (1992) (0)
- Monitoring Business Activity (2006) (0)
- Columbus: Dream and Act : A Tragic Suite (1987) (0)
- Ethics and interventions: A commentary on how to “improve” prediction using behavior modification (2022) (0)
- Parallel Continuous Outlier Mining in Streaming Data (2018) (0)
- Inductive Policy: The Pragmatics of Bias Selection (2004) (0)
- Who's Watching TV? (2016) (0)
- Rejoinder to “Causal Decision Making and Causal Effect Estimation Are Not the Same…and Why It Matters” (2022) (0)
- Brand advertising, on-line audiences, and social media: invited talk (2009) (0)
- A MODULAR APPROACH TO RELATIONAL DATA MINING (2002) (0)
- On-line Brand Advertising using Social Networks based on User-generated Content (2008) (0)
- The Predictive Power of Massive Data about our Fine-Grained Behavior (2016) (0)
- Estimating Audience Interest Distribution Based on Audience Web Behavior (2013) (0)
- Iterative Weaken lieies (1993) (0)
- A Data Scientist's Guide to Start-Ups (2014) (0)
- Repeated labeling using multiple noisy labelers (2013) (0)
- ACM SIGKDD 2014 to be Held August 24-27 in Manhattan (2014) (0)
- Attributing value in a data pooling setting for predictive modeling (2017) (0)
- Iteratively refining SVMs (2015) (0)
- Collective Inference for Consumer Networks (2007) (0)
- Parallel Continuous Outlier Mining in Streaming Data (2018) (0)
- Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data (2019) (0)
- Edmund Spenser: An Annotated Bibliography 1937-72 (1979) (0)
- Industry: telecommunications network diagnosis (2002) (0)
- Active modeling in cost-sensitive environments (2002) (0)
What Schools Are Affiliated With Foster Provost?
Foster Provost is affiliated with the following schools: