Devi Parikh
#113,497
Most Influential Person Now
Devi Parikh's AcademicInfluence.com Rankings
Devi Parikh Engineering Degrees
- Engineering: World Rank #3360, Historical Rank #4428
- Electrical Engineering: World Rank #743, Historical Rank #808

Devi Parikh Computer Science Degrees
- Computer Science: World Rank #4360, Historical Rank #4593
- Algorithms: World Rank #138, Historical Rank #140
- Artificial Intelligence: World Rank #1042, Historical Rank #1061
- Database: World Rank #1573, Historical Rank #1651

Engineering Computer Science
Devi Parikh's Degrees
- PhD Electrical and Computer Engineering Carnegie Mellon University
- Masters Electrical and Computer Engineering Carnegie Mellon University
Why Is Devi Parikh Influential?
Devi Parikh's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization (2016) (10469)
- VQA: Visual Question Answering (2015) (3664)
- CIDEr: Consensus-based image description evaluation (2014) (2770)
- ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks (2019) (2022)
- Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering (2016) (1560)
- Hierarchical Question-Image Co-Attention for Visual Question Answering (2016) (1292)
- Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization (2016) (1243)
- Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning (2016) (1166)
- Relative attributes (2011) (951)
- Habitat: A Platform for Embodied AI Research (2019) (768)
- Visual Dialog (2016) (685)
- Joint Unsupervised Learning of Deep Representations and Image Clusters (2016) (665)
- Graph R-CNN for Scene Graph Generation (2018) (611)
- iCoseg: Interactive co-segmentation with intelligent scribble guidance (2010) (510)
- A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories (2016) (479)
- Embodied Question Answering (2017) (446)
- Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering (2017) (418)
- Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? (2016) (385)
- Neural Baby Talk (2018) (362)
- 12-in-1: Multi-Task Vision and Language Representation Learning (2019) (350)
- WhittleSearch: Image search with relative attribute feedback (2012) (335)
- Counterfactual Visual Explanations (2019) (315)
- Visual Storytelling (2016) (312)
- Deal or No Deal? End-to-End Learning of Negotiation Dialogues (2017) (295)
- Discovering localized attributes for fine-grained recognition (2012) (293)
- ParlAI: A Dialog Research Software Platform (2017) (284)
- Yin and Yang: Balancing and Answering Binary Visual Questions (2015) (278)
- Grad-CAM: Why did you say that? (2016) (270)
- Towards VQA Models That Can Read (2019) (263)
- Analyzing the Behavior of Visual Question Answering Models (2016) (256)
- Deep Learning the City: Quantifying Urban Perception at a Global Scale (2016) (254)
- DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames (2019) (247)
- Interactively building a discriminative vocabulary of nameable attributes (2011) (240)
- RUBi: Reducing Unimodal Biases in Visual Question Answering (2019) (231)
- What Makes a Photograph Memorable? (2014) (214)
- LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation (2016) (211)
- TarMAC: Targeted Multi-Agent Communication (2018) (208)
- Bringing Semantics into Focus Using Visual Abstraction (2013) (189)
- Understanding the Intrinsic Memorability of Images (2011) (177)
- Pythia v0.1: the Winning Entry to the VQA Challenge 2018 (2018) (174)
- Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded (2019) (169)
- Learning the Visual Interpretation of Sentences (2013) (156)
- The Open Catalyst 2020 (OC20) Dataset and Community Challenges (2020) (155)
- A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories (2016) (153)
- nocaps: novel object captioning at scale (2019) (151)
- Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors (2022) (144)
- Localization and Segmentation of A 2D High Capacity Color Barcode (2008) (141)
- Make-A-Video: Text-to-Video Generation without Text-Video Data (2022) (140)
- Visual Coreference Resolution in Visual Dialog using Neural Module Networks (2018) (135)
- Attributes for Classifier Feedback (2012) (133)
- Context-Aware Captions from Context-Agnostic Supervision (2017) (130)
- Improving Vision-and-Language Navigation with Image-Text Pairs from the Web (2020) (125)
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model (2017) (125)
- Cycle-Consistency for Robust Visual Question Answering (2019) (118)
- Counting Everyday Objects in Everyday Scenes (2016) (116)
- Interactively Co-segmentating Topically Related Images with Intelligent Scribble Guidance (2011) (116)
- Predicting Failures of Vision Systems (2014) (113)
- Embodied Question Answering in Photorealistic Environments With Point Cloud Perception (2019) (110)
- End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features (2018) (101)
- Neural Modular Control for Embodied Question Answering (2018) (100)
- Talk the Walk: Navigating New York City through Grounded Dialogue (2018) (98)
- Audio Visual Scene-Aware Dialog (2019) (96)
- Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback (2013) (93)
- From appearance to context-based recognition: Dense labeling in small images (2008) (88)
- WhittleSearch: Interactive Image Search with Relative Attribute Feedback (2015) (86)
- Zero-Shot Learning via Visual Abstraction (2014) (84)
- Understanding image virality (2015) (84)
- VisualWord2Vec (Vis-W2V): Learning Visually Grounded Word Embeddings Using Abstract Scenes (2015) (82)
- Adopting Abstract Images for Semantic Scene Understanding (2016) (80)
- Learning Common Sense through Visual Abstraction (2015) (78)
- Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline (2019) (74)
- Leveraging Visual Question Answering for Image-Caption Ranking (2016) (72)
- Evaluating Visual Conversational Agents via Cooperative Human-AI Games (2017) (70)
- KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA (2020) (69)
- CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog (2019) (66)
- C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset (2017) (65)
- Towards Transparent AI Systems: Interpreting Visual Question Answering Models (2016) (65)
- Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks (2015) (65)
- Automatic discovery of groups of objects for scene understanding (2012) (63)
- The role of features, algorithms and data in visual recognition (2010) (62)
- Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering (2019) (62)
- Visual Dialog (2019) (62)
- Do explanations make VQA models more predictable to a human? (2018) (61)
- Sort Story: Sorting Jumbled Images and Captions into Stories (2016) (60)
- SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation (2019) (60)
- CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication (2017) (56)
- Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment (2019) (56)
- Spatially Aware Multimodal Transformers for TextVQA (2020) (55)
- Pythia-A platform for vision & language research (2018) (54)
- Sim-to-Real Transfer for Vision-and-Language Navigation (2020) (53)
- Fashion++: Minimal Edits for Outfit Improvement (2019) (52)
- Semi-supervised co-training and active learning based approach for multi-view intrusion detection (2009) (51)
- Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions (2016) (49)
- Chasing Ghosts: Instruction Following as Bayesian State Tracking (2019) (48)
- Determining Patch Saliency Using Low-Level Context (2008) (48)
- Data Fusion and Cost Minimization for Intrusion Detection (2008) (47)
- Extracting adaptive contextual cues from unlabeled regions (2011) (47)
- Multi-attribute Queries: To Merge or Not to Merge? (2013) (46)
- Inference for order reduction in Markov random fields (2011) (45)
- Are we pretraining it right? Digging deeper into visio-linguistic pretraining (2020) (45)
- Recognizing jumbled images: The role of local and global information in image classification (2011) (44)
- Attribute Dominance: What Pops Out? (2013) (44)
- We are Humor Beings: Understanding and Predicting Visual Humor (2015) (43)
- Embodied Question Answering (2018) (42)
- Beyond trees: MRF inference via outer-planar decomposition (2010) (42)
- Towards Transparent Systems: Semantic Characterization of Failure Modes (2014) (41)
- Unsupervised learning of hierarchical spatial structures in images (2009) (41)
- It Takes Two to Tango: Towards Theory of AI's Mind (2017) (40)
- Image specificity (2015) (39)
- Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future (2019) (39)
- Finding the weakest link in person detectors (2011) (39)
- Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer (2022) (39)
- Exploring Tiny Images: The Roles of Appearance and Contextual Information for Machine and Human Object Recognition (2012) (39)
- Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition (2018) (39)
- An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage (2020) (37)
- Creative Sketch Generation (2020) (37)
- Audio Visual Scene-aware dialog (AVSD) Track for Natural Language Generation in DSTC7 (2019) (37)
- VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs (2021) (36)
- Emergence of Compositional Language with Deep Generational Transmission (2019) (36)
- Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs (2013) (36)
- Embodied Amodal Recognition: Learning to Move to Perceive Objects (2019) (36)
- SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions (2020) (35)
- The role of image understanding in contour detection (2012) (35)
- AudioGen: Textually Guided Audio Generation (2022) (33)
- Measuring Machine Intelligence Through Visual Question Answering (2016) (33)
- Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation (2020) (33)
- Embodied Multimodal Multitask Learning (2019) (31)
- Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance (2018) (30)
- Interactively Guiding Semi-Supervised Clustering via Attribute-Based Explanations (2014) (30)
- Human-Debugging of Machines (2011) (30)
- ForceNet: A Graph Neural Network for Large-Scale Quantum Calculations (2021) (29)
- Hierarchical Semantics of Objects (hSOs) (2007) (28)
- The Hateful Memes Challenge: Competition Report (2020) (27)
- Improving Generative Visual Dialog by Answering Diverse Questions (2019) (26)
- CoDraw: Visual Dialog for Collaborative Drawing (2017) (26)
- Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing (2013) (26)
- Dialog System Technology Challenge 7 (2019) (25)
- Active Learning for Visual Question Answering: An Empirical Study (2017) (25)
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7 (2018) (25)
- Embodied Visual Recognition (2019) (24)
- Seed Image Selection in interactive cosegmentation (2009) (24)
- Cooperative Learning with Visual Attributes (2017) (23)
- Hierarchical Co-Attention for Visual Question Answering (2016) (23)
- Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents (2020) (22)
- Sound-Word2Vec: Learning Word Representations Grounded in Sounds (2017) (22)
- Relative Attributes for Enhanced Human-Machine Communication (2012) (22)
- Cross-channel Communication Networks (2019) (22)
- Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding (2016) (22)
- Human-Adversarial Visual Question Answering (2021) (21)
- Modeling the Long Term Future in Model-Based Reinforcement Learning (2018) (19)
- What Makes a Photograph Memorable? (2014) (18)
- Where Are You? Localization from Embodied Dialog (2020) (17)
- SpaText: Spatio-Textual Representation for Controllable Image Generation (2022) (16)
- Feature-based Part Retrieval for Interactive 3D Reassembly (2007) (14)
- Contrast and Classify: Training Robust VQA Models (2021) (14)
- Decentralized Distributed PPO: Solving PointGoal Navigation (2019) (14)
- Implied Feedback: Learning Nuances of User Behavior in Image Search (2013) (13)
- Unsupervised Identification of Multiple Objects of Interest from Multiple Images: dISCOVER (2007) (12)
- Unsupervised Learning of Hierarchical Semantics of Objects (hSOs) (2007) (12)
- Which Edges Matter? (2013) (12)
- MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration (2022) (11)
- Interpreting Visual Question Answering Models (2016) (11)
- SQuINTing at VQA Models: Interrogating VQA Models with Sub-Questions (2020) (11)
- Object-Centric Diagnosis of Visual Reasoning (2020) (11)
- Episodic Memory Question Answering (2022) (9)
- Text-To-4D Dynamic Scene Generation (2023) (9)
- Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data (2020) (9)
- Dance2Music: Automatic Dance-driven Music Generation (2021) (8)
- VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator (2021) (8)
- Punny Captions: Witty Wordplay in Image Descriptions (2017) (8)
- Predicting User Annoyance Using Visual Attributes (2014) (8)
- CRFs for Image Classification (2006) (8)
- Feel The Music: Automatically Generating A Dance For An Input Song (2020) (7)
- Visual Conceptual Blending with Large-scale Language and Vision Models (2021) (6)
- Visual Landmark Selection for Generating Grounded and Interpretable Navigation Instructions (2019) (6)
- Interactive Co-segmentation of Objects in Image Collections (2011) (6)
- Trick or TReAT: Thematic Reinforcement for Artistic Typography (2019) (6)
- Visual Attributes (2017) (6)
- Telling Creative Stories Using Generative Visual Aids (2021) (5)
- Exploring Crowd Co-creation Scenarios for Sketches (2020) (5)
- Collecting Image Description Datasets using Crowdsourcing (2014) (5)
- Semantic classification of spacecraft's status: integrating system intelligence and human knowledge (2015) (4)
- Neuro-Symbolic Generative Art: A Preliminary Study (2020) (4)
- Contrast and Classify: Alternate Training for Robust VQA (2020) (3)
- Cutout-search: Putting a name to the picture (2009) (3)
- Decentralized Distributed PPO: Mastering PointGoal Navigation (2020) (3)
- Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning (2019) (3)
- Talk The Walk: Navigating Grids in New York City through Grounded Dialogue (2018) (3)
- Visual attributes for enhanced human-machine communication (2013) (3)
- Modeling context for image understanding: When, for what, and how? (2009) (2)
- The Open Catalyst Challenge 2021: Competition Report (2021) (2)
- Unsupervised Modeling of Objects and Their Hierarchical Contextual Interactions (2009) (2)
- IR-VIC: Unsupervised Discovery of Sub-goals for Transfer in RL (2019) (2)
- Propose and Re-rank Semantic Segmentation via Deep Image Classification (2014) (2)
- Bringing diverse classifiers to common grounds: dtransform (2008) (2)
- Introduction to Visual Attributes (2017) (2)
- Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018) (2019) (2)
- SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency (2020) (2)
- Cross-Task Knowledge Transfer for Visually-Grounded Navigation (2018) (2)
- Classification-Error Cost Minimization Strategy: DCMS (2007) (1)
- Predicting A Creator's Preferences In, and From, Interactive Generative Art (2020) (1)
- Learning Common Sense Through Visual Abstraction Supplementary Material (2015) (1)
- Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding (2014) (1)
- Color Source Separation for Enhanced Pixel Manipulations (2011) (1)
- Decentralized Distributed PPO (2019) (1)
- Knowing who to listen to: Prioritizing experts from a diverse ensemble for attribute personalization (2016) (1)
- Reframing Explanation as an Interactive Medium: The EQUAS (Explainable QUestion Answering System) Project (2021) (1)
- An Approach to Interactive Co-segmentation (2011) (1)
- Counterfactual reasoning: Do Language Models need world knowledge for causal inference? (2022) (1)
- Unsupervised Discovery of Decision States Through Intrinsic Control (Task-Agnostic Reinforcement Learning Workshop at ICLR 2019) (2019) (0)
- g 3 g 2 g 1 g 0 Answer Grey (2019) (0)
- Habitat Sim Generic Dataset Support Habitat API Habitat Platform (2019) (0)
- ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition (2022) (0)
- Text-Conditional Contextualized Avatars For Zero-Shot Personalization (2023) (0)
- Color Source Separation for Enhanced Pixel Manipulations (MSR-TR-2011-98) (2011) (0)
- Cloudiness” Prediction: Nameable Prediction: Unnameable (2011) (0)
- KRISP: Supplemental Material (2021) (0)
- Unsupervised Modeling of Objects and Their Hierarchical (2008) (0)
- Lemotif: Affective Visual Journal (2019) (0)
- WhittleSearch: Interactive Image Search with Relative Attribute Feedback (2015) (0)
- Lemotif: Abstract Visual Depictions of your Emotional States in Life (2019) (0)
- Habitat: A Platform for Embodied AI Research Supplemental Materials (2019) (0)
- Words, Pictures, and Common Sense (2015) (0)
- LANGUAGE MODELS MORE GROUNDED (2019) (0)
- THE FUTURE OF AUDIOVISUAL STORY TELLING (2019) (0)
- Lemotif: An Affective Visual Journal Using Deep Neural Networks (2019) (0)
- ALERT: Predicting Failures (Supplementary Material) (2014) (0)
- Video Scene-Aware Dialog Track in DSTC 7 (2018) (0)
- Beyond Trees: MAP Inference in MRFs via Outer-Planar Decomposition (2010) (0)
- AI-assisted Human creativity (2021) (0)
- ForceNet: A Graph Neural Network for Large-Scale Quantum Chemistry Calculations (2021) (0)
- DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL (2019) (0)
- Future of Co-segmentation (2011) (0)
- Fashion++: Minimal Edits for Outfit Improvement (Supplementary File) (2019) (0)
- Building Bridges: Generative Artworks to Explore AI Ethics (2021) (0)
- Contrast and Classify: Training Robust VQA Models (Supplementary) (2021) (0)
- Reframing Explanation as an Interactive Medium: The EQUAS (Explainable QUestion Answering System) Project (2021) (0)
- Do explanation modalities make VQA models more predictable to a human? (2018) (0)