Devi Parikh

Devi Parikh's AcademicInfluence.com Rankings

Engineering: World Rank #3360, Historical Rank #4428
Electrical Engineering: World Rank #743, Historical Rank #808

Computer Science: World Rank #4360, Historical Rank #4593
Algorithms: World Rank #138, Historical Rank #140
Artificial Intelligence: World Rank #1042, Historical Rank #1061
Database: World Rank #1573, Historical Rank #1651

Devi Parikh's Degrees

PhD Electrical and Computer Engineering Carnegie Mellon University
Masters Electrical and Computer Engineering Carnegie Mellon University

Why Is Devi Parikh Influential?

Devi Parikh's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization (2016) (10469)
VQA: Visual Question Answering (2015) (3664)
CIDEr: Consensus-based image description evaluation (2014) (2770)
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks (2019) (2022)
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering (2016) (1560)
Hierarchical Question-Image Co-Attention for Visual Question Answering (2016) (1292)
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization (2016) (1243)
Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning (2016) (1166)
Relative attributes (2011) (951)
Habitat: A Platform for Embodied AI Research (2019) (768)
Visual Dialog (2016) (685)
Joint Unsupervised Learning of Deep Representations and Image Clusters (2016) (665)
Graph R-CNN for Scene Graph Generation (2018) (611)
iCoseg: Interactive co-segmentation with intelligent scribble guidance (2010) (510)
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories (2016) (479)
Embodied Question Answering (2017) (446)
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering (2017) (418)
Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? (2016) (385)
Neural Baby Talk (2018) (362)
12-in-1: Multi-Task Vision and Language Representation Learning (2019) (350)
WhittleSearch: Image search with relative attribute feedback (2012) (335)
Counterfactual Visual Explanations (2019) (315)
Visual Storytelling (2016) (312)
Deal or No Deal? End-to-End Learning of Negotiation Dialogues (2017) (295)
Discovering localized attributes for fine-grained recognition (2012) (293)
ParlAI: A Dialog Research Software Platform (2017) (284)
Yin and Yang: Balancing and Answering Binary Visual Questions (2015) (278)
Grad-CAM: Why did you say that? (2016) (270)
Towards VQA Models That Can Read (2019) (263)
Analyzing the Behavior of Visual Question Answering Models (2016) (256)
Deep Learning the City: Quantifying Urban Perception at a Global Scale (2016) (254)
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames (2019) (247)
Interactively building a discriminative vocabulary of nameable attributes (2011) (240)
RUBi: Reducing Unimodal Biases in Visual Question Answering (2019) (231)
What Makes a Photograph Memorable? (2014) (214)
LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation (2016) (211)
TarMAC: Targeted Multi-Agent Communication (2018) (208)
Bringing Semantics into Focus Using Visual Abstraction (2013) (189)
Understanding the Intrinsic Memorability of Images (2011) (177)
Pythia v0.1: the Winning Entry to the VQA Challenge 2018 (2018) (174)
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded (2019) (169)
Learning the Visual Interpretation of Sentences (2013) (156)
The Open Catalyst 2020 (OC20) Dataset and Community Challenges (2020) (155)
A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories (2016) (153)
nocaps: novel object captioning at scale (2019) (151)
Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors (2022) (144)
Localization and Segmentation of A 2D High Capacity Color Barcode (2008) (141)
Make-A-Video: Text-to-Video Generation without Text-Video Data (2022) (140)
Visual Coreference Resolution in Visual Dialog using Neural Module Networks (2018) (135)
Attributes for Classifier Feedback (2012) (133)
Context-Aware Captions from Context-Agnostic Supervision (2017) (130)
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web (2020) (125)
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model (2017) (125)
Cycle-Consistency for Robust Visual Question Answering (2019) (118)
Counting Everyday Objects in Everyday Scenes (2016) (116)
Interactively Co-segmentating Topically Related Images with Intelligent Scribble Guidance (2011) (116)
Predicting Failures of Vision Systems (2014) (113)
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception (2019) (110)
End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features (2018) (101)
Neural Modular Control for Embodied Question Answering (2018) (100)
Talk the Walk: Navigating New York City through Grounded Dialogue (2018) (98)
Audio Visual Scene-Aware Dialog (2019) (96)
Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback (2013) (93)
From appearance to context-based recognition: Dense labeling in small images (2008) (88)
WhittleSearch: Interactive Image Search with Relative Attribute Feedback (2015) (86)
Zero-Shot Learning via Visual Abstraction (2014) (84)
Understanding image virality (2015) (84)
VisualWord2Vec (Vis-W2V): Learning Visually Grounded Word Embeddings Using Abstract Scenes (2015) (82)
Adopting Abstract Images for Semantic Scene Understanding (2016) (80)
Learning Common Sense through Visual Abstraction (2015) (78)
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline (2019) (74)
Leveraging Visual Question Answering for Image-Caption Ranking (2016) (72)
Evaluating Visual Conversational Agents via Cooperative Human-AI Games (2017) (70)
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA (2020) (69)
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog (2019) (66)
C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset (2017) (65)
Towards Transparent AI Systems: Interpreting Visual Question Answering Models (2016) (65)
Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks (2015) (65)
Automatic discovery of groups of objects for scene understanding (2012) (63)
The role of features, algorithms and data in visual recognition (2010) (62)
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering (2019) (62)
Visual Dialog (2019) (62)
Do explanations make VQA models more predictable to a human? (2018) (61)
Sort Story: Sorting Jumbled Images and Captions into Stories (2016) (60)
SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation (2019) (60)
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication (2017) (56)
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment (2019) (56)
Spatially Aware Multimodal Transformers for TextVQA (2020) (55)
Pythia-A platform for vision & language research (2018) (54)
Sim-to-Real Transfer for Vision-and-Language Navigation (2020) (53)
Fashion++: Minimal Edits for Outfit Improvement (2019) (52)
Semi-supervised co-training and active learning based approach for multi-view intrusion detection (2009) (51)
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions (2016) (49)
Chasing Ghosts: Instruction Following as Bayesian State Tracking (2019) (48)
Determining Patch Saliency Using Low-Level Context (2008) (48)
Data Fusion and Cost Minimization for Intrusion Detection (2008) (47)
Extracting adaptive contextual cues from unlabeled regions (2011) (47)
Multi-attribute Queries: To Merge or Not to Merge? (2013) (46)
Inference for order reduction in Markov random fields (2011) (45)
Are we pretraining it right? Digging deeper into visio-linguistic pretraining (2020) (45)
Recognizing jumbled images: The role of local and global information in image classification (2011) (44)
Attribute Dominance: What Pops Out? (2013) (44)
We are Humor Beings: Understanding and Predicting Visual Humor (2015) (43)
Embodied Question Answering (2018) (42)
Beyond trees: MRF inference via outer-planar decomposition (2010) (42)
Towards Transparent Systems: Semantic Characterization of Failure Modes (2014) (41)
Unsupervised learning of hierarchical spatial structures in images (2009) (41)
It Takes Two to Tango: Towards Theory of AI's Mind (2017) (40)
Image specificity (2015) (39)
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future (2019) (39)
Finding the weakest link in person detectors (2011) (39)
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer (2022) (39)
Exploring Tiny Images: The Roles of Appearance and Contextual Information for Machine and Human Object Recognition (2012) (39)
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition (2018) (39)
An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage (2020) (37)
Creative Sketch Generation (2020) (37)
Audio Visual Scene-aware dialog (AVSD) Track for Natural Language Generation in DSTC7 (2019) (37)
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs (2021) (36)
Emergence of Compositional Language with Deep Generational Transmission (2019) (36)
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs (2013) (36)
Embodied Amodal Recognition: Learning to Move to Perceive Objects (2019) (36)
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions (2020) (35)
The role of image understanding in contour detection (2012) (35)
AudioGen: Textually Guided Audio Generation (2022) (33)
Measuring Machine Intelligence Through Visual Question Answering (2016) (33)
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation (2020) (33)
Embodied Multimodal Multitask Learning (2019) (31)
Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance (2018) (30)
Interactively Guiding Semi-Supervised Clustering via Attribute-Based Explanations (2014) (30)
Human-Debugging of Machines (2011) (30)
ForceNet: A Graph Neural Network for Large-Scale Quantum Calculations (2021) (29)
Hierarchical Semantics of Objects (hSOs) (2007) (28)
The Hateful Memes Challenge: Competition Report (2020) (27)
Improving Generative Visual Dialog by Answering Diverse Questions (2019) (26)
CoDraw: Visual Dialog for Collaborative Drawing (2017) (26)
Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing (2013) (26)
Dialog System Technology Challenge 7 (2019) (25)
Active Learning for Visual Question Answering: An Empirical Study (2017) (25)
Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7 (2018) (25)
Embodied Visual Recognition (2019) (24)
Seed Image Selection in interactive cosegmentation (2009) (24)
Cooperative Learning with Visual Attributes (2017) (23)
Hierarchical Co-Attention for Visual Question Answering (2016) (23)
Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents (2020) (22)
Sound-Word2Vec: Learning Word Representations Grounded in Sounds (2017) (22)
Relative Attributes for Enhanced Human-Machine Communication (2012) (22)
Cross-channel Communication Networks (2019) (22)
Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding (2016) (22)
Human-Adversarial Visual Question Answering (2021) (21)
Modeling the Long Term Future in Model-Based Reinforcement Learning (2018) (19)
What Makes a Photograph Memorable? (2014) (18)
Where Are You? Localization from Embodied Dialog (2020) (17)
SpaText: Spatio-Textual Representation for Controllable Image Generation (2022) (16)
Feature-based Part Retrieval for Interactive 3D Reassembly (2007) (14)
Contrast and Classify: Training Robust VQA Models (2021) (14)
Decentralized Distributed PPO: Solving PointGoal Navigation (2019) (14)
Implied Feedback: Learning Nuances of User Behavior in Image Search (2013) (13)
Unsupervised Identification of Multiple Objects of Interest from Multiple Images: dISCOVER (2007) (12)
Unsupervised Learning of Hierarchical Semantics of Objects (hSOs) (2007) (12)
Which Edges Matter? (2013) (12)
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration (2022) (11)
Interpreting Visual Question Answering Models (2016) (11)
SQuINTing at VQA Models: Interrogating VQA Models with Sub-Questions (2020) (11)
Object-Centric Diagnosis of Visual Reasoning (2020) (11)
Episodic Memory Question Answering (2022) (9)
Text-To-4D Dynamic Scene Generation (2023) (9)
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data (2020) (9)
Dance2Music: Automatic Dance-driven Music Generation (2021) (8)
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator (2021) (8)
Punny Captions: Witty Wordplay in Image Descriptions (2017) (8)
Predicting User Annoyance Using Visual Attributes (2014) (8)
CRFs for Image Classification (2006) (8)
Feel The Music: Automatically Generating A Dance For An Input Song (2020) (7)
Visual Conceptual Blending with Large-scale Language and Vision Models (2021) (6)
Visual Landmark Selection for Generating Grounded and Interpretable Navigation Instructions (2019) (6)
Interactive Co-segmentation of Objects in Image Collections (2011) (6)
Trick or TReAT: Thematic Reinforcement for Artistic Typography (2019) (6)
Visual Attributes (2017) (6)
Telling Creative Stories Using Generative Visual Aids (2021) (5)
Exploring Crowd Co-creation Scenarios for Sketches (2020) (5)
Collecting Image Description Datasets using Crowdsourcing (2014) (5)
Semantic classification of spacecraft's status: integrating system intelligence and human knowledge (2015) (4)
Neuro-Symbolic Generative Art: A Preliminary Study (2020) (4)
Contrast and Classify: Alternate Training for Robust VQA (2020) (3)
Cutout-search: Putting a name to the picture (2009) (3)
Decentralized Distributed PPO: Mastering PointGoal Navigation (2020) (3)
Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning (2019) (3)
Talk The Walk: Navigating Grids in New York City through Grounded Dialogue (2018) (3)
Visual attributes for enhanced human-machine communication (2013) (3)
Modeling context for image understanding: When, for what, and how? (2009) (2)
The Open Catalyst Challenge 2021: Competition Report (2021) (2)
Unsupervised Modeling of Objects and Their Hierarchical Contextual Interactions (2009) (2)
IR-VIC: Unsupervised Discovery of Sub-goals for Transfer in RL (2019) (2)
Propose and Re-rank Semantic Segmentation via Deep Image Classification (2014) (2)
Bringing diverse classifiers to common grounds: dtransform (2008) (2)
Introduction to Visual Attributes (2017) (2)
Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018) (2019) (2)
SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency (2020) (2)
Cross-Task Knowledge Transfer for Visually-Grounded Navigation (2018) (2)
Classification-Error Cost Minimization Strategy: DCMS (2007) (1)
Predicting A Creator's Preferences In, and From, Interactive Generative Art (2020) (1)
Learning Common Sense Through Visual Abstraction Supplementary Material (2015) (1)
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding (2014) (1)
Color Source Separation for Enhanced Pixel Manipulations (2011) (1)
Decentralized Distributed PPO (2019) (1)
Knowing who to listen to: Prioritizing experts from a diverse ensemble for attribute personalization (2016) (1)
Reframing Explanation as an Interactive Medium: The EQUAS (Explainable QUestion Answering System) Project (2021) (1)
An Approach to Interactive Co-segmentation (2011) (1)
Counterfactual reasoning: Do Language Models need world knowledge for causal inference? (2022) (1)
Unsupervised Discovery of Decision States Through Intrinsic Control (presented at the Task-Agnostic Reinforcement Learning Workshop at ICLR 2019) (2019) (0)
g 3 g 2 g 1 g 0 Answer Grey (2019) (0)
Habitat Sim Generic Dataset Support Habitat API Habitat Platform (2019) (0)
ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition (2022) (0)
Text-Conditional Contextualized Avatars For Zero-Shot Personalization (2023) (0)
Color Source Separation for Enhanced Pixel Manipulations (MSR-TR-2011-98) (2011) (0)
Cloudiness” Prediction: Nameable Prediction: Unnameable (2011) (0)
KRISP: Supplemental Material (2021) (0)
Unsupervised Modeling of Objects and Their Hierarchical (2008) (0)
Lemotif: Affective Visual Journal (2019) (0)
WhittleSearch: Interactive Image Search with Relative Attribute Feedback (2015) (0)
Lemotif: Abstract Visual Depictions of your Emotional States in Life (2019) (0)
Habitat: A Platform for Embodied AI Research Supplemental Materials (2019) (0)
Words, Pictures, and Common Sense (2015) (0)
LANGUAGE MODELS MORE GROUNDED (2019) (0)
THE FUTURE OF AUDIOVISUAL STORY TELLING (2019) (0)
Lemotif: An Affective Visual Journal Using Deep Neural Networks (2019) (0)
ALERT: Predicting Failures (Supplementary Material) (2014) (0)
Video Scene-Aware Dialog Track in DSTC 7 (2018) (0)
Beyond Trees: MAP Inference in MRFs via OuterPlanar Decomposition (2010) (0)
AI-assisted Human creativity (2021) (0)
ForceNet: A Graph Neural Network for Large-Scale Quantum Chemistry Calculations (2021) (0)
DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL (2019) (0)
Future of Co-segmentation (2011) (0)
Fashion++: Minimal Edits for Outfit Improvement (Supplementary File) (2019) (0)
Building Bridges: Generative Artworks to Explore AI Ethics (2021) (0)
Contrast and Classify: Training Robust VQA Models (Supplementary) (2021) (0)
Reframing Explanation as an Interactive Medium: The EQUAS (Explainable QUestion Answering System) Project (2021) (0)
Do explanation modalities make VQA models more predictable to a human (2018) (0)
