Abhinav Kumar Gupta
#112,604
Most Influential Person Now
Abhinav Kumar Gupta's AcademicInfluence.com Rankings
Abhinav Kumar Gupta Computer Science Degrees
- Computer Science: World Rank #4286, Historical Rank #4511
- Algorithms: World Rank #134, Historical Rank #136
- Machine Learning: World Rank #797, Historical Rank #808
- Artificial Intelligence: World Rank #1006, Historical Rank #1024

Computer Science
Abhinav Kumar Gupta's Degrees
- PhD Computer Science Stanford University
- Masters Computer Science Stanford University
Why Is Abhinav Kumar Gupta Influential?
Abhinav Kumar Gupta's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- Non-local Neural Networks (2017) (5920)
- Unsupervised Visual Representation Learning by Context Prediction (2015) (2226)
- Training Region-Based Object Detectors with Online Hard Example Mining (2016) (1921)
- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era (2017) (1672)
- Target-driven visual navigation in indoor scenes using deep reinforcement learning (2016) (1175)
- Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours (2015) (987)
- Ensemble of exemplar-SVMs for object detection and beyond (2011) (940)
- Cross-Stitch Networks for Multi-task Learning (2016) (893)
- Never-Ending Learning (2012) (870)
- Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding (2016) (858)
- The Visual Object Tracking VOT2016 Challenge Results (2016) (702)
- What makes Paris look like Paris? (2012) (673)
- Learning a Predictable and Generative Vector Representation for Objects (2016) (621)
- AI2-THOR: An Interactive 3D Environment for Visual AI (2017) (615)
- Videos as Space-Time Region Graphs (2018) (603)
- Unsupervised Discovery of Mid-Level Discriminative Patches (2012) (583)
- Generative Image Modeling Using Style and Structure Adversarial Networks (2016) (576)
- Robust Adversarial Reinforcement Learning (2017) (565)
- Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition (2009) (524)
- A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection (2017) (491)
- An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders (2016) (478)
- Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs (2018) (464)
- NEIL: Extracting Visual Knowledge from Web Data (2013) (462)
- Learning from Noisy Large-Scale Datasets with Minimal Supervision (2017) (397)
- ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification (2017) (394)
- Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics (2010) (378)
- The Visual Object Tracking VOT2015 Challenge Results (2018) (356)
- Scaling and Benchmarking Self-Supervised Visual Representation Learning (2019) (322)
- Transferring Rich Feature Hierarchies for Robust Visual Tracking (2015) (317)
- Designing deep networks for surface normal estimation (2014) (315)
- Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces (2010) (309)
- Webly Supervised Learning of Convolutional Networks (2015) (306)
- The Pose Knows: Video Forecasting by Generating Pose Futures (2017) (300)
- Beyond Skip Connections: Top-Down Modulation for Object Detection (2016) (298)
- The More You Know: Using Knowledge Graphs for Image Classification (2016) (286)
- Understanding videos, constructing plots learning a visually grounded storyline model from annotated videos (2009) (283)
- Data-driven visual similarity for cross-domain image matching (2011) (283)
- Mid-level Visual Element Discovery as Discriminative Mode Seeking (2013) (275)
- Objects in Action: An Approach for Combining Action Understanding and Object Perception (2007) (267)
- Learning to Explore using Active Neural SLAM (2020) (259)
- From 3D scene geometry to human workspace (2011) (257)
- Self-Supervised Exploration via Disagreement (2019) (248)
- Learning to fly by crashing (2017) (243)
- Unsupervised Learning of Visual Representations Using Videos (2015) (230)
- Object Goal Navigation using Goal-Oriented Semantic Exploration (2020) (225)
- Visual Semantic Navigation using Scene Priors (2018) (216)
- Actions ~ Transformations (2015) (215)
- Marr Revisited: 2D-3D Alignment via Surface Normal Prediction (2016) (212)
- Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers (2008) (204)
- Patch to the Future: Unsupervised Visual Prediction (2014) (197)
- Dense Optical Flow Prediction from a Static Image (2015) (194)
- Iterative Visual Reasoning Beyond Convolutions (2018) (184)
- Data-Driven 3D Primitives for Single Image Understanding (2013) (174)
- The Curious Robot: Learning Visual Representations via Physical Interactions (2016) (173)
- People Watching: Human Actions as a Cue for Single View Geometry (2012) (172)
- From Red Wine to Red Tomato: Composition with Context (2017) (170)
- Transitive Invariance for Self-Supervised Visual Representation Learning (2017) (168)
- Neural Topological SLAM for Visual Navigation (2020) (159)
- Asynchronous Temporal Fields for Action Recognition (2016) (159)
- Representing Videos Using Mid-level Discriminative Patches (2013) (157)
- Learning Exploration Policies for Navigation (2019) (150)
- Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases (2020) (146)
- Contextual Priming and Feedback for Faster R-CNN (2016) (143)
- Spatial Memory for Context Reasoning in Object Detection (2017) (137)
- Beyond Grids: Learning Graph Representations for Visual Recognition (2018) (136)
- An Implementation of Faster RCNN with Study for Region Sampling (2017) (132)
- Visual Semantic Planning Using Deep Successor Representations (2017) (126)
- What Actions are Needed for Understanding Human Actions in Videos? (2017) (121)
- Enriching Visual Knowledge Bases via Object Discovery and Segmentation (2014) (120)
- Scene Semantics from Long-Term Observation of People (2012) (115)
- "What Happens If..." Learning to Predict the Effect of Forces in Images (2016) (115)
- PixelNet: Representation of the pixels, by the pixels, and for the pixels (2017) (111)
- Compositional Learning for Human Object Interaction (2018) (111)
- Learning to push by grasping: Using multiple tasks for effective learning (2016) (110)
- Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias (2018) (106)
- Task-Driven Modular Networks for Zero-Shot Compositional Learning (2019) (105)
- ClusterFit: Improving Generalization of Visual Representations (2019) (96)
- Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes (2012) (94)
- Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos (2018) (92)
- Actor and Observer: Joint Modeling of First and Third-Person Videos (2018) (91)
- Unfolding an Indoor Origami World (2014) (90)
- Learning 6-DOF Grasping Interaction via Deep Geometry-Aware 3D Representations (2017) (89)
- Beyond active noun tagging: Modeling contextual interactions for multi-class active learning (2010) (86)
- PyRobot: An Open-source Robotics Framework for Research and Benchmarking (2019) (86)
- Temporal Dynamic Graph LSTM for Action-Driven Video Object Detection (2017) (82)
- Canonical Surface Mapping via Geometric Cycle Consistency (2019) (80)
- BOLD5000, a public fMRI dataset while viewing 5000 visual images (2019) (75)
- Supervision via competition: Robot adversaries for learning tasks (2016) (73)
- The Visual Object Tracking VOT 2016 Challenge Results (2018) (73)
- KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA (2020) (69)
- Articulation-Aware Canonical Surface Mapping (2020) (67)
- Non-linear Dimensionality Reduction by Locally Linear Isomaps (2004) (67)
- Multiple Interactions Made Easy (MIME): Large Scale Demonstrations Data for Imitation (2018) (66)
- Compositional Video Prediction (2019) (66)
- Constraint Integration for Efficient Multiview Pose Estimation with Self-Occlusions (2008) (65)
- Context and observation driven latent variable model for human pose estimation (2008) (64)
- Learning by Asking Questions (2017) (63)
- Hardware Conditioned Policies for Multi-Robot Transfer Learning (2018) (60)
- Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller (2019) (59)
- Binge Watching: Scaling Affordance Learning from Sitcoms (2017) (58)
- Environment Probing Interaction Policies (2019) (57)
- PixelNet: Towards a General Pixel-level Architecture (2016) (55)
- Where2Act: From Pixels to Actions for Articulated 3D Objects (2021) (54)
- The Unsurprising Effectiveness of Pre-Trained Vision Models for Control (2022) (52)
- Interpretable Intuitive Physics Model (2018) (52)
- Learning to Grasp Without Seeing (2018) (50)
- Visual Imitation Made Easy (2020) (50)
- Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning (2020) (50)
- Context as Supervisory Signal: Discovering Objects with Predictable Context (2014) (44)
- Learning Robot Skills with Temporal Variational Inference (2020) (43)
- Shelf-Supervised Mesh Prediction in the Wild (2021) (42)
- Building Part-Based Object Detectors via 3D Geometry (2013) (42)
- See, Hear, Explore: Curiosity via Audio-Visual Association (2020) (40)
- Implicit Mesh Reconstruction from Unannotated Image Collections (2020) (40)
- Discovering Motor Programs by Recomposing Demonstrations (2020) (39)
- Learning What and How of Contextual Models for Scene Labeling (2010) (38)
- What makes Paris look like Paris? (2015) (38)
- Transformers for One-Shot Visual Imitation (2020) (37)
- Learning Visual Storylines with Skipping Recurrent Neural Networks (2016) (37)
- COST: An Approach for Camera Selection and Multi-Object Inference Ordering in Dynamic Scenes (2007) (36)
- Semantic Curiosity for Active Visual Learning (2020) (35)
- BBNVISER: BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems (2012) (35)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning (2020) (34)
- 3D-RelNet: Joint Object and Relational Network for 3D Prediction (2019) (34)
- Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (2020) (33)
- Efficient Bimanual Manipulation Using Learned Task Schemas (2019) (31)
- CASSL: Curriculum Accelerated Self-Supervised Learning (2017) (31)
- Sense discovery via co-clustering on images and text (2015) (30)
- Object-centric Forward Modeling for Model Predictive Control (2019) (29)
- ReSkin: versatile, replaceable, lasting tactile skins (2021) (28)
- BOLD5000: A public fMRI dataset of 5000 images (2018) (28)
- Dynamics-aware Embeddings (2019) (27)
- Learning To Explore Using Active Neural Mapping (2020) (27)
- Piecing together the segmentation jigsaw using context (2011) (25)
- 3D Shape Attributes (2016) (24)
- Much Ado About Time: Exhaustive Annotation of Temporal Data (2016) (24)
- Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning (2020) (23)
- Same Object, Different Grasps: Data and Semantic Knowledge for Task-Oriented Grasping (2020) (22)
- Single Image 3D without a Single 3D Image (2015) (21)
- Pose from Action: Unsupervised Learning of Pose Features based on Motion (2016) (21)
- Swoosh! Rattle! Thump! - Actions that Sound (2020) (21)
- Wanderlust: Online Continual Object Detection in the Real World (2021) (19)
- Learn-to-Race: A Multimodal Control Environment for Autonomous Racing (2021) (19)
- CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents (2021) (19)
- Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces (2019) (19)
- Audio-Visual Floorplan Reconstruction (2020) (19)
- Aligning Videos in Space and Time (2020) (18)
- What's in a Question: Using Visual Questions as a Form of Supervision (2017) (18)
- Interesting Object, Curious Agent: Learning Task-Agnostic Exploration (2021) (18)
- In Defense of the Direct Perception of Affordances (2015) (17)
- Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies (2019) (17)
- WebVision Challenge: Visual Learning and Understanding With Web Data (2017) (17)
- HASVRec: A modularized Hierarchical Attention-based Scholarly Venue Recommender system (2020) (16)
- Intrinsic Motivation for Encouraging Synergistic Behavior (2020) (15)
- Exemplar-SVMs for Visual Object Detection, Label Transfer and Image Retrieval (2012) (15)
- Extracting regions of symmetry (2005) (15)
- What's in your hands? 3D Reconstruction of Generic Objects in Hands (2022) (13)
- Mid-level Elements for Object Detection (2015) (12)
- Hierarchical Neural Dynamic Policies (2021) (10)
- Applying artificial vision models to human scene understanding (2015) (10)
- A "Shape Aware" Model for semi-supervised Learning of Objects and its Context (2008) (9)
- Pretrain, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction (2022) (9)
- Multi-Contrast Convolution Neural Network and Fast Feature Embedding for Multi-Class Tyre Defect Detection (2020) (8)
- From Images to 3D Shape Attributes (2016) (8)
- Modular Visual Navigation using Active Neural Mapping (2019) (8)
- Constraint Integration for Multiview Pose Estimation of Humans with Self-Occlusions (2006) (7)
- Learning Grasping Interaction with Geometry-aware 3D Representations (2017) (7)
- Beyond Nouns and Verbs (2009) (7)
- PixelTransformer: Sample Conditioned Signal Generation (2021) (6)
- Understanding Higher-Order Shape via 3D Shape Attributes (2016) (5)
- Learning State-Aware Visual Representations from Audible Interactions (2022) (5)
- The Functional Correspondence Problem (2021) (4)
- DeepMPCVS: Deep Model Predictive Control for Visual Servoing (2021) (4)
- Analyzing structural priors in multi-agent communication (2020) (4)
- Cutting through the clutter: Task-relevant features for image matching (2016) (3)
- Scaling Up Neural Datasets: A public fMRI dataset of 5000 scenes (2018) (3)
- Robust control design for inverted pendulum system with uncertain disturbances (2016) (3)
- Supervoxel Attention Graphs for Long-Range Video Modeling (2021) (3)
- Beyond the Camera: Neural Networks in World Coordinates (2020) (2)
- droidlet: modular, heterogenous, multi-modal agents (2021) (2)
- A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation (2021) (2)
- Supersizing Self-Supervision: Learning Perception and Action Without Human Supervision (2016) (2)
- Watermarking of MPEG-4 Videos (2004) (2)
- Item recommender system by incorporating metadata information into ternary semantic analysis (2012) (1)
- Comparative Study of Optimization Techniques for Tuning of PI Gains for Greenhouse Climate Control (2019) (1)
- People Watching: Human Actions as a Cue for Single View Geometry (2014) (1)
- Beyond Games: Bringing Exploration to Robots in Real-world (2018) (1)
- Compositional Video Prediction (Supplementary) (2019) (0)
- 2 SIFT Feature Based Object Detection (2005) (0)
- Towards a model for mid-level feature representation of scenes (2014) (0)
- Scene-Space Encoding within the Functional Scene-Selective Network. (2015) (0)
- RB2: Robotics Benchmarking with a Twist (2021) (0)
- Appendix: Asynchronous Temporal Fields for Action Recognition (2017) (0)
- Structural Inductive Biases in Emergent Communication (2020) (0)
- Analyzing Visual Semantic Processing and Recognizing the Shape Changes in Video (2003) (0)
- Empirically Verifying Hypotheses Using Reinforcement Learning (2020) (0)
- Measuring and increasing the capacity of Natural HOG Statistics (2017) (0)
- Audio-Visual Floorplan Reconstruction Supplementary Material (2021) (0)
- Supplementary Material for Aligning Videos in Space and Time (2020) (0)
- Visualisation of Relationships Among Library Users Based on Library Circulation Data (2010) (0)
- Hardware Setup : Our robot consists of a Dobot Magician robotic arm (2018) (0)
- Neural Topological SLAM for Visual Navigation: Supplementary Material (2020) (0)
- Self-Activating Neural Ensembles for Continual Reinforcement Learning (2021) (0)
- Deep Learning based Scene Agnostic Image Based Visual Servoing (2021) (0)
- Learning to predict grasping interaction with geometry-aware 3D representations (2010) (0)
- Rich Representations with Exposed Semantics for Deep Visual Reasoning (2016) (0)
- BOLD5000, a public fMRI dataset while viewing 5000 visual images (2019) (0)
- Supplementary Material Canonical Surface Mapping via Geometric Cycle Consistency (2019) (0)
- KRISP: Supplemental Material (2021) (0)
- Computational Models for Object Detection and Recognition (2004) (0)
- Supplementary Material: Articulation-aware Canonical Surface Mapping (2020) (0)
- A public fMRI dataset of 5000 scenes: a resource for human vision science (2018) (0)
- Leveraging Inexpensive Supervision Signals for Visual Learning (2017) (0)
What Schools Are Affiliated With Abhinav Kumar Gupta?
Abhinav Kumar Gupta is affiliated with the following schools: