I. V. Laptev
#117,293
Most Influential Person Now
I. V. Laptev's AcademicInfluence.com Rankings
I. V. Laptev Computer Science Degrees
Computer Science: World Rank #4620, Historical Rank #4873
Database: World Rank #1827, Historical Rank #1916

Computer Science
I. V. Laptev's Degrees
- Master's, Computer Science, University of Oxford
- Bachelor's, Computer Science, University of Oxford
Why Is I. V. Laptev Influential?
I. V. Laptev's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author
Published Works
- On Space-Time Interest Points (2003) (4004)
- Recognizing human actions: a local SVM approach (2004) (3881)
- Learning realistic human actions from movies (2008) (3742)
- Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks (2014) (3001)
- Evaluation of Local Spatio-temporal Features for Action Recognition (2009) (1492)
- Actions in context (2009) (1338)
- Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding (2016) (858)
- Is object localization for free? - Weakly-supervised learning with convolutional neural networks (2015) (838)
- Long-Term Temporal Convolutions for Action Recognition (2016) (823)
- Learning from Synthetic Humans (2017) (746)
- HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips (2019) (593)
- P-CNN: Pose-Based CNN Features for Action Recognition (2015) (535)
- Retrieving actions in movies (2007) (495)
- Segmenter: Transformer for Semantic Segmentation (2021) (479)
- End-to-End Learning of Visual Representations From Uncurated Instructional Videos (2019) (457)
- View-Independent Action Recognition from Temporal Self-Similarities (2011) (430)
- Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering (2002) (383)
- Density-aware person detection and tracking in crowds (2011) (349)
- BodyNet: Volumetric Inference of 3D Human Body Shapes (2018) (341)
- The THUMOS challenge on action recognition for videos "in the wild" (2016) (331)
- Video copy detection: a comparative study (2007) (322)
- Automatic annotation of human actions in video (2009) (303)
- Automatic extraction of roads from aerial images based on scale space and snakes (2000) (292)
- Learnable pooling with Context Gating for video classification (2017) (283)
- Learning Joint Reconstruction of Hands and Manipulated Objects (2019) (276)
- Recognizing human actions in still images: a study of bag-of-features and part-based representations (2010) (274)
- ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization (2016) (272)
- Cross-View Action Recognition from Temporal Self-similarities (2008) (251)
- Efficient Feature Extraction, Encoding, and Classification for Action Recognition (2014) (241)
- Unsupervised Learning from Narrated Instruction Videos (2015) (236)
- Local Descriptors for Spatio-temporal Recognition (2004) (232)
- Track to the future: Spatio-temporal video segmentation with long-range motion cues (2011) (230)
- Data-driven crowd analysis in videos (2011) (226)
- Object Detection Using Strongly-Supervised Deformable Part Models (2012) (224)
- Weakly Supervised Action Labeling in Videos under Ordering Constraints (2014) (219)
- Improvements of Object Detection Using Boosted Histograms (2006) (200)
- Local velocity-adapted motion events for spatio-temporal recognition (2007) (191)
- XCiT: Cross-Covariance Image Transformers (2021) (183)
- Learning a Text-Video Embedding from Incomplete and Heterogeneous Data (2018) (174)
- People Watching: Human Actions as a Cue for Single View Geometry (2012) (172)
- Weakly-Supervised Learning of Visual Relations (2017) (166)
- Learning person-object interactions for action recognition in still images (2011) (162)
- Finding Actors and Actions in Movies (2013) (146)
- Improving object detection with boosted histograms (2009) (138)
- Improving bag-of-features action recognition with non-local cues (2010) (135)
- On pairwise costs for network flow multi-object tracking (2014) (130)
- Cross-Task Weakly Supervised Learning From Instructional Videos (2019) (128)
- Weakly-Supervised Alignment of Video with Text (2015) (125)
- Unsupervised Object Discovery and Tracking in Video Collections (2015) (123)
- Context-Aware CNNs for Person Head Detection (2015) (122)
- Scene Semantics from Long-Term Observation of People (2012) (115)
- Just Ask: Learning to Answer Questions from Millions of Narrated Videos (2020) (101)
- Detecting Unseen Visual Relations Using Analogies (2018) (100)
- Periodic motion detection and segmentation via approximate sequence alignment (2005) (100)
- Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction (2020) (97)
- Tracking of Multi-state Hand Models Using Particle Filtering and a Hierarchy of Multi-scale Image Features (2001) (94)
- Training Vision Transformers for Image Retrieval (2021) (89)
- Weakly supervised object recognition with convolutional neural networks (2014) (88)
- Multi-scale and Snakes for Automatic Road Extraction (1998) (84)
- Velocity adaptation of space-time interest points (2004) (82)
- Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers (2021) (74)
- Are Large-scale Datasets Necessary for Self-Supervised Pre-training? (2021) (74)
- Deep Metric Learning Beyond Binary Supervision (2019) (73)
- AUTOMATIC ROAD EXTRACTION BASED ON MULTI-SCALE MODELING, CONTEXT, AND SNAKES (2002) (67)
- Joint Discovery of Object States and Manipulation Actions (2017) (63)
- History Aware Multimodal Transformer for Vision-and-Language Navigation (2021) (59)
- Leveraging the Present to Anticipate the Future in Videos (2019) (56)
- Predicting Actions from Static Scenes (2014) (55)
- Synthetic Humans for Action Recognition from Unseen Viewpoints (2019) (52)
- Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video (2019) (49)
- Will person detection help bag-of-features action recognition? (2010) (49)
- Joint pose estimation and action recognition in image graphs (2011) (48)
- A Prototype System for Computer Vision Based Human Computer Interaction (2001) (44)
- Goal-Conditioned Reinforcement Learning with Imagined Subgoals (2021) (42)
- Airbert: In-domain Pretraining for Vision-and-Language Navigation (2021) (42)
- Learning from Video and Text via Large-Scale Discriminative Clustering (2017) (41)
- Local spatio-temporal image features for motion interpretation (2004) (39)
- Monte-Carlo Tree Search for Efficient Visually Guided Rearrangement Planning (2019) (36)
- Multi-view Synchronization of Human Actions and Dynamic Scenes (2009) (35)
- Instance-Level Video Segmentation from Object Tracks (2016) (34)
- Interest Point Detection and Scale Selection in Space-Time (2003) (33)
- A flexible model for training action localization with varying levels of supervision (2018) (33)
- Pose Estimation and Segmentation of People in 3D Movies (2013) (32)
- Learning to Augment Synthetic Images for Sim2Real Policy Transfer (2019) (31)
- Differentiable Simulation for Physical System Identification (2021) (31)
- Velocity adaptation of spatio-temporal receptive fields for direct recognition of activities: an experimental study (2004) (28)
- Learning Interactions and Relationships Between Movie Characters (2020) (25)
- MobileFace: 3D Face Reconstruction with Efficient CNN Regression (2018) (25)
- Much Ado About Time: Exhaustive Annotation of Temporal Data (2016) (24)
- Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (2022) (24)
- Semi-supervised Learning of Facial Attributes in Video (2010) (24)
- Learning to combine primitive skills: A step towards versatile robotic manipulation (2019) (23)
- Zero-Shot Video Question Answering via Frozen Bidirectional Language Models (2022) (22)
- Actlets: A novel local representation for human action recognition in video (2012) (21)
- TubeDETR: Spatio-Temporal Video Grounding with Transformers (2022) (20)
- Extraction of linear objects from interferometric SAR data (2002) (20)
- Thin-Slicing for Pose: Learning to Understand Pose without Explicit Pose Estimation (2016) (19)
- Action Modifiers: Learning From Adverbs in Instructional Videos (2019) (18)
- Learning from Narrated Instruction Videos (2015) (17)
- Detecting rare visual relations using analogies (2018) (17)
- A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching (2003) (17)
- Learning Obstacle Representations for Neural Motion Planning (2020) (17)
- Towards Unconstrained Joint Hand-Object Reconstruction From RGB Videos (2021) (17)
- Towards reliable object detection in noisy images (2017) (16)
- Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies (2015) (16)
- Velocity-adapted spatio-temporal receptive fields for direct recognition of activities (2002) (15)
- Road Extraction Based on Snakes and Sophisticated Line Extraction (1997) (15)
- Learning Actionness via Long-Range Temporal Order Verification (2020) (15)
- Galilean-diagonalized spatio-temporal interest operators (2004) (14)
- Road Extraction Based on Line Extraction and Snakes (1997) (14)
- Learning Object Manipulation Skills via Approximate State Estimation from Real Videos (2020) (14)
- Robust change detection in dense urban areas via SVM classifier (2009) (13)
- Instruction-driven history-aware policies for robotic manipulations (2022) (13)
- Galilean-corrected spatio-temporal interest operators (2004) (12)
- Learning to Answer Visual Questions from Web Videos (2022) (10)
- RareAct: A video dataset of unusual interactions (2020) (10)
- INRIA-WILLOW at TRECVID 2010 : Surveillance Event Detection (2010) (9)
- Editorial: Deep Learning for Computer Vision (2017) (9)
- On Pairwise Cost for Multi-Object Network Flow Tracking (2014) (8)
- Analysis of Crowded Scenes in Video (2013) (8)
- EPIC-KITCHENS-2019 Challenges Report (2019) (8)
- Differentiable Rendering with Perturbed Optimizers (2021) (8)
- Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (2022) (7)
- Modeling Image Context Using Object Centered Grid (2009) (7)
- Modeling and visual recognition of human actions and interactions (2013) (6)
- Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems (2022) (6)
- Learning to Localize and Align Fine-Grained Actions to Sparse Instructions (2018) (6)
- Unsupervised object discovery and localization in images and videos (2015) (6)
- Margin based knowledge distillation for mobile face recognition (2020) (5)
- The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020) (2020) (5)
- Painting recognition from wearable cameras (2014) (5)
- Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos (2022) (5)
- Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning (2023) (5)
- Occlusion resistant learning of intuitive physics from videos (2019) (5)
- Combining learned skills and reinforcement learning for robotic manipulations (2019) (4)
- Image Compression with Product Quantized Masked Image Modeling (2022) (4)
- AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction (2022) (4)
- A Multi-scale Feature Likelihood Map for Direct Evaluation of Object Hypotheses (2001) (3)
- View-independent Video Synchronization from Temporal Self-similarities (2009) (3)
- The Analysis of High Density Crowds in Videos (2017) (3)
- Modeling Spatio-Temporal Human Track Structure for Action Localization (2018) (3)
- Learning visual policies for building 3D shape categories (2020) (3)
- Weakly-supervised segmentation of referring expressions (2022) (2)
- AUTOMATED PIPELINE EXTRACTION FROM INTERFEROMETRIC SAR DATA OF THE ERS TANDEM MISSION (1998) (2)
- Long term spatio-temporal modeling for action detection (2021) (2)
- Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (2022) (2)
- Augmenting differentiable physics with randomized smoothing (2022) (2)
- Estimating 3D Motion and Forces of Human–Object Interactions from Internet Videos (2021) (2)
- People Watching: Human Actions as a Cue for Single View Geometry (2014) (1)
- Recognizing Human Action in the Wild (2010) (1)
- Tube-CNN: Modeling temporal evolution of appearance for object detection in video (2018) (1)
- Computer Vision – ECCV 2018 Workshops (2018) (1)
- Automatic Activity Recognition for Video Surveillance (2020) (0)
- Bilateral Contracts and Grants with Industry - Google: Learning to annotate videos from movie scripts (Inria) (2014) (0)
- gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction (2023) (0)
- Partnerships and Cooperations - European Initiatives (2011) (0)
- Velocity-adapted spatio-temporal image descriptors for direct recognition of activities (2015) (0)
- Reconstructing and grounding narrated instructional videos in 3D (2021) (0)
- New Results - Dynamic event modeling, learning and recognition (2006) (0)
- Vision Spatio-Temporelle et Apprentissage (2007) (0)
- Multi-Task Learning of Object State Changes from Uncurated Videos (2022) (0)
- Estimating 3D Motion and Forces of Human–Object Interactions from Internet Videos (2022) (0)
- Partnerships and Cooperations - International Initiatives (2014) (0)
- Weakly Supervised Learning from Images and Video (2016) (0)
- Bilateral Contracts and Grants with Industry - MSR-Inria joint lab: Image and video mining for science and humanities (Inria) (2014) (0)
- New Results - Category-level object and scene recognition (2014) (0)
- Contact Models in Robotics: a Comparative Analysis (2023) (0)
- Guest Editorial: Video Recognition (2016) (0)
- New Results - Motion estimation and matching (2006) (0)
- Context-aware Deep Network Models for Weakly Supervised Object Localization Supplementary material (2016) (0)
- Contracts and Grants with Industry - DGA: CrowdChecker (ENS and E-vitech) (2011) (0)
- New Results - Recognition in video (2007) (0)
- Contracts and Grants with Industry - CrowdChecker (ENS) (2010) (0)
- Learning Interactions and Relationships between Movie Characters SUPPLEMENTARY MATERIAL (2020) (0)
- New Results - Video interpretation (2008) (0)
- Recognizing person interactions (2013) (0)
- Bilateral Contracts and Grants with Industry - Google: Structured learning from video and natural language (Inria) (2015) (0)
- Ability of human operators to distinguish extended objects on the basis of their images (1999) (0)
- Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control (2022) (0)
- TubeDETR: Spatio-Temporal Video Grounding with Transformers Supplementary Material (2022) (0)
- New Results - Human activity capture and classification (2011) (0)
- Towards reliable object detection in noisy images (2017) (0)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation (2022) (0)
- Bilateral Contracts and Grants with Industry - Facebook AI Research Paris: Weakly-supervised interpretation of image and video data (Inria) (2015) (0)
- feature extraction, encoding and classification for action recognition (2014) (0)
What Schools Are Affiliated With I. V. Laptev?
I. V. Laptev is affiliated with the following schools: