Cordelia Schmid
#56,820
Most Influential Person Now
Cordelia Schmid's AcademicInfluence.com Rankings
Cordelia Schmid Computer Science Degrees
Computer Science: World Rank #1871, Historical Rank #1944
Algorithms: World Rank #28, Historical Rank #28
Database: World Rank #124, Historical Rank #127
Why Is Cordelia Schmid Influential?
Cordelia Schmid's Published Works
- Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories (2006) (8641)
- A performance evaluation of local descriptors (2003) (4625)
- Scale & Affine Invariant Interest Point Detectors (2004) (4254)
- A performance evaluation of local descriptors (2005) (4073)
- Learning realistic human actions from movies (2008) (3742)
- A Comparison of Affine Region Detectors (2005) (3456)
- Action Recognition with Improved Trajectories (2013) (3094)
- Aggregating local descriptors into a compact image representation (2010) (2500)
- Action recognition by dense trajectories (2011) (2322)
- Product Quantization for Nearest Neighbor Search (2011) (2280)
- Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study (2006) (2193)
- A Spatio-Temporal Descriptor Based on 3D-Gradients (2008) (1966)
- Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search (2008) (1926)
- Human Detection Using Oriented Histograms of Flow and Appearance (2006) (1877)
- Local Grayvalue Invariants for Image Retrieval (1997) (1822)
- Evaluation of Interest Point Detectors (2000) (1764)
- An Affine Invariant Interest Point Detector (2002) (1712)
- Dense Trajectories and Motion Boundary Descriptors for Action Recognition (2013) (1654)
- Aggregating Local Image Descriptors into Compact Codes (2012) (1509)
- Evaluation of Local Spatio-temporal Features for Action Recognition (2009) (1492)
- Indexing based on scale invariant interest points (2001) (1410)
- Actions in context (2009) (1338)
- Description of interest regions with local binary patterns (2009) (1258)
- A sparse texture representation using local affine regions (2005) (1204)
- DeepFlow: Large Displacement Optical Flow with Deep Matching (2013) (1003)
- Is that you? Metric learning approaches for face identification (2009) (886)
- VideoBERT: A Joint Model for Video and Language Representation Learning (2019) (835)
- Long-Term Temporal Convolutions for Action Recognition (2016) (823)
- Human Detection Based on a Probabilistic Assembly of Robust Part Detectors (2004) (785)
- Improving Bag-of-Features for Large Scale Image Search (2010) (775)
- Learning from Synthetic Humans (2017) (746)
- EpicFlow: Edge-preserving interpolation of correspondences for optical flow (2015) (738)
- ViViT: A Video Vision Transformer (2021) (735)
- What makes for good views for contrastive learning (2020) (733)
- TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation (2009) (730)
- Towards Understanding Action Recognition (2013) (710)
- Learning Color Names for Real-World Applications (2009) (699)
- AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions (2017) (689)
- End-to-End Incremental Learning (2018) (669)
- Groups of Adjacent Contour Segments for Object Detection (2008) (631)
- Label-Embedding for Attribute-Based Classification (2013) (589)
- Label-Embedding for Image Classification (2015) (578)
- Coloring Local Feature Extraction (2006) (553)
- P-CNN: Pose-Based CNN Features for Action Recognition (2015) (535)
- Segmenter: Transformer for Semantic Segmentation (2021) (479)
- Learning object class detectors from weakly annotated video (2012) (468)
- On the burstiness of visual elements (2009) (456)
- The 2005 PASCAL Visual Object Classes Challenge (2005) (443)
- Constructing models for content-based image retrieval (2001) (431)
- Action and Event Recognition with Fisher Vectors on a Compact Feature Set (2013) (428)
- Evaluation of GIST descriptors for web-scale image search (2009) (427)
- 3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints (2006) (420)
- Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning (2015) (409)
- Multimodal semi-supervised learning for image classification (2010) (394)
- SfM-Net: Learning of Structure and Motion from Video (2017) (387)
- High-dimensional data clustering (2006) (386)
- Category-Specific Video Summarization (2014) (377)
- Description of Interest Regions with Center-Symmetric Local Binary Patterns (2006) (367)
- Selection of scale-invariant parts for object class recognition (2003) (359)
- Convolutional Kernel Networks (2014) (356)
- Comparing and evaluating interest points (1998) (355)
- From Images to Shape Models for Object Detection (2010) (349)
- Combining efficient object localization and image classification (2009) (349)
- Multi-region Two-Stream R-CNN for Action Detection (2016) (342)
- BodyNet: Volumetric Inference of 3D Human Body Shapes (2018) (341)
- Incremental Learning of Object Detectors without Catastrophic Forgetting (2017) (338)
- Automatic line matching across views (1997) (332)
- Multi-modal Transformer for Video Retrieval (2020) (319)
- Semantic Hierarchies for Visual Object Recognition (2007) (318)
- Learning to Track for Spatio-Temporal Action Localization (2015) (316)
- VectorNet: Encoding HD Maps and Agent Dynamics From Vectorized Representation (2020) (314)
- Proceedings. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005) (306)
- A Robust and Efficient Video Representation for Action Recognition (2015) (304)
- A contextual dissimilarity measure for accurate and efficient image search (2007) (303)
- Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study (2006) (301)
- Semi-Local Affine Parts for Object Recognition (2004) (300)
- Learning to Parse Pictures of People (2002) (296)
- Learning Video Object Segmentation with Visual Memory (2017) (277)
- Learning Joint Reconstruction of Hands and Manipulated Objects (2019) (276)
- LCR-Net: Localization-Classification-Regression for Human Pose (2017) (267)
- Discriminative spatial saliency for image classification (2012) (264)
- Action Tubelet Detector for Spatio-Temporal Action Localization (2017) (264)
- DeepMatching: Hierarchical Deformable Dense Matching (2015) (262)
- How good is my GAN? (2018) (261)
- Toward Category-Level Object Recognition (2006) (254)
- Dataset Issues in Object Recognition (2006) (251)
- Scale-invariant shape features for recognition of object categories (2004) (250)
- Automatic Line Matching and 3D Reconstruction of Buildings from Multiple Views (1999) (250)
- Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals (2015) (250)
- Shape recognition with edge-based features (2003) (246)
- MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild (2016) (244)
- Viewpoint-independent object class detection using 3D Feature Maps (2008) (237)
- Multi-fold MIL Training for Weakly Supervised Object Localization (2014) (235)
- Multi-view object class detection with a 3D geometric model (2010) (234)
- Learning Motion Patterns in Videos (2016) (231)
- Weakly Supervised Learning of Interactions between Humans and Objects (2012) (230)
- LCR-Net++: Multi-Person 2D and 3D Pose Detection in Natural Images (2018) (226)
- Spatio-temporal Object Detection Proposals (2014) (224)
- PoTion: Pose MoTion Representation for Action Recognition (2018) (220)
- Weakly Supervised Action Labeling in Videos under Ordering Constraints (2014) (219)
- Learning Video Representations using Contrastive Bidirectional Transformer (2019) (214)
- Vector Quantizing Feature Space with a Regular Lattice (2007) (213)
- A maximum entropy framework for part-based texture and object recognition (2005) (213)
- Accurate Image Search Using the Contextual Dissimilarity Measure (2010) (207)
- Combining attributes and Fisher vectors for efficient image retrieval (2011) (206)
- A sparse texture representation using affine-invariant regions (2003) (201)
- Learning Color Names from Real-World Images (2007) (197)
- Towards good practice in large-scale learning for image classification (2012) (197)
- TNT: Target-driveN Trajectory Prediction (2020) (195)
- Local Features and Kernels for Classification of Texture and Object Categories: An In-Depth Study (2005) (192)
- Packing bag-of-features (2009) (192)
- Matching images with different resolutions (2000) (191)
- The Geometry and Matching of Lines and Curves Over Multiple Views (2000) (186)
- Constructing Category Hierarchies for Visual Recognition (2008) (185)
- Multiple Instance Metric Learning from Automatically Labeled Bags of Faces (2010) (185)
- An Image-Based Approach to Video Copy Detection With Spatio-Temporal Post-Filtering (2010) (185)
- White-box vs Black-box: Bayes Optimal Strategies for Membership Inference (2019) (181)
- Unsupervised metric learning for face identification in TV video (2011) (181)
- MARS: Motion-Augmented RGB Stream for Action Recognition (2019) (179)
- Spatial Weighting for Bag-of-Features (2006) (179)
- Human Focused Action Localization in Video (2010) (178)
- Areas of Attention for Image Captioning (2016) (173)
- Actom sequence models for efficient action detection (2011) (173)
- Temporal Localization of Actions with Actoms (2013) (171)
- Learning Object Representations for Visual Object Class Recognition (2007) (170)
- Attention Bottlenecks for Multimodal Fusion (2021) (169)
- Self-Supervised Learning With Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera (2019) (168)
- Accurate Object Detection with Deformable Shape Models Learnt from Images (2007) (167)
- Local Convolutional Features with Unsupervised Training for Image Retrieval (2015) (167)
- 3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints (2003) (166)
- Weakly-Supervised Learning of Visual Relations (2017) (166)
- Object Class Recognition Using Discriminative Local Features (2005) (165)
- Actor-Centric Relation Network (2018) (163)
- Face Detection and Tracking in a Video by Propagating Detection Probabilities (2003) (163)
- Modeling Visual Context is Key to Augmenting Object Detection Datasets (2018) (163)
- Combining greyvalue invariants with local constraints for object recognition (1996) (159)
- Diversity With Cooperation: Ensemble Methods for Few-Shot Classification (2019) (158)
- Object Recognition by Integrating Multiple Image Segmentations (2008) (154)
- Affine-invariant local descriptors and neighborhood statistics for texture recognition (2003) (148)
- BlitzNet: A Real-Time Deep Network for Scene Understanding (2017) (147)
- Finding Actors and Actions in Movies (2013) (146)
- Using High-Level Visual Information for Color Constancy (2007) (141)
- Conference on Computer Vision and Pattern Recognition (2005) (137)
- Accurate Object Localization with Shape Masks (2007) (131)
- Segmentation Driven Object Detection with Fisher Vectors (2013) (130)
- Contrastive Bidirectional Transformer for Temporal Representation Learning (2019) (129)
- High-Dimensional Discriminant Analysis (2005) (128)
- Explicit Modeling of Human-Object Interactions in Realistic Videos (2013) (128)
- Event Retrieval in Large Video Collections with Circulant Temporal Encoding (2013) (126)
- Weakly-Supervised Alignment of Video with Text (2015) (125)
- Proposal Flow (2015) (125)
- Unsupervised Object Discovery and Tracking in Video Collections (2015) (123)
- The LEAR submission at Thumos 2014 (2014) (121)
- SCNet: Learning Semantic Correspondence (2017) (120)
- Mixing Body-Part Sequences for Human Pose Estimation (2014) (115)
- Activity representation with motion hierarchies (2014) (111)
- Automatic face naming with caption-based supervision (2008) (106)
- Expanded Parts Model for Human Attribute and Action Recognition in Still Images (2013) (105)
- Image annotation with tagprop on the MIRFLICKR set (2010) (103)
- Proposal Flow: Semantic Correspondences from Object Proposals (2017) (103)
- Online Object Tracking with Proposal Selection (2015) (103)
- Moulding Humans: Non-Parametric 3D Human Shape Estimation From Single Images (2019) (102)
- Just Ask: Learning to Answer Questions from Millions of Narrated Videos (2020) (101)
- Hamming Embedding and Weak Geometry Consistency for Large Scale Image Search - extended version (2008) (101)
- Transformation Pursuit for Image Classification (2014) (100)
- Detecting Unseen Visual Relations Using Analogies (2018) (100)
- Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction (2020) (97)
- Flexible Object Models for Category-Level 3D Object Recognition (2007) (96)
- Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects (2004) (95)
- Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection (2019) (93)
- Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos (2018) (92)
- Actor and Observer: Joint Modeling of First and Third-Person Videos (2018) (91)
- Occlusion and Motion Reasoning for Long-Term Tracking (2014) (86)
- Multiview Transformers for Video Recognition (2022) (83)
- Face Recognition from Caption-Based Supervision (2011) (83)
- TAO: A Large-Scale Benchmark for Tracking Any Object (2020) (83)
- Efficient Action Localization with Approximately Normalized Fisher Vectors (2014) (82)
- Correlation-based burstiness for logo retrieval (2012) (82)
- A Structured Model for Action Detection (2018) (82)
- Applying Color Names to Image Description (2007) (82)
- INRIA-LEAR's Video Copy Detection System (2008) (76)
- Recognizing activities with cluster-trees of tracklets (2012) (76)
- Learning to Segment Moving Objects (2017) (76)
- Episodic Transformer for Vision-and-Language Navigation (2021) (75)
- Memory-Efficient Incremental Learning Through Feature Adaptation (2020) (73)
- Image matching with scale adjustment (2004) (72)
- Selecting Relevant Features from a Multi-domain Representation for Few-Shot Classification (2020) (71)
- A structured probabilistic model for recognition (1999) (69)
- Estimating Human Pose with Flowing Puppets (2013) (66)
- Convolutional Patch Representations for Image Retrieval: An Unsupervised Approach (2016) (66)
- Joint Learning of Object and Action Detectors (2017) (65)
- Learning to detect Motion Boundaries (2015) (63)
- Learning to Recognize Objects with Little Supervision (2008) (63)
- Human Action Localization with Sparse Spatial Supervision (2017) (63)
- Image categorization using Fisher kernels of non-iid image models (2012) (63)
- Query adaptative locality sensitive hashing (2008) (62)
- Weakly Supervised Learning of Visual Models and Its Application to Content-Based Retrieval (2004) (62)
- Relational Action Forecasting (2019) (62)
- The Geometry and Matching of Curves in Multiple Views (1998) (62)
- Analysing Domain Shift Factors between Videos and Images for Object Detection (2015) (61)
- Spreading vectors for similarity search (2018) (60)
- History Aware Multimodal Transformer for Vision-and-Language Navigation (2021) (59)
- Spatial pyramid matching (2009) (59)
- On the Importance of Visual Context for Data Augmentation in Scene Understanding (2018) (56)
- Comparison of affine-invariant local detectors and descriptors (2004) (56)
- Face detection in a video sequence - a temporal approach (2001) (54)
- Matching by local invariants (1995) (53)
- Good Practice in Large-Scale Learning for Image Classification (2019) (52)
- Synthetic Humans for Action Recognition from Unseen Viewpoints (2019) (52)
- Learning shape prior models for object matching (2009) (51)
- Compact Video Description for Copy Detection with Precise Temporal Alignment (2010) (50)
- Temporal localization of actions with actoms (2013) (50)
- Will person detection help bag-of-features action recognition? (2010) (49)
- A time series kernel for action recognition (2011) (48)
- Weakly-Supervised Semantic Segmentation Using Motion Cues (2016) (48)
- Towards Weakly-Supervised Action Localization (2016) (44)
- Leveraging the Path Signature for Skeleton-based Human Action Recognition (2017) (44)
- Exploiting descriptor distances for precise image search (2011) (44)
- Combining Regions and Patches for Object Class Localization (2006) (44)
- Toward Category-Level Object Recognition (Lecture Notes in Computer Science) (2007) (43)
- Airbert: In-domain Pretraining for Vision-and-Language Navigation (2021) (42)
- Goal-Conditioned Reinforcement Learning with Imagined Subgoals (2021) (42)
- Speech2Action: Cross-Modal Supervision for Action Recognition (2020) (41)
- Accurate Object Recognition with Shape Masks (2012) (39)
- End-to-end Generative Pretraining for Multimodal Video Captioning (2022) (39)
- Stable Hyper-pooling and Query Expansion for Event Detection (2013) (37)
- AXES at TRECVID 2012: KIS, INS, and MED (2012) (36)
- Blur Robust and Color Constant Image Description (2006) (35)
- Expanded Parts Model for Semantic Description of Humans in Still Images (2015) (35)
- Radioactive data: tracing through training (2020) (35)
- The AXES submissions at TRECVID 2013 (2013) (35)
- Look Before you Speak: Visually Contextualized Utterances (2020) (34)
- Optimized Generic Feature Learning for Few-shot Classification across Domains (2020) (33)
- A flexible model for training action localization with varying levels of supervision (2018) (33)
- Maximally Stable Local Description for Scale Selection (2006) (32)
- Learning to Augment Synthetic Images for Sim2Real Policy Transfer (2019) (31)
- A Discriminative Framework for Texture and Object Recognition Using Local Image Features (2006) (31)
- Differentiable Simulation for Physical System Identification (2021) (31)
- Computer Vision – ECCV 2012 (2012) (30)
- Mining Visual Actions from Movies (2009) (29)
- Circulant Temporal Encoding for Video Retrieval and Temporal Alignment (2015) (29)
- Pattern recognition with local invariant features (2005) (28)
- Learning Video Representations from Textual Web Supervision (2020) (28)
- Developing the Path Signature Methodology and its Application to Landmark-based Human Action Recognition (2017) (27)
- Approximate Fisher Kernels of Non-iid Image Models for Image Categorization (2015) (25)
- Image-Based Synthesis for Deep 3D Human Pose Estimation (2018) (25)
- Large-Scale Unsupervised Object Discovery (2021) (25)
- Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (2022) (24)
- Learning to combine primitive skills: A step towards versatile robotic manipulation (2019) (23)
- Face detection based on generic local descriptors and spatial constraints (2000) (23)
- Unsupervised Learning of Artistic Styles with Archetypal Style Analysis (2018) (22)
- Zero-Shot Video Question Answering via Frozen Bidirectional Language Models (2022) (22)
- Searching with quantization: approximate nearest neighbor search using short codes and distance estimators (2009) (22)
- Region-Based Image Classification with a Latent SVM Model (2011) (22)
- Unified Graph Structured Models for Video Understanding (2021) (22)
- Towards true 3D object recognition (2004) (22)
- Selecting Relevant Features from a Universal Representation for Few-shot Classification (2020) (21)
- Learning Audio-Video Modalities from Image Captions (2022) (21)
- CCVS: Context-aware Controllable Video Synthesis (2021) (20)
- A Semi-supervised Learning Approach to Object Recognition with Spatial Integration of Local Features and Segmentation Cues (2006) (20)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering (2020) (20)
- TubeDETR: Spatio-Temporal Video Grounding with Transformers (2022) (20)
- Bayesian Decision Versus Voting for Image Retrieval (1997) (20)
- HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps (2021) (19)
- INRIA @TRECVID 2011: Copy Detection & Multimedia Event Detection (2011) (19)
- Adaptive Density Estimation for Generative Models (2019) (19)
- Auto-calibration by direct observation of objects (1993) (19)
- INRIA LEAR-TEXMEX: Video Copy Detection Task (2010) (19)
- Image retrieval using local characterization (1996) (18)
- Encoding Feature Maps of CNNs for Action Recognition (2015) (18)
- Learning Obstacle Representations for Neural Motion Planning (2020) (17)
- International Conference on Computer Vision (ICCV 2017) (2017) (17)
- Detecting rare visual relations using analogies (2018) (17)
- Towards Unconstrained Joint Hand-Object Reconstruction From RGB Videos (2021) (17)
- Maintaining stereo calibration by tracking image points (1993) (16)
- The Pascal Visual Object Classes Challenge 2008 submission (2008) (15)
- Class-Balanced Distillation for Long-Tailed Visual Recognition (2021) (15)
- Masking Modalities for Cross-modal Video Retrieval (2021) (15)
- Improving robustness against common corruptions with frequency biased models (2021) (15)
- Software - Histogram of oriented gradient object detection (2006) (14)
- Classification aided two stage localization (2008) (14)
- Supplementary Material: AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection (2019) (14)
- Déjà Vu: an empirical evaluation of the memorization properties of ConvNets (2018) (14)
- On Pencils of Tangent Planes and the Recognition of Smooth 3D Shapes from Silhouettes (2002) (14)
- Self-Supervised Learning of Structure and Motion from Video (2017) (13)
- Learning with Neighbor Consistency for Noisy Labels (2022) (13)
- Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos (2020) (13)
- Instruction-driven history-aware policies for robotic manipulations (2022) (13)
- Graph convolutional networks for learning with few clean and many noisy labels (2019) (12)
- Learning Semantic Segmentation with Weakly-Annotated Videos (2016) (12)
- Object Localization by Subspace Clustering of Local Descriptors (2006) (12)
- Deep Convolutional Matching (2015) (11)
- Recent Advances in Large Scale Image Search (2008) (11)
- Local Metrics for Multi-Object Tracking (2021) (11)
- The AXES PRO video search system (2013) (10)
- Learning to Answer Visual Questions from Web Videos (2022) (10)
- INRIA-LEAR's participation to ImageCLEF 2009 (2009) (10)
- A maximum entropy framework for combining parts and relations for texture and object recognition (2005) (10)
- 3D Object Modeling and Recognition from Photographs and Image Sequences (2006) (10)
- Markov Random Fields for Textures Recognition with Local Invariant Regions and their Geometric Relationships (2005) (9)
- Learning From Web Videos for Event Classification (2018) (9)
- Dimension Reduction and Classification Methods for Object Recognition in Vision (2004) (9)
- Composable Augmentation Encoding for Video Representation Learning (2021) (9)
- Class-Specific Subspace Discriminant Analysis for High-Dimensional Data (2005) (9)
- Markov Random Fields for Recognizing textures modeled by Feature Vectors (2005) (8)
- Residual Reinforcement Learning from Demonstrations (2021) (8)
- A Study on Action Detection in the Wild (2019) (8)
- Learning to Track Any Object (2019) (8)
- Differentiable Rendering with Perturbed Optimizers (2021) (8)
- Learning Temporal Dynamics from Cycles in Narrated Video (2021) (7)
- Obstacle detection analysis (1994) (7)
- Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (2022) (7)
- Dynamic calibration of an active stereo head (1993) (7)
- Computer Vision – ECCV 2012 (2012) (7)
- The INRIA-LIM-VocR and AXES submissions to TrecVid 2014 Multimedia Event Detection (2014) (7)
- An Image Oriented CAD Approach (1996) (7)
- The AXES research video search system (2014) (7)
- Bayesian learning for weakly supervised object classification (2004) (6)
- Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems (2022) (6)
- Learning local affine-invariant part models for object class recognition (2004) (6)
- 3D Photography from Photographs and Video Clips (2002) (6)
- Unsupervised object discovery and localization in images and videos (2015) (6)
- Color Names (2017) (6)
- Integrating Geometric and Photometric Information for Image Retrieval (1999) (6)
- Software - Histogram of Oriented Gradient Object Detection Toolkit (2005) (5)
- Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning (2023) (5)
- What makes for good views for contrastive representation learning (2020) (5)
- Attribute-Based Classification with Label-Embedding (2013) (5)
- The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020) (2020) (5)
- Adversarial training of partially invertible variational autoencoders (2019) (5)
- Coverage and Quality Driven Training of Generative Image Models (2018) (5)
- Modulated Policy Hierarchies (2018) (5)
- Focused Attention for Action Recognition (2019) (5)
- M&M Mix: A Multimodal Multiview Transformer Ensemble (2022) (5)
- TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency (2022) (4)
- A neural network catalyzer for multi-dimensional similarity search (2018) (4)
- Combining learned skills and reinforcement learning for robotic manipulations (2019) (4)
- Representing, learning, and recognizing non-rigid textures and texture categories (2003) (4)
- A Memory Transformer Network for Incremental Learning (2022) (4)
- AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction (2022) (4)
- Detecting Parts for Action Localization (2017) (4)
- Efficient Matching with Invariant Local Descriptors (1998) (4)
- Learning visual policies for building 3D shape categories (2020) (3)
- Computer Aided (dis)Assembly Using Visual Cues (1996) (3)
- Beyond Transfer Learning: Co-finetuning for Action Localisation (2022) (3)
- Building and using hypervideos (1998) (3)
- TRECVID’2011: Copy Detection & Multimedia Event Detection (2011) (3)
- Beat-Event Detection in Action Movie Franchises (2015) (3)
- Feature selection for object class detection (2003) (3)
- Modeling Spatio-Temporal Human Track Structure for Action Localization (2018) (3)
- Consistency Guided Scene Flow Estimation (2020) (3)
- A Robust and Efficient Video Representation for Action Recognition (2015) (2)
- Augmenting differentiable physics with randomized smoothing (2022) (2)
- Image retrieval in the presence of important scale changes and with automatically constructed models (2001) (2)
- AVATAR: Unconstrained Audiovisual Speech Recognition (2022) (2)
- Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (2022) (2)
- Computer Vision – ECCV 2012 (2012) (2)
- Weakly-supervised segmentation of referring expressions (2022) (2)
- Recognition with local photometric invariants (2004) (2)
- Beyond the Camera: Neural Networks in World Coordinates (2020) (2)
- New Software and Platforms - Convolutional Kernel Networks (2014) (2)
- Proceedings of the 12th European conference on Computer Vision - Volume Part I (2012) (2)
- REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory (2022) (2)
- Classification of high dimensional data: High Dimensional Discriminant Analysis (2005) (2)
- Tubelet Detector for Spatio-Temporal Action Localization (2017) (1)
- Audiovisual Masked Autoencoders (2022) (1)
- PhotoMole: retrieval from a database of natural images (2005) (1)
- Combining geometric and photometric information (1998) (1)
- Action Detection with Actom Sequence Models (2012) (1)
- The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields (2022) (1)
- CVPR 2020 Video Pentathlon Challenge: Multi-modal Transformer for Video Retrieval (2020) (1)
- AXES at TRECVid 2013 (2013) (1)
- Convolutional Patch Representations for Image Retrieval: An Unsupervised Approach (2016) (0)
- Contracts and Grants with Industry - MSR-INRIA joint lab: scientific image and video mining (2010) (0)
- New Results - Image description and correspondence (2005) (0)
- Areas of Attention for Image Captioning — Supplementary Material — (2017) (0)
- Supplementary Material for Speech2Action: Cross-modal Supervision for Action Recognition (2020) (0)
- Contracts and Grants with Industry - MBDA Aerospatiale (2008) (0)
- Contracts and Grants with Industry - MBDA Aerospatiale (2010) (0)
- Do you see what I see?: Large-scale Learning from Multimodal Videos (2021) (0)
- Action Localization with Approximately (2014) (0)
- Contracts and Grants with Industry - MBDA Aerospatiale (2006) (0)
- Other Grants and Activities - European Projects (2004) (0)
- End-to-End Spatio-Temporal Action Localisation with Video Transformers (2023) (0)
- New Results - Visual recognition in images (2013) (0)
- Cosegmentation of object categories (2012) (0)
- Verbs in Action: Improving verb understanding in video-language models (2023) (0)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation (2022) (0)
- Proposal Flow Supplement (2016) (0)
- Other Grants and Activities - Bilateral relationships (2006) (0)
- New Software and Platforms - EpicFlow (2014) (0)
- Inferring the Structure of Action Movies (2017) (0)
- Gait recognition applying Incremental learning (2019) (0)
- Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval (2023) (0)
- Learning and Recognition in Vision (2006) (0)
- Software - Software for computing local invariant features (2004) (0)
- Proceedings, Part II, of the 12th European Conference on Computer Vision --- ECCV 2012 - Volume 7573 (2012) (0)
- Supplementary Material: Unified Graph Structured Models for Video Understanding (2021) (0)
- Learning to Segment Moving Objects (2018) (0)
- New Results - Human detection and activity analysis (2005) (0)
- Other Grants and Activities - International Projects (2008) (0)
- New Results - Image description (2004) (0)
- Activity representation with motion hierarchies (2013) (0)
- Weakly supervised learning from images and videos (2015) (0)
- Software - Large-scale image indexing (2008) (0)
- Jet-Based Local Image Descriptors (2012) (0)
- 2005 Conference on Computer Vision and Pattern Recognition CVPR 2005 (2008) (0)
- Automatic construction of visual models (2001) (0)
- Large Scale Image Search (2008) (0)
- AVATAR submission to the Ego4D AV Transcription Challenge (2022) (0)
- WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction (2022) (0)
- Dense Trajectories and Motion Boundary Descriptors for Action Recognition (2013) (0)
- Organizing Committee and Area Chairs (2018) (0)
- Software - Image search demonstrator (2008) (0)
- Action Recognition in Videos (2008) (0)
- Object Segmentation with Visual Memory (0)
- Joint learning of object and action detectors (2018) (0)
- New Results - Human action recognition (2010) (0)
- New Results - Video interpretation (2008) (0)
- Software and Platforms - Object category localization (2013) (0)
- Color Names Portions Reprinted, with Permission, from 'learning Color Names for Real-world (2009) (0)
- Location-Aware Self-Supervised Transformers for Semantic Segmentation (2022) (0)
- Bilateral Contracts and Grants with Industry - Xerox Research Center Europe (2013) (0)
- Software and Platforms - Fisher vector image representation (2013) (0)
- Segmentation Propagation in ImageNet (2018) (0)
- Project/Team LEAR : Learning and Recognition in Vision (2006) (0)
- Circulant Temporal Encoding for Video Retrieval and Temporal Alignment (2015) (0)
- New Results - Recognition in video (2007) (0)
- Location-Aware Self-Supervised Transformers (2022) (0)
- Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification (2023) (0)
- Software - Extracting and describing interest points (2006) (0)
- Automatic Recognition of Human Activities in Realistic Videos (2013) (0)
- TubeDETR: Spatio-Temporal Video Grounding with Transformers Supplementary Material (2022) (0)
- New Results - Visual recognition in videos (2016) (0)
- Software and Platforms - Video descriptors (2013) (0)
- Image-Based Synthesis for Deep 3D Human Pose Estimation (2018) (0)
- New Results - Supervised methods for visual object recognition and localization (2009) (0)
- Contracts and Grants with Industry - Start-up Milpix (2008) (0)
- New Results - Visual object recognition (2007) (0)
- Toward True 3D Object Recognition (2003) (0)
- Accuracy vs Time on MiniKinetics (RGB, TVL1 Flow, RGB + TVL1 Flow, MARS, MARS + RGB, MERS, MERS + RGB) (2019) (0)
- New Results - Recognition (2004) (0)
- Recognizing activities with cluster-trees of tracklets (2015) (0)
- Bayesian Decision versus Voting for Image Retrieval (Extended Version Accepted to CAIP 1997) (1997) (0)
- Bilateral Contracts and Grants with Industry - MBDA Aerospatiale (2013) (0)
- Can an Image Classifier Suffice for Action Recognition? (2022) (0)
- Segmentation Propagation in ImageNet (2012) (0)
- New Results - Large-scale image search (2008) (0)
- The VOT 2015 and VOT-TIR 2015 Challenges Submission Report (2017) (0)
- Software and Platforms - DeepFlow (2013) (0)
- New Results - Semi-supervised learning and structuring of visual models (2008) (0)
- Bilateral Contracts and Grants with Industry - MSR-Inria joint lab: Image and video mining for science and humanities (Inria) (2014) (0)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data (2023) (0)
- Convolutional Kernel Networks (2021) (0)
- New Results - Category-level object and scene recognition (2014) (0)
- Large-Scale Unsupervised Object Discovery – Supplementary Material – (2021) (0)
- Software - Visual Localization Demonstrator (2005) (0)
- Assembly Planning from Observations under Physical Constraints (2022) (0)
- Accurate Object Recognition with Shape Masks (2011) (0)
- Recognizing person interactions (2013) (0)
- Contracts and Grants with Industry - Technosens (2010) (0)
- Shape Recognition in Images (2010) (0)
- Automatic Activity Recognition for Video Surveillance (2020) (0)
- New Results - Learning and structuring of visual models (2010) (0)
- New Results - Supervised visual object recognition (2008) (0)
- Supplementary Material ViViT: A Video Vision Transformer (2021) (0)
- New Results - Human activity capture and classification (2011) (0)
- AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR (2023) (0)
- DeepMatching: Hierarchical Deformable Dense Matching (2016) (0)
- Software - Groups of adjacent contour segments (2006) (0)
- Software - Robust image correspondence and rapid recovery of specific objects and scene elements in image databases (2005) (0)
- Selection itérative de transformations pour la classification d'images (2014) (0)
- Software and Platforms - Large-scale image classification (2013) (0)
- INRIA-LEAR's Participation in ImageCLEF 2009 (2009) (0)
- Automatic Understanding of the Visual World (2019) (0)
- Attention Bottlenecks for Multimodal Fusion - Supplementary Materials (2021) (0)
- Other Grants and Activities - European Projects and Grants (2006) (0)
- Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control (2022) (0)
- Face Recognition from Caption-Based Supervision (2011) (0)
- New Results - Image descriptors and correspondence (2006) (0)
- Software - Large-scale image indexing: BigImBaz (2007) (0)
- Contact Models in Robotics: a Comparative Analysis (2023) (0)
- Software - Human detection software (2004) (0)
- Other Grants and Activities - National Projects (2008) (0)
- In Memoriam Roger Mohr (2017) (0)
- Editorial (2007) (0)
- gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction (2023) (0)
- New Results - Statistical modeling and machine learning for image analysis (2007) (0)
- Software - Object recognition demonstrator (2004) (0)
- Action Recognition (2021) (0)
- Learning Reward Functions for Robotic Manipulation by Observing Humans (2022) (0)
What Schools Are Affiliated With Cordelia Schmid?
Cordelia Schmid is affiliated with the following schools: