#2,466

Most Influential Person

Canadian computer scientist

According to Wikipedia, Yoshua Bengio is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de Montréal and scientific director of the Montreal Institute for Learning Algorithms.

- Deep Learning (2015) (61305)
- Gradient-based learning applied to document recognition (1998) (39031)
- Generative Adversarial Nets (2014) (34498)
- Neural Machine Translation by Jointly Learning to Align and Translate (2014) (21547)
- Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014) (17093)
- Understanding the difficulty of training deep feedforward neural networks (2010) (13880)
- Representation Learning: A Review and New Perspectives (2012) (9605)
- Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (2014) (8549)
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (2015) (8100)
- Learning Deep Architectures for AI (2007) (7937)
- Graph Attention Networks (2017) (7433)
- Learning long-term dependencies with gradient descent is difficult (1994) (6682)
- Deep Sparse Rectifier Neural Networks (2011) (6627)
- A Neural Probabilistic Language Model (2003) (6527)
- How transferable are features in deep neural networks? (2014) (6436)
- Random Search for Hyper-Parameter Optimization (2012) (6358)
- Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion (2010) (6070)
- Extracting and composing robust features with denoising autoencoders (2008) (6010)
- On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (2014) (4749)
- Convolutional networks for images, speech, and time series (1998) (4543)
- On the difficulty of training recurrent neural networks (2012) (4222)
- Pattern Recognition and Neural Networks (1995) (4194)
- Greedy Layer-Wise Training of Deep Networks (2006) (3992)
- Curriculum learning (2009) (3673)
- Algorithms for Hyper-Parameter Optimization (2011) (2846)
- FitNets: Hints for Thin Deep Nets (2014) (2369)
- BinaryConnect: Training Deep Neural Networks with binary weights during propagations (2015) (2315)
- Word Representations: A Simple and General Method for Semi-Supervised Learning (2010) (2247)
- Brain tumor segmentation with Deep Neural Networks (2015) (2220)
- Theano: A Python framework for fast computation of mathematical expressions (2016) (2195)
- Attention-Based Models for Speech Recognition (2015) (2080)
- Why Does Unsupervised Pre-training Help Deep Learning? (2010) (2019)
- Maxout Networks (2013) (1935)
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (1913)
- Practical Recommendations for Gradient-Based Training of Deep Architectures (2012) (1825)
- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) (1792)
- Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies (2001) (1692)
- A Structured Self-attentive Sentence Embedding (2017) (1687)
- Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach (2011) (1670)
- Learning deep representations by mutual information estimation and maximization (2018) (1631)
- Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (2015) (1542)
- Semi-supervised Learning by Entropy Minimization (2004) (1488)
- Binarized Neural Networks (2016) (1442)
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations (2016) (1424)
- NICE: Non-linear Independent Components Estimation (2014) (1399)
- Theano: new features and speed improvements (2012) (1396)
- Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (2011) (1329)
- The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation (2016) (1324)
- Deep Learning of Representations for Unsupervised and Transfer Learning (2011) (1201)
- Scaling learning algorithms towards AI (2007) (1183)
- Identifying and attacking the saddle point problem in high-dimensional non-convex optimization (2014) (1163)
- Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering (2003) (1093)
- Exploring Strategies for Training Deep Neural Networks (2009) (1082)
- Visualizing Higher-Layer Features of a Deep Network (2009) (1075)
- A Closer Look at Memorization in Deep Networks (2017) (1059)
- An empirical evaluation of deep architectures on problems with many factors of variation (2007) (1040)
- Deep Graph Infomax (2018) (1020)
- On the Number of Linear Regions of Deep Neural Networks (2014) (1013)
- A Recurrent Latent Variable Model for Sequential Data (2015) (975)
- Hierarchical Probabilistic Neural Network Language Model (2005) (973)
- End-to-end attention-based large vocabulary speech recognition (2015) (961)
- A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues (2016) (960)
- Challenges in representation learning: A report on three machine learning contests (2013) (944)
- HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (2018) (925)
- On Using Very Large Target Vocabulary for Neural Machine Translation (2014) (907)
- Inference for the Generalization Error (1999) (900)
- An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks (2013) (859)
- How to Construct Deep Recurrent Neural Networks (2013) (852)
- No Unbiased Estimator of the Variance of K-Fold Cross-Validation (2003) (845)
- Theano: A CPU and GPU Math Compiler in Python (2010) (839)
- Classification using discriminative restricted Boltzmann machines (2008) (830)
- Object Recognition with Gradient-Based Learning (1999) (824)
- Learning Structured Embeddings of Knowledge Bases (2011) (807)
- Representational Power of Restricted Boltzmann Machines and Deep Belief Networks (2008) (707)
- Gated Feedback Recurrent Neural Networks (2015) (699)
- Mutual Information Neural Estimation (2018) (667)
- Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription (2012) (651)
- Manifold Mixup: Better Representations by Interpolating Hidden States (2018) (627)
- BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (599)
- Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon (2018) (594)
- Deep Learning of Representations: Looking Forward (2013) (593)
- Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space (2016) (571)
- Neural Probabilistic Language Models (2006) (564)
- A semantic matching energy function for learning with multi-relational data (2013) (564)
- Unitary Evolution Recurrent Neural Networks (2015) (547)
- Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism (2016) (538)
- Sharp Minima Can Generalize For Deep Nets (2017) (532)
- Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding (2015) (515)
- An Actor-Critic Algorithm for Sequence Prediction (2016) (515)
- Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives (2012) (504)
- MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis (2019) (501)
- Training deep neural networks with low precision multiplications (2014) (497)
- Understanding the exploding gradient problem (2012) (490)
- Convergence Properties of the K-Means Algorithms (1994) (490)
- SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (2016) (478)
- Advances in optimizing recurrent networks (2012) (477)
- Pointing the Unknown Words (2016) (476)
- On Using Monolingual Corpora in Neural Machine Translation (2015) (475)
- Hierarchical Multiscale Recurrent Neural Networks (2016) (472)
- On the Spectral Bias of Neural Networks (2018) (469)
- Deep Complex Networks (2017) (464)
- A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion (2015) (459)
- A deep learning framework for neuroscience (2019) (457)
- Understanding intermediate layers using linear classifier probes (2016) (452)
- Mode Regularized Generative Adversarial Networks (2016) (442)
- Professor Forcing: A New Algorithm for Training Recurrent Networks (2016) (441)
- Gradient-Based Optimization of Hyperparameters (2000) (437)
- Generalized Denoising Auto-Encoders as Generative Models (2013) (433)
- Speaker Recognition from Raw Waveform with SincNet (2018) (430)
- Benchmarking Graph Neural Networks (2020) (425)
- What regularized auto-encoders learn from the data-generating distribution (2012) (424)
- A Parallel Mixture of SVMs for Very Large Scale Problems (2001) (421)
- The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training (2009) (416)
- End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results (2014) (410)
- Generative adversarial networks (2020) (399)
- Zero-data Learning of New Tasks (2008) (397)
- Interpolation Consistency Training for Semi-Supervised Learning (2019) (393)
- Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding (2013) (393)
- Toward Causal Representation Learning (2021) (386)
- Char2Wav: End-to-End Speech Synthesis (2017) (385)
- BilBOWA: Fast Bilingual Distributed Representations without Word Alignments (2014) (383)
- Deep Generative Stochastic Networks Trainable by Backprop (2013) (373)
- An Input Output HMM Architecture (1994) (368)
- Tackling Climate Change with Machine Learning (2019) (367)
- Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks (2015) (363)
- Bayesian Model-Agnostic Meta-Learning (2018) (361)
- Hierarchical Recurrent Neural Networks for Long-Term Dependencies (1995) (353)
- EmoNets: Multimodal deep learning approaches for emotion recognition in video (2015) (351)
- Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing (2012) (347)
- Input-output HMMs for sequence processing (1996) (345)
- Three Factors Influencing Minima in SGD (2017) (337)
- Combining modality specific deep neural networks for emotion recognition in video (2013) (332)
- Generalization in Deep Learning (2017) (331)
- Incorporating Second-Order Functional Knowledge for Better Option Pricing (2000) (330)
- Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks (2016) (319)
- Kernel Matching Pursuit (2002) (318)
- A Character-level Decoder without Explicit Segmentation for Neural Machine Translation (2016) (315)
- Learning Eigenfunctions Links Spectral Embedding and Kernel PCA (2004) (311)
- Pylearn2: a machine learning research library (2013) (305)
- Gradient based sample selection for online continual learning (2019) (305)
- Shallow vs. Deep Sum-Product Networks (2011) (305)
- Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses (2017) (302)
- Revisiting Natural Gradient for Deep Networks (2013) (302)
- Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (2018) (296)
- Neural Networks with Few Multiplications (2015) (295)
- N-BEATS: Neural basis expansion analysis for interpretable time series forecasting (2019) (293)
- Markovian Models for Sequential Data (2004) (290)
- Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation (2016) (289)
- Better Mixing via Deep Representations (2012) (287)
- Learning Algorithms for the Classification Restricted Boltzmann Machine (2012) (287)
- Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations (2016) (282)
- Boosting Neural Networks (2000) (281)
- High quality document image compression with "DjVu" (1998) (279)
- Towards Biologically Plausible Deep Learning (2015) (278)
- Learning a synaptic learning rule (1991) (276)
- Global optimization of a neural network-hidden Markov model hybrid (1991) (273)
- The Manifold Tangent Classifier (2011) (264)
- Equilibrated adaptive learning rates for non-convex optimization (2015) (263)
- MetaGAN: An Adversarial Approach to Few-Shot Learning (2018) (263)
- An Empirical Study of Example Forgetting during Deep Neural Network Learning (2018) (263)
- Difference Target Propagation (2014) (260)
- On the Expressive Power of Deep Architectures (2011) (260)
- Drawing and Recognizing Chinese Characters with Recurrent Neural Network (2016) (252)
- Theano: Deep Learning on GPUs with Python (2012) (245)
- ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks (2015) (245)
- On the Optimization of a Synaptic Learning Rule (2007) (245)
- Higher Order Contractive Auto-Encoder (2011) (244)
- RMSProp and equilibrated adaptive learning rates for non-convex optimization (2015) (244)
- A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms (2019) (243)
- Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus (2016) (242)
- Learning deep physiological models of affect (2013) (226)
- Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark (2016) (225)
- K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms (2001) (222)
- Unsupervised and Transfer Learning Challenge: a Deep Learning Approach (2011) (219)
- Noisy Activation Functions (2016) (217)
- Justifying and Generalizing Contrastive Divergence (2009) (214)
- GMNN: Graph Markov Neural Networks (2019) (213)
- Recurrent Independent Mechanisms (2019) (212)
- ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation (2015) (207)
- The problem of learning long-term dependencies in recurrent networks (1993) (206)
- The Curse of Highly Variable Functions for Local Kernel Machines (2005) (205)
- Efficient Non-Parametric Function Induction in Semi-Supervised Learning (2004) (204)
- Speech Model Pre-training for End-to-End Spoken Language Understanding (2019) (204)
- Measuring the tendency of CNNs to Learn Surface Statistical Regularities (2017) (204)
- SpeechBrain: A General-Purpose Speech Toolkit (2021) (202)
- Dendritic cortical microcircuits approximate the backpropagation algorithm (2018) (201)
- Maximum-Likelihood Augmented Discrete Generative Adversarial Networks (2017) (201)
- Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model (2008) (199)
- On the number of response regions of deep feed forward networks with piece-wise linear activations (2013) (195)
- Disentangling Factors of Variation for Facial Expression Recognition (2012) (192)
- Light Gated Recurrent Units for Speech Recognition (2018) (192)
- A Deep Reinforcement Learning Chatbot (2017) (191)
- Learning normalized inputs for iterative estimation in medical image segmentation (2017) (189)
- Batch normalized recurrent neural networks (2015) (189)
- Experience Grounds Language (2020) (188)
- Topmoumoute Online Natural Gradient Algorithm (2007) (187)
- Unsupervised State Representation Learning in Atari (2019) (185)
- Image-to-image translation for cross-domain disentanglement (2018) (184)
- Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation (2016) (182)
- Blocks and Fuel: Frameworks for deep learning (2015) (177)
- Deep learning for AI (2021) (176)
- Multi-Task Self-Supervised Learning for Robust Speech Recognition (2020) (176)
- Convex Neural Networks (2005) (174)
- Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks (2019) (173)
- Improving Generative Adversarial Networks with Denoising Feature Matching (2016) (172)
- Hierarchical Neural Network Generative Models for Movie Dialogues (2015) (171)
- The Pytorch-kaldi Speech Recognition Toolkit (2018) (170)
- Low precision arithmetic for deep learning (2014) (169)
- Towards End-to-end Spoken Language Understanding (2018) (165)
- The Consciousness Prior (2017) (163)
- LeRec: A NN/HMM Hybrid for On-Line Handwriting Recognition (1995) (159)
- Z-Forcing: Training Stochastic Recurrent Networks (2017) (159)
- Artificial Neural Networks Applied to Taxi Destination Prediction (2015) (157)
- Predicting COVID-19 Pneumonia Severity on Chest X-ray With Deep Learning (2020) (153)
- Deep Learning for NLP (without Magic) (2012) (153)
- Learning to Understand Phrases by Embedding the Dictionary (2015) (152)
- Deep Belief Networks Are Compact Universal Approximators (2010) (151)
- Audio Chord Recognition with Recurrent Neural Networks (2013) (150)
- Knowledge Matters: Importance of Prior Information for Optimization (2013) (150)
- Neural networks for speech and sequence recognition (1996) (148)
- HeMIS: Hetero-Modal Image Segmentation (2016) (148)
- How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation (2014) (147)
- Manifold Parzen Windows (2002) (147)
- Variance Reduction in SGD by Distributed Importance Sampling (2015) (145)
- Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims (2020) (143)
- Architectural Complexity Measures of Recurrent Neural Networks (2016) (142)
- Montreal Neural Machine Translation Systems for WMT’15 (2015) (140)
- Boundary-Seeking Generative Adversarial Networks (2017) (140)
- Denoising Criterion for Variational Auto-Encoding Framework (2015) (138)
- Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks (2013) (137)
- Reweighted Wake-Sleep (2014) (137)
- Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews (2014) (136)
- On Multiplicative Integration with Recurrent Neural Networks (2016) (134)
- Multi-Prediction Deep Boltzmann Machines (2013) (132)
- Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks (1999) (130)
- Fine-grained attention mechanism for neural machine translation (2018) (126)
- Deep Learners Benefit More from Out-of-Distribution Examples (2011) (125)
- BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning (2018) (125)
- FigureQA: An Annotated Figure Dataset for Visual Reasoning (2017) (124)
- Marginalized Denoising Auto-encoders for Nonlinear Representations (2014) (123)
- InfoBot: Transfer and Exploration via the Information Bottleneck (2019) (122)
- Count-ception: Counting by Fully Convolutional Redundant Counting (2017) (119)
- Mining (2011) (119)
- Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines (2010) (118)
- Deep Directed Generative Models with Energy-Based Probability Estimation (2016) (118)
- Global training of document processing systems using graph transformer networks (1997) (117)
- A Neural Knowledge Language Model (2016) (114)
- Model Selection for Small Sample Regression (2002) (113)
- Inductive biases for deep learning of higher-level cognition (2020) (112)
- Learning Neural Causal Models from Unknown Interventions (2019) (111)
- Iterative Alternating Neural Attention for Machine Reading (2016) (110)
- BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop (2018) (110)
- Neural net language models (2008) (109)
- Gated Orthogonal Recurrent Units: On Learning to Forget (2017) (108)
- Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio (2011) (106)
- Understanding Representations Learned in Deep Architectures (2010) (102)
- Large-Scale Feature Learning With Spike-and-Slab Sparse Coding (2012) (97)
- Collaborative Filtering on a Family of Biological Targets (2006) (96)
- On integrating a language model into neural machine translation (2017) (96)
- Non-Local Manifold Tangent Learning (2004) (96)
- Taking on the curse of dimensionality in joint distributions using neural networks (2000) (96)
- Revisiting Fundamentals of Experience Replay (2020) (95)
- Deep Learning for Patient-Specific Kidney Graft Survival Analysis (2017) (94)
- An empirical analysis of dropout in piecewise linear networks (2013) (94)
- A Spike and Slab Restricted Boltzmann Machine (2011) (92)
- End-to-End Online Writer Identification With Recurrent Neural Network (2017) (92)
- On the saddle point problem for non-convex optimization (2014) (92)
- Modeling term dependencies with quantum language models for IR (2013) (91)
- Globally Trained Handwritten Word Recognizer Using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models (1993) (90)
- Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study (2019) (89)
- Feature-wise transformations (2018) (89)
- Deconstructing the Ladder Network Architecture (2015) (89)
- Recurrent Neural Networks for Missing or Asynchronous Data (1995) (88)
- Multi-Task Learning for Stock Selection (1996) (88)
- A hybrid Pareto model for asymmetric fat-tailed data: the univariate case (2009) (86)
- On the number of inference regions of deep feed forward networks with piece-wise linear activations (2013) (86)
- The need for privacy with public digital contact tracing during the COVID-19 pandemic (2020) (85)
- Disentangling Factors of Variation via Generative Entangling (2012) (85)
- Residual Connections Encourage Iterative Inference (2017) (85)
- Quickly Generating Representative Samples from an RBM-Derived Process (2011) (84)
- BPS: a learning algorithm for capturing the dynamic nature of speech (1989) (83)
- The Curse of Dimensionality for Local Kernel Machines (2005) (83)
- Unsupervised Models of Images by Spike-and-Slab RBMs (2011) (82)
- Gradient Starvation: A Learning Proclivity in Neural Networks (2020) (81)
- Estimating or Propagating Gradients Through Stochastic Neurons (2013) (81)
- A Walk with SGD (2018) (81)
- Context-dependent word representation for neural machine translation (2016) (80)
- ChatPainter: Improving Text to Image Generation using Dialogue (2018) (80)
- Multi-way, multilingual neural machine translation (2017) (79)
- A Generative Process for sampling Contractive Auto-Encoders (2012) (78)
- ObamaNet: Photo-realistic lip-sync from text (2017) (78)
- Maximum Entropy Generators for Energy-Based Models (2019) (77)
- BigBrain 3D atlas of cortical layers: Cortical and laminar thickness gradients diverge in sensory and motor cortices (2019) (77)
- Learning Independent Features with Adversarial Nets for Non-linear ICA (2017) (77)
- Straight to the Tree: Constituency Parsing with Neural Syntactic Distance (2018) (76)
- Parallel Tempering for Training of Restricted Boltzmann Machines (2010) (75)
- Quaternion Recurrent Neural Networks (2018) (75)
- Combined Reinforcement Learning via Abstract Representations (2018) (74)
- Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation (2014) (74)
- Spectral Clustering and Kernel PCA are Learning Eigenfunctions (2003) (73)
- Wasserstein Dependency Measure for Representation Learning (2019) (73)
- Big Neural Networks Waste Capacity (2013) (73)
- On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length (2018) (73)
- Deep convolutional networks for quality assessment of protein folds (2018) (72)
- Word-level training of a handwritten word recognizer based on convolutional neural networks (1994) (72)
- Adding noise to the input of a model trained with a regularized objective (2011) (72)
- Learning Speaker Representations with Mutual Information (2018) (71)
- Slow, Decorrelated Features for Pretraining Complex Cell-like Networks (2009) (71)
- Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction (2018) (71)
- Interpretable Convolutional Filters with SincNet (2018) (71)
- Compositional generalization in a deep seq2seq model by separating syntax and semantics (2019) (71)
- Using a Financial Training Criterion Rather than a Prediction Criterion (1997) (70)
- Deep Learning of Representations (2013) (70)
- Learning to Compute Word Embeddings On the Fly (2017) (70)
- Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer (2018) (70)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning (2020) (70)
- Label Propagation and Quadratic Criterion (2006) (70)
- Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling (2020) (69)
- Recurrent Neural Networks With Limited Numerical Precision (2016) (69)
- Hyperbolic Discounting and Learning over Multiple Horizons (2019) (68)
- Entropy Regularization (2006) (68)
- Toward Training Recurrent Neural Networks for Lifelong Learning (2018) (68)
- Training Methods for Adaptive Boosting of Neural Networks (1997) (68)
- Independently Controllable Factors (2017) (67)
- STDP-Compatible Approximation of Backpropagation in an Energy-Based Model (2017) (66)
- Greedy Spectral Embedding (2005) (66)
- Hierarchical Memory Networks (2016) (65)
- Decision Trees Do Not Generalize to New Variations (2010) (65)
- Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding (2018) (64)
- Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition (2018) (64)
- High-dimensional sequence transduction (2012) (63)
- Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery (2012) (61)
- Independently Controllable Features (2017) (61)
- Invariant Representations for Noisy Speech Recognition (2016) (61)
- CLOSURE: Assessing Systematic Generalization of CLEVR Models (2019) (60)
- Non-Local Manifold Parzen Windows (2005) (60)
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization (2021) (59)
- Embedding Word Similarity with Neural Machine Translation (2014) (59)
- Incorporating Functional Knowledge in Neural Networks (2009) (58)
- Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy (2019) (58)
- Depth with Nonlinearity Creates No Bad Local Minima in ResNets (2018) (57)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning (2020) (57)
- Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition (2017) (57)
- Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online (2012) (57)
- Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes (2016) (57)
- The Z-coder adaptive binary coder (1998) (56)
- Reading checks with multilayer graph transformer networks (1997) (56)
- Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization (2019) (56)
- Continuous Neural Networks (2007) (55)
- Recall Traces: Backtracking Models for Efficient Reinforcement Learning (2018) (55)
- On the Spectral Bias of Deep Neural Networks (2018) (55)
- Hybrid Models for Learning to Branch (2020) (54)
- Torchmeta: A Meta-Learning library for PyTorch (2019) (54)
- A Connectionist Approach to Speech Recognition (1993) (54)
- GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning (2019) (54)
- Artificial neural networks and their application to sequence recognition (1991) (54)
- Use of genetic programming for the search of a new learning rule for neural networks (1994) (53)
- Credit Assignment through Time: Alternatives to Backpropagation (1993) (53)
- HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery (2020) (53)
- Systematic generalisation with group invariant predictions (2021) (52)
- Cost functions and model combination for VaR-based asset allocation using neural networks (2001) (52)
- Twin Networks: Matching the Future for Sequence Generation (2017) (52)
- Spectral Dimensionality Reduction (2006) (52)
- Extensions to Metric-Based Model Selection (2003) (52)
- Bias learning, knowledge sharing (2003) (52)
- Memory Augmented Neural Networks with Wormhole Connections (2017) (52)
- Learning Anonymized Representations with Adversarial Neural Networks (2018) (51)
- Disentangling the independently controllable factors of variation by interacting with the world (2018) (51)
- STDP as presynaptic activity times rate of change of postsynaptic activity (2015) (51)
- Diffusion of Context and Credit Information in Markovian Models (1995) (51)
- Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation (2021) (51)
- Not All Neural Embeddings are Born Equal (2014) (49)
- On the Challenges of Physical Implementations of RBMs (2013) (49)
- AdaBoosting Neural Networks: Application to on-line Character Recognition (1997) (49)
- Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization (2014) (49)
- Low precision storage for deep learning (2014) (48)
- Selective small molecule peptidomimetic ligands of TrkC and TrkA receptors afford discrete or complete neurotrophic activities (2005) (47)
- Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net (2017) (47)
- Experiments on the application of IOHMMs to model financial returns series (2001) (47)
- The representational geometry of word meanings acquired by neural machine translation models (2017) (46)
- RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs (2020) (46)
- Use machine learning to find energy materials (2017) (46)
- Iterative Neural Autoregressive Distribution Estimator NADE-k (2014) (45)
- Learning the dynamic nature of speech with back-propagation for sequences (1992) (45)
- Improving Speech Recognition by Revising Gated Recurrent Units (2017) (44)
- DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning (2020) (44)
- Adversarial Domain Adaptation for Stable Brain-Machine Interfaces (2018) (44)
- Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics (2019) (44)
- Dendritic error backpropagation in deep cortical microcircuits (2017) (44)
- On Adversarial Mixup Resynthesis (2019) (43)
- Nonlocal Estimation of Manifold Structure (2006) (43)
- Diet Networks: Thin Parameters for Fat Genomics (2016) (43)
- Large-Scale Learning of Embeddings with Reconstruction Sampling (2011) (42)
- Scaling Large Learning Problems with Hard Parallel Mixtures (2002) (42)
- Graph Neural Networks with Learnable Structural and Positional Representations (2021) (42)
- Coordination Among Neural Modules Through a Shared Global Workspace (2021) (41)
- Learning Tags that Vary Within a Song (2010) (41)
- Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules (2020) (41)
- Bias in Estimating the Variance of K-Fold Cross-Validation (2005) (41)
- Word normalization for on-line handwritten word recognition (1994) (41)
- Neural Production Systems (2021) (41)
- Variational Temporal Abstraction (2019) (40)
- Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models (2019) (40)
- Introduction to the special issue on neural networks for data mining and knowledge discovery (2000) (40)
- Early Inference in Energy-Based Models Approximates Back-Propagation (2015) (40)
- Towards a Biologically Plausible Backprop (2016) (40)
- Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations (2018) (39)
- Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems (2020) (39)
- A network of deep neural networks for Distant Speech Recognition (2017) (39)
- DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation (2018) (39)
- Label Propagation and Quadratic Criterion (39)
- On the interplay between noise and curvature and its effect on optimization and generalization (2019) (39)
- Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes (2018) (38)
- GSNs : Generative Stochastic Networks (2015) (38)
- Use machine learning to find energy materials (2017) (37)
- Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask (2017) (37)
- GraphMix: Improved Training of GNNs for Semi-Supervised Learning (2020) (37)
- Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies (2020) (37)
- Scaling Up Spike-and-Slab Models for Unsupervised Feature Learning (2013) (36)
- ReSeg: A Recurrent Neural Network for Object Segmentation (2015) (36)
- Globally trained handwritten word recognizer using spatial representation, space displacement neural networks and hidden Markov models (1993) (36)
- Mollifying Networks (2016) (35)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation (2020) (35)
- Inherent privacy limitations of decentralized contact tracing apps (2020) (35)
- Task Loss Estimation for Sequence Prediction (2015) (34)
- Quadratic Features and Deep Architectures for Chunking (2009) (34)
- Evolving Culture Versus Local Minima (2014) (34)
- Probabilistic Planning with Sequential Monte Carlo methods (2018) (34)
- Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies (2019) (34)
- InfoMask: Masked Variational Latent Representation to Localize Chest Disease (2019) (33)
- Representation Mixing for TTS Synthesis (2018) (33)
- Object-Centric Image Generation from Layouts (2020) (33)
- Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future (2019) (33)
- Editorial introduction to the Neural Networks special issue on Deep Learning of Representations (2015) (33)
- Batch-normalized joint training for DNN-based distant speech recognition (2016) (32)
- Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures (2019) (32)
- Online continual learning with no task boundaries (2019) (32)
- On the Learning Dynamics of Deep Neural Networks (2018) (32)
- Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing Its Gradient Estimator Bias (2020) (32)
- Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models (2004) (32)
- Small-GAN: Speeding Up GAN Training Using Core-sets (2019) (32)
- Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives (2019) (32)
- Evolving Culture vs Local Minima (2012) (31)
- Adaptive Parallel Tempering for Stochastic Maximum Likelihood Learning of RBMs (2010) (31)
- An EM Algorithm for Asynchronous Input/Output Hidden Markov Models (1996) (31)
- Contextual tag inference (2011) (31)
- Fraternal Dropout (2017) (31)
- Learning Eigenfunctions of Similarity: Linking Spectral Clustering and Kernel PCA (2003) (31)
- Universal Successor Representations for Transfer Reinforcement Learning (2018) (31)
- Equivalence of Equilibrium Propagation and Recurrent Backpropagation (2017) (31)
- Interactive Language Learning by Question Answering (2019) (30)
- Adaptive Drift-Diffusion Process to Learn Time Intervals (2011) (30)
- Convolutional neural networks for mesh-based parcellation of the cerebral cortex (2018) (30)
- Meta-learning framework with applications to zero-shot time-series forecasting (2020) (30)
- On the search for new learning rules for ANNs (1995) (30)
- Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines (2013) (30)
- Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions (2012) (29)
- HNHN: Hypergraph Networks with Hyperedge Neurons (2020) (29)
- Manifold Mixup: Learning Better Representations by Interpolating Hidden States (2018) (28)
- DEUP: Direct Epistemic Uncertainty Prediction (2021) (28)
- Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning (2014) (28)
- A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies (2018) (28)
- Joint Training of Deep Boltzmann Machines (2012) (27)
- h-detach: Modifying the LSTM Gradient Towards Better Optimization (2018) (27)
- Speech and Speaker Recognition from Raw Waveform with SincNet (2018) (27)
- Statistical Learning Algorithms Applied to Automobile Insurance Ratemaking (2003) (27)
- Finding Flatter Minima with SGD (2018) (27)
- Building Musically-relevant Audio Features through Multiple Timescale Representations (2012) (27)
- Quick Training of Probabilistic Neural Nets by Importance Sampling (2003) (27)
- Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information (2018) (27)
- Saliency is a Possible Red Herring When Diagnosing Poor Generalization (2021) (27)
- Discriminative Non-negative Matrix Factorization for Multiple Pitch Estimation (2012) (27)
- On Tracking The Partition Function (2011) (27)
- Regularized Auto-Encoders Estimate Local Statistics (2012) (27)
- An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming (2021) (27)
- Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference (2001) (26)
- Towards Gene Expression Convolutions using Gene Interaction Graphs (2018) (26)
- Tractable Multivariate Binary Density Estimation and the Restricted Boltzmann Forest (2010) (26)
- Programmable execution of multi-layered networks for automatic speech recognition (1989) (26)
- How to Initialize your Network? Robust Initialization for WeightNorm & ResNets (2019) (26)
- Brain Inspired Reinforcement Learning (2004) (26)
- The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions (2014) (25)
- Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input (2019) (25)
- Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning (2021) (25)
- Commonsense mining as knowledge base completion? A study on the impact of novelty (2018) (25)
- Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks (2019) (25)
- Bayesian Structure Learning with Generative Flow Networks (2022) (24)
- Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks (1991) (24)
- Rethinking Distributional Matching Based Domain Adaptation (2020) (24)
- Learning from Partial Labels with Minimum Entropy (2004) (24)
- A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions (2009) (24)
- Alternative time representation in dopamine models (2010) (23)
- Gradient-based Learning Applied to Document Recognition (1998) (23)
- Generalization of Equilibrium Propagation to Vector Field Dynamics (2018) (23)
- Word normalization for online handwritten word recognition (1994) (23)
- The Causal-Neural Connection: Expressiveness, Learnability, and Inference (2021) (23)
- Focused Hierarchical RNNs for Conditional Sequence Processing (2018) (23)
- Joint Training Deep Boltzmann Machines for Classification (2013) (23)
- Learning from unexpected events in the neocortical microcircuit (2021) (22)
- Bounding the Test Log-Likelihood of Generative Models (2013) (22)
- Bidirectional Helmholtz Machines (2015) (22)
- On the Iterative Refinement of Densely Connected Representation Levels for Semantic Segmentation (2018) (22)
- Learning Neural Causal Models with Active Interventions (2021) (22)
- Robust Regression with Asymmetric Heavy-Tail Noise Distributions (2002) (22)
- Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks (2020) (22)
- Discriminative feature and model design for automatic speech recognition (1997) (22)
- The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach (2018) (21)
- Diffusion of Credit in Markovian Models (1994) (21)
- Augmented Functional Time Series Representation and Forecasting with Gaussian Processes (2007) (21)
- Efficient EM Training of Gaussian Mixtures with Missing Data (2012) (21)
- Learning Causal Models Online (2020) (21)
- Deriving Differential Target Propagation from Iterating Approximate Inverses (2020) (21)
- A robust adaptive stochastic gradient method for deep learning (2017) (20)
- Attention Based Pruning for Shift Networks (2019) (20)
- Phonetically-based multi-layered neural networks for vowel classification (1990) (20)
- On Training Recurrent Neural Networks for Lifelong Learning (2018) (20)
- Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs (2013) (20)
- Topic Segmentation: A First Stage to Dialog-Based Information Extraction (2001) (20)
- Locally Linear Embedding for dimensionality reduction in QSAR (2004) (20)
- On the challenge of learning complex functions. (2007) (20)
- Learning the 2-D Topology of Images (2007) (20)
- Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay (2020) (20)
- Support vector machines for improving the classification of brain PET images (1998) (20)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures (2021) (20)
- A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition (2019) (20)
- An EM approach to grammatical inference: input/output HMMs (1994) (19)
- MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation (2018) (19)
- Deep Self-Taught Learning for Handwritten Character Recognition (2010) (19)
- Improving First and Second-Order Methods by Modeling Uncertainty (2010) (19)
- Gradient Based Learning Applied to Document Recognition (2006) (18)
- Perceptual Generative Autoencoders (2019) (18)
- Natural Gradient Revisited (2013) (18)
- Browsing through high quality document images with DjVu (1998) (18)
- Fast and Slow Learning of Recurrent Independent Mechanisms (2021) (18)
- Modeling the Long Term Future in Model-Based Reinforcement Learning (2018) (18)
- How does hemispheric specialization contribute to human-defining cognition? (2021) (18)
- Generative Flow Networks for Discrete Probabilistic Modeling (2022) (18)
- Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning (2018) (18)
- On Training Deep Boltzmann Machines (2012) (18)
- Diet Networks: Thin Parameters for Fat Genomics (2016) (18)
- Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible (2016) (17)
- Unsupervised Learning of Semantics of Object Detections for Scene Categorization (2013) (17)
- Problems in the deployment of machine-learned models in health care (2021) (17)
- Towards Standardization of Data Licenses: The Montreal Data License (2019) (17)
- GradMask: Reduce Overfitting by Regularizing Saliency (2019) (17)
- Discrete-Valued Neural Communication (2021) (17)
- Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge (1989) (17)
- The effects of negative adaptation in Model-Agnostic Meta-Learning (2018) (16)
- Autotagging music with conditional restricted Boltzmann machines (2011) (16)
- Information matrices and generalization (2019) (16)
- How can deep learning advance computational modeling of sensory information processing? (2018) (16)
- Equilibrium Propagation with Continual Weight Updates (2019) (16)
- Variational Bi-LSTMs (2017) (16)
- ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient (2014) (16)
- Chunked Autoregressive GAN for Conditional Waveform Synthesis (2021) (16)
- A Neural Support Vector Network architecture with adaptive kernels (2000) (16)
- The Benefits of Over-parameterization at Initialization in Deep ReLU Networks (2019) (16)
- Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network (1990) (16)
- Keep Drawing It: Iterative language-based image generation and editing (2018) (16)
- Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio (2018) (16)
- Continuous optimization of hyper-parameters (2000) (15)
- Reinforcement Learning for Sustainable Agriculture (2019) (15)
- Use of neural networks for the recognition of place of articulation (1988) (15)
- On the Morality of Artificial Intelligence (2019) (15)
- The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget (2020) (15)
- Trajectory Balance: Improved Credit Assignment in GFlowNets (2022) (15)
- Deep Directed Generative Autoencoders (2014) (15)
- Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks (2017) (14)
- Twin Regularization for online speech recognition (2018) (14)
- Generalizable Features From Unsupervised Learning (2016) (14)
- Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2018) (14)
- RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design (2020) (14)
- Twin Networks: Using the Future as a Regularizer (2017) (14)
- Transformers with Competitive Ensembles of Independent Mechanisms (2021) (14)
- Compositional Generalization by Factorizing Alignment and Translation (2020) (14)
- Learning semantic representations of objects and their parts (2014) (14)
- An EM Approach to Learning Sequential (1994) (14)
- Biological Sequence Design with GFlowNets (2022) (14)
- Learned-norm pooling for deep neural networks (2013) (14)
- COVI White Paper (2020) (14)
- The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All) (2011) (13)
- Detonation Classification from Acoustic Signature with the Restricted Boltzmann Machine (2012) (13)
- A memory-efficient adaptive Huffman coding algorithm for very large sets of symbols (1998) (13)
- A hybrid coder for hidden Markov models using a recurrent neural network (1990) (13)
- NU-GAN: High resolution neural upsampling with GAN (2020) (13)
- Input decay: simple and effective soft variable selection (2001) (13)
- Extending the Framework of Equilibrium Propagation to General Dynamics (2018) (13)
- Reinforced Imitation in Heterogeneous Action Space (2019) (13)
- Generalization of a Parametric Learning Rule (1993) (13)
- The First Conversational Intelligence Challenge (2018) (13)
- An objective function for STDP (2015) (13)
- Learning invariant features through local space contraction (2011) (13)
- GibbsNet: Iterative Adversarial Inference for Deep Graphical Models (2017) (12)
- Target Propagation (2015) (12)
- S2RMs: Spatially Structured Recurrent Modules (2020) (12)
- Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences (2018) (12)
- Plan, Attend, Generate: Planning for Sequence-to-Sequence Models (2017) (12)
- How Transferable Are Features in Convolutional Neural Network Acoustic Models across Languages? (2019) (12)
- Dynamic Inference with Neural Interpreters (2021) (12)
- Universal Successor Features for Transfer Reinforcement Learning (2018) (12)
- Statistical Machine Learning Algorithms for Target Classification from Acoustic Signature (2009) (11)
- Image Segmentation by Iterative Inference from Conditional Score Estimation (2017) (11)
- Scaling up deep learning (2014) (11)
- The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis (2019) (11)
- HighRes-net: Multi-Frame Super-Resolution by Recursive Fusion (2019) (11)
- Conditioning and time representation in long short-term memory networks (2014) (11)
- Handling Asynchronous or Missing Data with Recurrent Networks (1998) (11)
- On Catastrophic Interference in Atari 2600 Games (2020) (11)
- BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization (2020) (11)
- Boundary Seeking GANs (2018) (11)
- Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders (2012) (11)
- Locally Weighted Full Covariance Gaussian Density Estimation (2004) (10)
- A learning-based algorithm to quickly compute good primal solutions for Stochastic Integer Programs (2019) (10)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning (2021) (10)
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (2021) (10)
- Unsupervised and Transfer Learning under Uncertainty - From Object Detections to Scene Categorization (2013) (10)
- Multimodal Transitions for Generative Stochastic Networks (2013) (10)
- Predicting Infectiousness for Proactive Contact Tracing (2020) (10)
- A Deep Reinforcement Learning Chatbot (Short Version) (2018) (10)
- Dynamic Frame Skipping for Fast Speech Recognition in Recurrent Neural Network Based Acoustic Models (2018) (10)
- Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach (2020) (10)
- Weakly-supervised Knowledge Graph Alignment with Adversarial Learning (2019) (9)
- On the Equivalence between Deep NADE and Generative Stochastic Networks (2014) (9)
- Automated segmentation of cortical layers in BigBrain reveals divergent cortical and laminar thickness gradients in sensory and motor cortices. (2019) (9)
- Deep learning and cultural evolution (2014) (9)
- Generalization in Machine Learning via Analytical Learning Theory (2018) (9)
- The Octopus Approach to the Alexa Competition: A Deep Ensemble-based Socialbot (2017) (8)
- Compositional Attention: Disentangling Search and Retrieval (2021) (8)
- Untangling tradeoffs between recurrence and self-attention in artificial neural networks (2020) (8)
- Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (8)
- Iteratively unveiling new regions of interest in Deep Learning models (2018) (8)
- Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation (1991) (8)
- A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM (2020) (8)
- A3T: Adversarially Augmented Adversarial Training (2018) (8)
- How to construct deep recurrent neural networks: Proceedings of the Second International Conference on Learning Representations (ICLR 2014) (2014) (8)
- Untangling tradeoffs between recurrence and self-attention in neural networks (2020) (8)
- Unifying Generative Models with GFlowNets (2022) (8)
- Training a Neural Network with a Financial Criterion Rather than a Prediction Criterion (2007) (8)
- Machines Who Learn. (2016) (8)
- Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy (2019) (8)
- Multiscale sequence modeling with a learned dictionary (2017) (8)
- An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism (2009) (8)
- Oracle Performance for Visual Captioning (2015) (8)
- Modularity Matters: Learning Invariant Relational Reasoning Tasks (2018) (7)
- Découpage thématique des conversations : un outil d'aide à l'extraction (2002) (7)
- Deep Learning for Automatic Summary Scoring (2012) (7)
- Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments (2021) (7)
- GFlowNet Foundations (2021) (7)
- A Generative Process for Contractive Auto-Encoders (2012) (7)
- Is a Modular Architecture Enough? (2022) (7)
- An Analysis of the Adaptation Speed of Causal Models (2020) (7)
- Discovering Shared Structure in Manifold Learning (2004) (7)
- Conditional Computation for Continual Learning (2019) (7)
- A simple and general method for semi-supervised learning (2010) (7)
- ACtuAL: Actor-Critic Under Adversarial Learning (2017) (7)
- Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks (2020) (7)
- Multi-scale Feature Learning Dynamics: Insights for Double Descent (2021) (7)
- Non-parametric Regression between Riemannian Manifolds (2009) (7)
- Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning (2017) (6)
- Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (6)
- A Dataset of Topic-Oriented Human-to-Chatbot Dialogues (2018) (6)
- COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing (2020) (6)
- Suitability of V1 Energy Models for Object Classification (2011) (6)
- Structured Sparsity Inducing Adaptive Optimizers for Deep Learning (2021) (6)
- FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters (2021) (6)
- On random weights for texture generation in one layer CNNs (2017) (6)
- From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence (2021) (6)
- Big Data: Theoretical Aspects [Scanning the Issue] (2016) (6)
- Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition (1988) (6)
- CACHE (Critical Assessment of Computational Hit-finding Experiments): A public–private partnership benchmarking initiative to enable the development of computational methods for hit-finding (2021) (6)
- Guest Introduction: Special Issue on New Methods for Model Selection and Model Combination (2002) (6)
- Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya (2020) (6)
- NYU-MILA Neural Machine Translation Systems for WMT’16 (2016) (6)
- Variance Regularizing Adversarial Learning (2017) (6)
- Use of Multi-Layered Networks for Coding Speech with Phonetic Features (1988) (5)
- Discussion of "The Neural Autoregressive Distribution Estimator" (2011) (5)
- Shared Context Probabilistic Transducers (1997) (5)
- Mastering Rate based Curriculum Learning (2020) (5)
- On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties (1989) (5)
- On-line handwriting recognition with neural networks: Spatial representation versus temporal representation (1993) (5)
- Towards Understanding Generalization via Analytical Learning Theory (2018) (5)
- On the Morality of Artificial Intelligence [Commentary] (2020) (5)
- Learning Powerful Policies by Using Consistent Dynamics Model (2019) (5)
- Modeling Natural Image Covariance with a Spike and Slab Restricted Boltzmann Machine (2010) (5)
- Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings (2011) (5)
- Understanding deep architectures and the effect of unsupervised pre-training (2011) (5)
- Binary pseudowavelets and applications to bilevel image processing (1999) (5)
- hBERT + BiasCorp - Fighting Racism on the Web (2021) (5)
- Systematicity in a Recurrent Neural Network by Factorizing Syntax and Semantics (2020) (5)
- Generating Multiscale Amorphous Molecular Structures Using Deep Learning: A Study in 2D. (2020) (5)
- Combining Model-based and Model-free RL via Multi-step Control Variates (2018) (5)
- Deep Tempering (2014) (5)
- Underwhelming Generalization Improvements From Controlling Feature Attribution (2019) (5)
- Towards Scaling Difference Target Propagation by Learning Backprop Targets (2022) (5)
- Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution (2022) (5)
- Learning from Learning Machines: Optimisation, Rules, and Social Norms (2019) (4)
- Weakly Supervised Representation Learning with Sparse Perturbations (2022) (4)
- Establishing an evaluation metric to quantify climate change image realism (2019) (4)
- Using Simulated Data to Generate Images of Climate Change (2020) (4)
- Use of multilayer networks for the recognition of phonetic features and phonemes (1989) (4)
- On Out-of-Sample Statistics for Time-Series (2002) (4)
- Forecasting and Trading Commodity Contract Spreads with Gaussian Processes (2007) (4)
- Probabilistic neural network models for sequential data (2000) (4)
- ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods (2021) (4)
- Ghost Units Yield Biologically Plausible Backprop in Deep Neural Networks (2019) (4)
- Using Artificial Intelligence to Visualize the Impacts of Climate Change (2021) (4)
- A Hybrid Pareto Model for Conditional Density Estimation of Asymmetric Fat-Tail Data (2007) (4)
- Unifying Likelihood-free Inference with Black-box Sequence Design and Beyond (2021) (4)
- Cross-Modal Information Maximization for Medical Imaging: CMIM (2020) (4)
- Combating False Negatives in Adversarial Imitation Learning (2020) (4)
- Empirical performance upper bounds for image and video captioning (2015) (4)
- The Variational Walkback Algorithm (2016) (4)
- A Two-Stream Continual Learning System With Variational Domain-Agnostic Feature Replay (2021) (4)
- Trainable performance upper bounds for image and video captioning (2015) (4)
- A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions (2022) (4)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers (2020) (4)
- Learning GFlowNets from partial episodes for improved convergence and stability (2022) (4)
- RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro (2022) (4)
- Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL (2022) (4)
- Discrete Key-Value Bottleneck (2022) (4)
- Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition (2020) (3)
- Visual Concept Reasoning Networks (2020) (3)
- RetroGNN: Fast Estimation of Synthesizability for Virtual Screening and De Novo Design by Learning from Slow Retrosynthesis Software (2022) (3)
- CMIM: Cross-Modal Information Maximization For Medical Imaging (2021) (3)
- A Hybrid Pareto Model for Asymmetric Fat-Tail Data (2006) (3)
- Training Bidirectional Helmholtz Machines (2015) (3)
- Exploration-Driven Representation Learning in Reinforcement Learning (2021) (3)
- The Journey is the Reward: Unsupervised Learning of Influential Trajectories (2019) (3)
- GFlowNets and variational inference (2022) (3)
- Connectionist Models and their Application to Automatic Speech Recognition (1991) (3)
- Unsupervised one-to-many image translation (2018) (3)
- Continuous-Time Meta-Learning with Forward Mode Differentiation (2022) (3)
- Valorisation d'Options par Optimisation du Sharpe Ratio (2002) (3)
- Learning to rank for censored survival data (2018) (3)
- COVI White Paper-Version 1.1 (2020) (3)
- State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations (2019) (3)
- Reinforced Imitation Learning from Observations (2018) (3)
- Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization (2007) (3)
- Graph-Based Semi-Supervised Learning (2005) (3)
- An Actor-Critic Algorithm for Structured Prediction (2016) (3)
- Stochastic Learning of Strategic Equilibria for Auctions (1999) (3)
- Deep Architectures for Baby AI (2007) (3)
- Large-Scale Algorithms (2006) (3)
- Blocks and Fuel (2015) (3)
- Supplementary material for: How transferable are features in deep neural networks? (2014) (3)
- A Walk with SGD: How SGD Explores Regions of Deep Network Loss? (2018) (3)
- Predicting Unreliable Predictions by Shattering a Neural Network (2021) (3)
- From STDP towards Biologically Plausible Deep Learning (2015) (3)
- Learning the Arrow of Time (2019) (3)
- On the Generalization and Adaption Performance of Causal Models (2022) (3)
- Applying Knowledge Transfer for Water Body Segmentation in Peru (2019) (3)
- Distributed Representation Prediction for Generalization to New Words (2006) (2)
- Learning Simple Non Stationarities with Hyper Parameters (1999) (2)
- Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models (2021) (2)
- Gaussian Mixture Densities for Classification of Nuclear Power Plant Data (1998) (2)
- On Random Weights for Texture Generation in One Layer Neural Networks (2016) (2)
- Use of Neural Networks for the Recognition of Place (1988) (2)
- Deep Learning. Das umfassende Handbuch (2018) (2)
- Memory-Efficient Adaptive Huffman Coding (1998) (2)
- Training opposing directed models using geometric mean matching (2015) (2)
- Document Analysis with Transducers (2015) (2)
- Radial Basis Functions for Speech Recognition (1992) (2)
- Multi-Task Learning For Option Pricing (2002) (2)
- Spatially Structured Recurrent Modules (2021) (2)
- Agnostic Physics-Driven Deep Learning (2022) (2)
- Comparative Study of Learning Outcomes for Online Learning Platforms (2021) (2)
- Unifying Likelihood-free Inference with Black-box Optimization and Beyond (2021) (2)
- Workshop summary: Workshop on learning feature hierarchies (2009) (2)
- Joint Learning of Generative Translator and Classifier for Visually Similar Classes (2019) (2)
- Speech coding with multilayer networks (1989) (2)
- Low-memory convolutional neural networks through incremental depth-first processing (2018) (2)
- Task Loss Estimation for Structured Prediction (2016) (2)
- Generalizing to a zero-data task: a computational chemistry case study (2006) (2)
- Régularisation du prix des options : Stacking (2002) (2)
- Estimators of Variance for K-Fold Cross-Validation (2003) (2)
- SGD Smooths The Sharpest Directions (2018) (2)
- Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code (2017) (2)
- Predictive Inference with Feature Conformal Prediction (2022) (2)
- A Common GPU n-Dimensional Array for Python and C (2011) (2)
- Extracting Hidden Sense Probabilities from Bitexts (2003) (2)
- Sparse Attentive Backtracking: Towards Efficient Credit Assignment In Recurrent Networks (2017) (2)
- Apprentissage machine efficace: theorie et pratique (2012) (2)
- Latent Bottlenecked Attentive Neural Processes (2022) (1)
- Deep Learning for NLP (without Magic) References (2012) (1)
- AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N (2022) (1)
- Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features (2021) (1)
- Comment améliorer la capacité de généralisation des algorithmes d'apprentissage pour la prise de décisions financières (2003) (1)
- On Out-of-Sample Statistics for Financial Time-Series (2002) (1)
- Étude du biais dans le prix des options (2002) (1)
- Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization (2022) (1)
- Extended Semantic Tagging for Entity Extraction (1)
- The Effect of Diversity in Meta-Learning (2022) (1)
- Metric-based model selection for time-series forecasting (2002) (1)
- Markovian Models for Sequential (2004) (1)
- GraphMix: Improved Training of Graph Neural Networks for Semi-Supervised Learning (2020) (1)
- Deep Learning of Representations: Looking Forward (2013) (1)
- Segmentation en thèmes de conversations téléphoniques : traitement en amont pour l’extraction d’information (2002) (1)
- Statistical Language and Speech Processing (2013) (1)
- Rethinking Learning Dynamics in RL using Adversarial Networks (2022) (1)
- Pattern Recognition (1998) (1)
- Exploring the Wasserstein metric for time-to-event analysis (2021) (1)
- Automated curriculum generation for Policy Gradients from Demonstrations (2019) (1)
- Avoidance Learning Using Observational Reinforcement Learning (2019) (1)
- Generative Augmented Flow Networks (2022) (1)
- InfoBot: Structured Exploration in Reinforcement Learning Using Information Bottleneck (2019) (1)
- Predicting ice flow using machine learning (2019) (1)
- Speech coding with multi-layer networks (1989) (1)
- Mode Regularized Generative Adversarial (2016) (1)
- Codon arrangement modulates MHC-I peptides presentation (2020) (1)
- Convergence Properties of Deep Neural Networks on Separable Data (2018) (1)
- A Neural Network to Detect Homologies in Proteins (1989) (1)
- Convergence Properties of the K-means Algorithms (1995) (1)
- Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007 (2008) (1)
- Introduction to NIPS 2017 Competition Track (2018) (1)
- Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks (2020) (1)
- Lookback for Learning to Branch (2022) (1)
- Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning (2022) (1)
- Sharp Minima Can Generalize For Deep Nets Supplementary Material (2017) (1)
- Incorporating complex cells into neural networks for pattern classification (2011) (1)
- Towards the Latent Transcriptome (2018) (1)
- Building Robust Ensembles via Margin Boosting (2022) (1)
- Approche statistique pour le repérage de mots informatifs dans les textes oraux (2004) (1)
- Gaussian Mixtures with Missing Data: an Efficient EM Training Algorithm (1994) (1)
- Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel (2022) (0)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning (2022) (0)
- Learning Classical Planning Transition Functions by Deep Neural Networks (2020) (0)
- Learning semantic representations of objects and their parts (2013) (0)
- Supplementary Material: Learning to Navigate the Synthetically Accessible Chemical Space Using Reinforcement Learning (2020) (0)
- FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for Federated Learning on Non-IID Data (2022) (0)
- Artificial Intelligence Based Cloud Distributor (AI-CD): Probing Low Cloud Distribution with Generative Adversarial Neural Networks (2019) (0)
- Artificial Intelligence Pioneers But making those quantum leaps from science fiction to reality required hard work from computer scientists like (0)
- Pruning for efficient hardware implementations of deep neural networks (2020) (0)
- Object-Centric Causal Representation Learning (2022) (0)
- Robust and Controllable Object-Centric Learning through Energy-based Models (2022) (0)
- Latent State Marginalization as a Low-cost Approach for Improving Exploration (2022) (0)
- Posterior samples of source galaxies in strong gravitational lenses with score-based priors (2022) (0)
- Information Fusion in Deep Convolutional Neural Networks for Biomedical Image Segmentation (2018) (0)
- Estimating the probability of a fleet vehicle accident: A deep learning approach using Conditional Variational Auto-Encoders (2020) (0)
- UOUS AND DISCRETE ADDRESSING SCHEMES (2016) (0)
- Neural Production Systems: Learning Rule-Governed Visual Dynamics (2021) (0)
- BabyAI 1.1 (2020) (0)
- Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning (2020) (0)
- Machine Learning (2021) (0)
- Contrastive introspection (ConSpec) to rapidly identify invariant prototypes for success in RL (2022) (0)
- Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes (2022) (0)
- A General Purpose Neural Architecture for Geospatial Systems (2022) (0)
- Estimation de densité conditionnelle lorsque l'hypothèse de normalité est insatisfaisante (2004) (0)
- Proposed Architectural and Representational Modifications (2021) (0)
- Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (0)
- Evaluating Long-Term Dependency Benchmark Problems by Random Guessing (2001) (0)
- The Challenge of Non-Linear Regression on Large Datasets with Asymmetric Heavy Tails (2002) (0)
- Extending Metric-Based Model Selection and Regularization in the Absence of Unlabeled Data (0)
- Learning powerful policies and better dynamics models by encouraging consistency (2018) (0)
- Estimating Car Insurance Premia: a Case Study in High-Dimensional (2013) (0)
- FitNets: Hints for Thin Deep Nets (2015) (0)
- Scanning the Issue: Big Data: Theoretical Aspects (2015) (0)
- Stacked calibration of off-policy policy evaluation for video game matchmaking (2013) (0)
- Synergies Between Disentanglement and Sparsity: a Multi-Task Learning Perspective (2022) (0)
- Depth with nonlinearity creates no bad local minima in ResNets (2019) (0)
- IAPR keynote lecture IV: Deep learning (2015) (0)
- Image-to-image Mapping with Many Domains by Sparse Attribute Transfer (2020) (0)
- SGD Smooths the Sharpest Directions (2018) (0)
- Large-Scale Algorithms (0)
- Learning of Sophisticated Curriculums by viewing them as Graphs over Tasks (2018) (0)
- Former NASA chief unveils $100 million neural chip maker KnuEdge (2016) (0)
- {COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (0)
- How do We Train Deep Architectures? (2009) (0)
- Optimization of Artificial Neural Network Hyperparameters For Processing Retrospective Information (2021) (0)
- Problèmes associés au déploiement des modèles fondés sur l’apprentissage machine en santé (2021) (0)
- Part I Feature Extraction Fundamentals 11 Ensembles of Regularized Least Squares Classifiers for High-dimensional Problems 15 Tree-based Ensembles with Dynamic Soft Feature Selection 18 Bayesian Support Vector Machines for Feature Ranking and Selection 21 Feature Selection via Sensitivity Analysis w (0)
- An Energy-Based Recurrent Neural Network for Multiple Fundamental Frequency Estimation (2011) (0)
- The K Best-Paths Approach to Approximate Dynamic Programming with Application to Portfolio Optimization (2006) (0)
- Learning Neural Generative Dynamics for Molecular Conformation Generation (2021) (0)
- Recurrent Neural Networks for Adaptive Temporal Processing (1993) (0)
- Small-GAN: Speeding Up GAN Training Using Core-Sets (2019) (0)
- Proceedings of the 22nd International Conference on Neural Information Processing Systems (2009) (0)
- A survey on recent activation functions with emphasis on oscillating activation functions (2022) (0)
- Conditioning and time representation in long short-term memory networks (2013) (0)
- FL Games: A federated learning framework for distribution shifts (2022) (0)
- LATTER MINIMA WITH SGD (2018) (0)
- Generalization to a zero-data task: an empirical study (0)
- Learning Generative Models with Locally Disentangled Latent Factors (2018) (0)
- Reassuring and Troubling Views on Graph-Based Semi-Supervised Learning (2005) (0)
- Collaborative filtering techniques for drug discovery (2016) (0)
- The representational geometry of word meanings acquired by neural machine translation models (2017) (0)
- Generalization (2020) (0)
- Meta Attention Networks: Meta Learning Attention To Modulate Information Between Sparsely Interacting Recurrent Modules (2020) (0)
- Combating False Negatives in Adversarial Imitation Learning (Student Abstract) (2020) (0)
- Multi-Domain Balanced Sampling Improves Out-of- Generalization of Chest X-ray Pathology Prediction Models (2021) (0)
- Marathi Handwritten Numeral Recognition using Zernike Moments and Fourier Descriptors (2020) (0)
- Equivariance with Learned Canonicalization Functions (2022) (0)
- Deep Meditations: Controlled navigation of latent space (2018) (0)
- On the Optimization of a Synaptic Learning Rule (1997) (0)
- Model Selection for Small Sample (2000) (0)
- Generalization of a Parametric Learning Rule (1993) (0)
- Supplemental Material for: Deep Generative Stochastic Networks Trainable by Backprop (2014) (0)
- Graph Priors for Deep Neural Networks (2018) (0)
- Aprendizaje profundo. Tras años de decepciones, la inteligencia artificial está empezando a cumplir lo que prometía en sus comienzos gracias a esta potente técnica (2016) (0)
- On learning distributed representations of semantics (2011) (0)
- Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One (2022) (0)
- Forecasting Non-Stationary Volatility with Hyper-Parameters (2002) (0)
- BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers (2018) (0)
- Pen-based visitor registration system (PENGUIN) (1994) (0)
- PhAST: Physics-Aware, Scalable, and Task-specific GNNs for Accelerated Catalyst Design (2022) (0)
- Towards more hardware-friendly deep learning (2017) (0)
- WARDS BETTER OPTIMIZATION (2019) (0)
- Artificial Intelligence Cytometer in Blood (2019) (0)
- The AI Driving Olympics at NIPS 2018 (0)
- MIREX Tagging Contest: A Deep Neural Net Approach (Draft) (2008) (0)
- Learning Latent Multiscale Structure Using Recurrent Neural Networks (2016) (0)
- Proposed Algorithm : Algorithm (2007) (0)
- The Curse of Dimensionality for Classical Non-Parametric Models (0)
- On summarized validation curves and generalization (2019) (0)
- »Deep Learning ist keine Religion« (2018) (0)
- Les données au service du savoir (2017) (0)
- A comparative study on hybrid acoustic phonetic decoders based on artificial neural networks (1991) (0)
- On the Use of an Ear Model and Multi-Layered Networks for Automatic Speech Recognition (1990) (0)
- Markovian Models for Sequential Data (1996) (0)
- PAST DSAA KEYNOTE SPEAKERS (2020) (0)
- Contrastive introspection (ConSpec) to rapidly identify invariant steps for success (2022) (0)
- TRANSFER REINFORCEMENT LEARNING (2018) (0)
- Stochastic Gradient Descent on a Portfolio Management Training Criterion Using the IPA Gradient Estimator (2003) (0)
- SPECTRA: Sparse Entity-centric Transitions (2019) (0)
- Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection (2022) (0)
- Evaluating Generalization in GFlowNets for Molecule Design (2022) (0)
- Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning (2022) (0)
- Inductive Biases for Relational Tasks (2022) (0)
- Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs (2020) (0)
- Leveraging the Third Dimension in Contrastive Learning (2022) (0)
- Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2019) (0)
- A semantic matching energy function for learning with multi-relational data (2013) (0)
- Bayesian Structure Learning with Generative Flow Networks (Supplementary material) (2022) (0)
- Repérage de mots informatifs dans les textes conversationnels (2004) (0)
- Interventional Causal Representation Learning (2022) (0)
- Multi-Objective GFlowNets (2022) (0)
- GFlowOut: Dropout with Generative Flow Networks (2022) (0)
- Pylearn2: a machine learning research library (2014) (0)
- Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation (2020) (0)
- VIM: Variational Independent Modules for Video Prediction (2022) (0)
- (Private)-Retroactive Carbon Pricing [(P)ReCaP]: A Market-based Approach for Climate Finance and Risk Assessment (2022) (0)
- CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation (2020) (0)
- MAgNet: Mesh Agnostic Neural PDE Solver (2022) (0)
- FigureQA: An Annotated Figure Dataset for Visual Reasoning (2018) (0)
- Object-Centric Compositional Imagination for Visual Abstract Reasoning (2022) (0)
- EnGAN: Latent Space MCMC and Maximum Entropy Generators for Energy-based Models (2018) (0)
- Exploring the Wasserstein metric for survival analysis (2021) (0)
- Neural Attentive Circuits (2022) (0)
- Balancing Signals for Semi-Supervised Sequence Learning (2020) (0)
- Consistent Training via Energy-Based GFlowNets for Modeling Discrete Joint Distributions (2022) (0)
- Episodes Meta Sequence S 2 Fast Update Slow Update Fast Update Slow Update (2021) (0)
- Bayesian Dynamic Causal Discovery (2022) (0)
- Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints (2022) (0)
- RNNLOGIC: LEARNING LOGIC RULES FOR REASON- (2020) (0)
- Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization (2022) (0)
- Proceedings of the 21st International Conference on Neural Information Processing Systems (2008) (0)
- On Neural Architecture Inductive Biases for Relational Tasks (2022) (0)
- Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning (2022) (0)
