Why Is Yoshua Bengio Influential?
According to Wikipedia, Yoshua Bengio is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de Montréal and scientific director of the Montreal Institute for Learning Algorithms.
Yoshua Bengio's Published Works
Number of citations in a given year to any of this author's works
Total number of citations to the author's works published in a given year. This highlights the years in which the author's most important works were published.
1990 2000 2010 2020 0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000 55000 60000 65000 70000 75000 Published Papers Gradient-based learning applied to document recognition (1998) (33368)Generative Adversarial Nets (2014) (27159)Deep Learning (2015) (27040)Neural Machine Translation by Jointly Learning to Align and Translate (2015) (18167)Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014) (14069)Understanding the difficulty of training deep feedforward neural networks (2010) (11707)Representation Learning: A Review and New Perspectives (2013) (8271)Learning Deep Architectures for AI (2007) (7376)Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (2015) (6882)Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (2014) (6853)Learning long-term dependencies with gradient descent is difficult (1994) (5898)A Neural Probabilistic Language Model (2000) (5775)Deep Sparse Rectifier Neural Networks (2011) (5564)How transferable are features in deep neural networks? 
(2014) (5448)Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion (2010) (5412)Random Search for Hyper-Parameter Optimization (2012) (5285)Extracting and composing robust features with denoising autoencoders (2008) (5200)Graph Attention Networks (2018) (4626)Pattern Recognition and Neural Networks (1995) (3890)Convolutional networks for images, speech, and time series (1998) (3857)On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (2014) (3815)On the difficulty of training recurrent neural networks (2013) (3592)Greedy Layer-Wise Training of Deep Networks (2006) (3204)Curriculum learning (2009) (2971)Algorithms for Hyper-Parameter Optimization (2011) (2299)Word Representations: A Simple and General Method for Semi-Supervised Learning (2010) (2115)Theano: A Python framework for fast computation of mathematical expressions (2016) (2039)BinaryConnect: Training Deep Neural Networks with binary weights during propagations (2015) (1996)Brain tumor segmentation with Deep Neural Networks (2017) (1867)FitNets: Hints for Thin Deep Nets (2015) (1803)Attention-Based Models for Speech Recognition (2015) (1782)Maxout Networks (2013) (1766)Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (1629)Why Does Unsupervised Pre-training Help Deep Learning? 
(2010) (1607)Practical Recommendations for Gradient-Based Training of Deep Architectures (2012) (1596)Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach (2011) (1556)Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies (2001) (1489)A Structured Self-attentive Sentence Embedding (2017) (1388)Theano: new features and speed improvements (2012) (1368)Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) (1351)Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (2016) (1344)Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (2011) (1210)Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations (2017) (1197)Binarized Neural Networks (2016) (1180)Semi-supervised Learning by Entropy Minimization (2004) (1150)The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation (2017) (1128)Scaling learning algorithms towards AI (2007) (1095)Learning deep representations by mutual information estimation and maximization (2019) (1092)NICE: Non-linear Independent Components Estimation (2015) (1059)Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering (2003) (1057)Deep Learning of Representations for Unsupervised and Transfer Learning (2012) (1049)Identifying and attacking the saddle point problem in high-dimensional non-convex optimization (2014) (1028)Exploring Strategies for Training Deep Neural Networks (2009) (1006)An empirical evaluation of deep architectures on problems with many factors of variation (2007) (948)Visualizing Higher-Layer Features of a Deep Network (2009) (931)Hierarchical Probabilistic Neural Network Language Model (2005) (921)On the Number of Linear Regions of Deep Neural Networks (2014) (879)End-to-end attention-based large vocabulary speech recognition (2016) (863)Inference for the Generalization Error 
(2004) (837)On Using Very Large Target Vocabulary for Neural Machine Translation (2015) (832)A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues (2017) (823)Theano: A CPU and GPU Math Compiler in Python (2010) (818)A Recurrent Latent Variable Model for Sequential Data (2015) (818)Classification using discriminative restricted Boltzmann machines (2008) (796)How to Construct Deep Recurrent Neural Networks (2014) (772)No Unbiased Estimator of the Variance of K-Fold Cross-Validation (2003) (740)A Closer Look at Memorization in Deep Networks (2017) (736)Learning Structured Embeddings of Knowledge Bases (2011) (730)Object Recognition with Gradient-Based Learning (1999) (683)Challenges in representation learning: A report on three machine learning contests (2013) (657)Representational Power of Restricted Boltzmann Machines and Deep Belief Networks (2008) (655)An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks (2014) (648)Gated Feedback Recurrent Neural Networks (2015) (622)HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (2018) (621)Deep Graph Infomax (2019) (620)Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription (2012) (588)Neural Probabilistic Language Models (2006) (545)BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (532)Deep Learning of Representations: Looking Forward (2013) (519)A semantic matching energy function for learning with multi-relational data (2013) (497)Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space (2017) (494)Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives (2012) (473)Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding (2015) (472)Unitary Evolution Recurrent Neural Networks (2016) (468)Multi-Way, Multilingual Neural Machine Translation 
with a Shared Attention Mechanism (2016) (457)Convergence Properties of the K-Means Algorithms (1994) (449)Training deep neural networks with low precision multiplications (2014) (443)Mutual Information Neural Estimation (2018) (441)An Actor-Critic Algorithm for Sequence Prediction (2017) (441)Understanding the exploding gradient problem (2012) (431)Advances in optimizing recurrent networks (2013) (429)Pointing the Unknown Words (2016) (423)Sharp Minima Can Generalize For Deep Nets (2017) (418)Hierarchical Multiscale Recurrent Neural Networks (2017) (417)On Using Monolingual Corpora in Neural Machine Translation (2015) (414)SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (2017) (411)A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion (2015) (411)Generalized Denoising Auto-Encoders as Generative Models (2013) (401)A Parallel Mixture of SVMs for Very Large Scale Problems (2002) (391)What regularized auto-encoders learn from the data-generating distribution (2014) (384)The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training (2009) (383)Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding (2013) (380)Mode Regularized Generative Adversarial Networks (2017) (378)Gradient-Based Optimization of Hyperparameters (2000) (369)Manifold Mixup: Better Representations by Interpolating Hidden States (2019) (366)Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon (2021) (365)Professor Forcing: A New Algorithm for Training Recurrent Networks (2016) (364)End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results (2014) (359)An Input Output HMM Architecture (1994) (354)BilBOWA: Fast Bilingual Distributed Representations without Word Alignments (2015) (352)Deep Generative Stochastic Networks Trainable by Backprop (2014) (342)Char2Wav: End-to-End Speech Synthesis (2017) (339)A deep 
learning framework for neuroscience (2019) (339)Input-output HMMs for sequence processing (1996) (338)Zero-data Learning of New Tasks (2008) (333)Understanding intermediate layers using linear classifier probes (2017) (331)Deep Complex Networks (2018) (326)Hierarchical Recurrent Neural Networks for Long-Term Dependencies (1995) (324)Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks (2015) (323)Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing (2012) (318)Kernel Matching Pursuit (2004) (318)EmoNets: Multimodal deep learning approaches for emotion recognition in video (2015) (310)Learning Eigenfunctions Links Spectral Embedding and Kernel PCA (2004) (304)Combining modality specific deep neural networks for emotion recognition in video (2013) (302)A Character-level Decoder without Explicit Segmentation for Neural Machine Translation (2016) (297)MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis (2019) (292)Speaker Recognition from Raw Waveform with SincNet (2018) (287)Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks (2016) (282)Markovian Models for Sequential Data (2004) (281)Bayesian Model-Agnostic Meta-Learning (2018) (280)High quality document image compression with "DjVu" (1998) (276)Incorporating Second-Order Functional Knowledge for Better Option Pricing (2000) (276)Shallow vs. 
Deep Sum-Product Networks (2011) (272)Boosting Neural Networks (2000) (267)Neural Networks with Few Multiplications (2016) (265)Generalization in Deep Learning (2017) (264)Global optimization of a neural network-hidden Markov model hybrid (1992) (263)On the Spectral Bias of Neural Networks (2019) (263)Benchmarking Graph Neural Networks (2020) (262)Tackling Climate Change with Machine Learning (2019) (261)Revisiting Natural Gradient for Deep Networks (2014) (260)Learning Algorithms for the Classification Restricted Boltzmann Machine (2012) (259)Three Factors Influencing Minima in SGD (2017) (258)Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations (2017) (258)Interpolation Consistency Training for Semi-Supervised Learning (2019) (256)Towards Biologically Plausible Deep Learning (2015) (256)Better Mixing via Deep Representations (2013) (256)Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses (2017) (255)Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (2018) (253)The Manifold Tangent Classifier (2011) (247)Theano: Deep Learning on GPUs with Python (2012) (232)Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation (2017) (231)Equilibrated adaptive learning rates for non-convex optimization (2015) (226)Learning a synaptic learning rule (1991) (225)On the Expressive Power of Deep Architectures (2011) (225)RMSProp and equilibrated adaptive learning rates for non-convex optimization. 
(2015) (224)Higher Order Contractive Auto-Encoder (2011) (222)K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms (2001) (215)Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus (2016) (213)ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks (2015) (213)Difference Target Propagation (2015) (212)Learning deep physiological models of affect (2013) (210)On the Optimization of a Synaptic Learning Rule (2007) (207)Unsupervised and Transfer Learning Challenge: a Deep Learning Approach (2012) (206)MetaGAN: An Adversarial Approach to Few-Shot Learning (2018) (205)Justifying and Generalizing Contrastive Divergence (2009) (203)Drawing and Recognizing Chinese Characters with Recurrent Neural Network (2018) (202)Efficient Non-Parametric Function Induction in Semi-Supervised Learning (2005) (201)The Curse of Highly Variable Functions for Local Kernel Machines (2005) (195)Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark (2017) (194)Noisy Activation Functions (2016) (190)The problem of learning long-term dependencies in recurrent networks (1993) (190)Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model (2008) (184)ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation (2016) (176)Disentangling Factors of Variation for Facial Expression Recognition (2012) (174)Maximum-Likelihood Augmented Discrete Generative Adversarial Networks (2017) (173)Topmoumoute Online Natural Gradient Algorithm (2007) (172)A Deep Reinforcement Learning Chatbot (2017) (171)Challenges in Representation Learning: A Report on Three Machine Learning Contests (2013) (171)Batch normalized recurrent neural networks (2016) (170)A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms (2020) (167)Measuring the tendency of CNNs to Learn Surface Statistical Regularities (2017) (166)Blocks and Fuel: 
Frameworks for deep learning (2015) (165)Gradient based sample selection for online continual learning (2019) (163)Convex Neural Networks (2005) (163)Learning normalized inputs for iterative estimation in medical image segmentation (2018) (162)Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation (2017) (162)Low precision arithmetic for deep learning (2015) (161)Improving Generative Adversarial Networks with Denoising Feature Matching (2017) (160)GMNN: Graph Markov Neural Networks (2019) (160)On the number of response regions of deep feed forward networks with piece-wise linear activations (2013) (155)Hierarchical Neural Network Generative Models for Movie Dialogues (2015) (154)Image-to-image translation for cross-domain disentanglement (2018) (154)Dendritic cortical microcircuits approximate the backpropagation algorithm (2018) (153)LeRec: A NN/HMM Hybrid for On-Line Handwriting Recognition (1995) (150)An Empirical Study of Example Forgetting during Deep Neural Network Learning (2019) (147)Neural networks for speech and sequence recognition (1996) (146)N-BEATS: Neural basis expansion analysis for interpretable time series forecasting (2020) (145)Deep Belief Networks Are Compact Universal Approximators (2010) (143)Manifold Parzen Windows (2002) (142)Toward Causal Representation Learning (2021) (140)Knowledge Matters: Importance of Prior Information for Optimization (2016) (138)Speech Model Pre-training for End-to-End Spoken Language Understanding (2019) (137)Audio Chord Recognition with Recurrent Neural Networks (2013) (135)Light Gated Recurrent Units for Speech Recognition (2018) (135)Artificial Neural Networks Applied to Taxi Destination Prediction (2015) (134)Deep Learning for NLP (without Magic) (2012) (134)The Pytorch-kaldi Speech Recognition Toolkit (2019) (132)Montreal Neural Machine Translation Systems for WMT’15 (2015) (131)HeMIS: Hetero-Modal Image Segmentation (2016) (130)On Multiplicative Integration with Recurrent 
Neural Networks (2016) (129)Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews (2015) (128)Towards End-to-end Spoken Language Understanding (2018) (127)Z-Forcing: Training Stochastic Recurrent Networks (2017) (127)The Consciousness Prior (2017) (126)Recurrent Independent Mechanisms (2021) (125)Architectural Complexity Measures of Recurrent Neural Networks (2016) (125)Multi-Prediction Deep Boltzmann Machines (2013) (124)How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation (2014) (124)Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks (2019) (123)Boundary-Seeking Generative Adversarial Networks (2017) (123)Predicting COVID-19 Pneumonia Severity on Chest X-ray With Deep Learning (2020) (121)Learning to Understand Phrases by Embedding the Dictionary (2016) (120)Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks (1999) (119)Reweighted Wake-Sleep (2015) (119)Unsupervised State Representation Learning in Atari (2019) (117)Marginalized Denoising Auto-encoders for Nonlinear Representations (2014) (117)Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks (2014) (116)Global training of document processing systems using graph transformer networks (1997) (115)Variance Reduction in SGD by Distributed Importance Sampling (2015) (115)Deep Learners Benefit More from Out-of-Distribution Examples (2011) (114)Experience Grounds Language (2020) (113)Denoising Criterion for Variational Auto-Encoding Framework (2017) (113)Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines (2010) (111)Model Selection for Small Sample Regression (2004) (106)Deep Directed Generative Models with Energy-Based Probability Estimation (2016) (106)Multi-Task Self-Supervised Learning for Robust Speech Recognition (2020) (106)Iterative Alternating Neural Attention for Machine Reading (2016) (105)Label Propagation and Quadratic 
Criterion (2006) (103)A Neural Knowledge Language Model (2016) (103)Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio (2011) (102)Fine-grained attention mechanism for neural machine translation (2018) (101)Neural net language models (2008) (100)InfoBot: Transfer and Exploration via the Information Bottleneck (2019) (100)FigureQA: An Annotated Figure Dataset for Visual Reasoning (2018) (99)Count-ception: Counting by Fully Convolutional Redundant Counting (2017) (99)On the number of inference regions of deep feed forward networks with piece-wise linear activations (2014) (97)Understanding Representations Learned in Deep Architectures (2010) (97)Collaborative Filtering on a Family of Biological Targets (2006) (96)Non-Local Manifold Tangent Learning (2004) (93)Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims (2020) (92)Large-Scale Feature Learning With Spike-and-Slab Sparse Coding (2012) (91)Gated Orthogonal Recurrent Units: On Learning to Forget (2019) (90)Taking on the curse of dimensionality in joint distributions using neural networks (2000) (88)A Spike and Slab Restricted Boltzmann Machine (2011) (86)Globally Trained Handwritten Word Recognizer Using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models (1993) (84)BPS: a learning algorithm for capturing the dynamic nature of speech (1989) (83)An empirical analysis of dropout in piecewise linear networks (2014) (82)Disentangling Factors of Variation via Generative Entangling (2012) (81)Multi-Task Learning for Stock Selection (1996) (81)Quickly Generating Representative Samples from an RBM-Derived Process (2011) (80)On the saddle point problem for non-convex optimization (2014) (80)BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning (2019) (79)On integrating a language model into neural machine translation (2017) (78)End-to-End Online Writer Identification With Recurrent Neural Network 
(2017) (78)Deep Learning for Patient-Specific Kidney Graft Survival Analysis (2017) (78)Unsupervised Models of Images by Spikeand-Slab RBMs (2011) (78)Recurrent Neural Networks for Missing or Asynchronous Data (1995) (77)Deconstructing the Ladder Network Architecture (2016) (77)Modeling term dependencies with quantum language models for IR (2013) (76)A hybrid Pareto model for asymmetric fat-tailed data: the univariate case (2009) (75)Feature-wise transformations (2018) (74)The Curse of Dimensionality for Local Kernel Machines (2005) (74)Parallel Tempering for Training of Restricted Boltzmann Machines (2010) (72)A Generative Process for sampling Contractive Auto-Encoders (2012) (72)Spectral Clustering and Kernel PCA are Learning Eigenfunctions (2003) (72)Multi-way, multilingual neural machine translation (2017) (68)Word-level training of a handwritten word recognizer based on convolutional neural networks (1994) (68)ChatPainter: Improving Text to Image Generation using Dialogue (2018) (68)Deep convolutional networks for quality assessment of protein folds (2018) (68)Estimating or Propagating Gradients Through Stochastic Neurons (2013) (68)Residual Connections Encourage Iterative Inference (2018) (68)Slow, Decorrelated Features for Pretraining Complex Cell-like Networks (2009) (67)Do Neural Dialog Systems Use the Conversation History Effectively? 
An Empirical Study (2019) (67)The need for privacy with public digital contact tracing during the COVID-19 pandemic (2020) (66)Deep Learning of Representations (2013) (66)Greedy Spectral Embedding (2005) (66)Training Methods for Adaptive Boosting of Neural Networks (1997) (65)Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer (2018) (65)Big Neural Networks Waste Capacity (2013) (64)Recurrent Neural Networks With Limited Numerical Precision (2016) (63)Deep learning for AI (2021) (63)Context-dependent word representation for neural machine translation (2017) (62)Independently Controllable Factors (2017) (62)Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation (2014) (62)Adding noise to the input of a model trained with a regularized objective (2011) (62)Learning to Compute Word Embeddings On the Fly (2017) (62)Inductive Biases for Deep Learning of Higher-Level Cognition (2020) (62)A Walk with SGD (2018) (62)Straight to the Tree: Constituency Parsing with Neural Syntactic Distance (2018) (62)Learning Independent Features with Adversarial Nets for Non-linear ICA (2018) (62)Maximum Entropy Generators for Energy-Based Models (2019) (61)DECISION TREES DO NOT GENERALIZE TO NEW VARIATIONS (2010) (60)Quaternion Recurrent Neural Networks (2019) (60)Entropy Regularization (2006) (60)Hierarchical Memory Networks (2016) (59)Using a Financial Training Criterion Rather than a Prediction Criterion (1997) (59)Embedding Word Similarity with Neural Machine Translation (2015) (59)Wasserstein Dependency Measure for Representation Learning (2019) (59)Learning Neural Causal Models from Unknown Interventions (2019) (59)Generative adversarial networks (2020) (58)High-dimensional sequence transduction (2013) (58)Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery (2012) (58)ObamaNet: Photo-realistic lip-sync from text (2018) (58)Interpretable Convolutional Filters with SincNet (2018) (58)STDP-Compatible 
Approximation of Backpropagation in an Energy-Based Model (2017) (57)Combined Reinforcement Learning via Abstract Representations (2019) (57)Sparse Attentive Backtracking: Temporal CreditAssignment Through Reminding (2018) (57)Non-Local Manifold Parzen Windows (2005) (57)Learning Speaker Representations with Mutual Information (2019) (56)STDP as presynaptic activity times rate of change of postsynaptic activity (2015) (55)On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length (2019) (55)Independently Controllable Features (2017) (55)The Z-coder adaptive binary coder (1998) (54)Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition (2018) (54)Invariant Representations for Noisy Speech Recognition (2016) (54)Artificial neural networks and their application to sequence recognition (1991) (53)Continuous Neural Networks (2007) (53)Use machine learning to find energy materials (2017) (53)Use machine learning to find energy materials. (2017) (52)Reading checks with multilayer graph transformer networks (1997) (52)Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online (2012) (51)Cost functions and model combination for VaR-based asset allocation using neural networks (2001) (51)Incorporating Functional Knowledge in Neural Networks (2009) (51)Credit Assignment through Time: Alternatives to Backpropagation (1993) (51)Use of genetic programming for the search of a new learning rule for neural networks (1994) (51)Depth with Nonlinearity Creates No Bad Local Minima in ResNets (2019) (50)Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes (2016) (50)Spectral Dimensionality Reduction (2006) (50)Extensions to Metric-Based Model Selection (2003) (50)Toward Training Recurrent Neural Networks for Lifelong Learning (2020) (49)A Connectionist Approach to Speech Recognition (1993) (49)Memory Augmented Neural Networks with Wormhole Connections (2017) (48)BigBrain 3D atlas of cortical layers: Cortical and 
laminar thickness gradients diverge in sensory and motor cortices (2020) (48)Diffusion of Context and Credit Information in Markovian Models (1995) (48)Hyperbolic Discounting and Learning over Multiple Horizons (2019) (48)On the Spectral Bias of Deep Neural Networks (2018) (48)Compositional generalization in a deep seq2seq model by separating syntax and semantics (2019) (48)Learning Anonymized Representations with Adversarial Neural Networks (2018) (48)Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling (2020) (47)Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition (2017) (47)Low precision storage for deep learning (2014) (47)Recall Traces: Backtracking Models for Efficient Reinforcement Learning (2019) (46)Not All Neural Embeddings are Born Equal (2014) (46)Learning the dynamic nature of speech with back-propagation for sequences (1992) (45)The representational geometry of word meanings acquired by neural machine translation models (2017) (45)Revisiting Fundamentals of Experience Replay (2020) (45)Iterative Neural Autoregressive Distribution Estimator NADE-k (2014) (44)Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization (2014) (44)Selective small molecule peptidomimetic ligands of TrkC and TrkA receptors afford discrete or complete neurotrophic activities. 
(2005) (44)Disentangling the independently controllable factors of variation by interacting with the world (2018) (44)Experiments on the application of IOHMMs to model financial returns series (2001) (44)Dendritic error backpropagation in deep cortical microcircuits (2018) (44)GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning (2019) (43)Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy (2019) (43)AdaBoosting Neural Networks: Application to on-line Character Recognition (1997) (43)Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction (2019) (43)Nonlocal Estimation of Manifold Structure (2006) (42)Twin Networks: Matching the Future for Sequence Generation (2018) (42)On the Challenges of Physical Implementations of RBMs (2014) (41)Large-Scale Learning of Embeddings with Reconstruction Sampling (2011) (41)Word normalization for on-line handwritten word recognition (1994) (41)Diet Networks: Thin Parameters for Fat Genomics (2017) (41)Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning (2020) (41)CLOSURE: Assessing Systematic Generalization of CLEVR Models (2019) (40)Learning Tags that Vary Within a Song (2010) (39)Improving Speech Recognition by Revising Gated Recurrent Units (2017) (39)Introduction to the special issue on neural networks for data mining and knowledge discovery (2000) (39)Towards a Biologically Plausible Backprop (2016) (38)Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net (2017) (38)Bias in Estimating the Variance of K-Fold Cross-Validation (2005) (38)Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations (2018) (37)A network of deep neural networks for Distant Speech Recognition (2017) (37)11 Label Propagation and Quadratic Criterion (37)Globally trained handwritten word recognizer using 
spatial representation, space displacement neural networks and hidden Markov models (1993) (36)Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics (2019) (35)GSNs : Generative Stochastic Networks (2015) (35)CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning (2021) (35)On Adversarial Mixup Resynthesis (2019) (35)Task Loss Estimation for Sequence Prediction (2015) (34)Adversarial Domain Adaptation for Stable Brain-Machine Interfaces (2019) (34)Torchmeta: A Meta-Learning library for PyTorch (2019) (34)Bias learning, knowledge sharing (2003) (34)Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask (2018) (34)Early Inference in Energy-Based Models Approximates Back-Propagation (2015) (34)ReSeg: A Recurrent Neural Network for Object Segmentation (2015) (33)On the Expressive Power of Deep Architectures (2011) (33)Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization (2019) (33)DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation (2018) (33)Quadratic Features and Deep Architectures for Chunking (2009) (32)Mollifying Networks (2017) (32)Gradient Starvation: A Learning Proclivity in Neural Networks (2020) (32)Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes (2018) (32)Adaptive Parallel Tempering for Stochastic Maximum Likelihood Learning of RBMs (2010) (31)Contextual tag inference (2011) (31)Learning Eigenfunctions of Similarity: Linking Spectral Clustering and Kernel PCA (2003) (31)Scaling Up Spike-and-Slab Models for Unsupervised Feature Learning (2013) (31)An EM Algorithm for Asynchronous Input/Output Hidden Markov Models (1996) (31)Evolving Culture Versus Local Minima (2014) (30)Quick Training of Probabilistic Neural Nets by Importance Sampling (2003) (30)Evolving 
Culture vs Local Minima (2012) (30)Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models (2004) (30)Batch-normalized joint training for DNN-based distant speech recognition (2016) (30)Adaptive Drift-Diffusion Process to Learn Time Intervals (2011) (30)Scaling Large Learning Problems with Hard Parallel Mixtures (2003) (29)Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines (2013) (29)Online continual learning with no task boundaries (2019) (29)Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models (2021) (29)Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions (2013) (29)Universal Successor Representations for Transfer Reinforcement Learning (2018) (28)Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies (2019) (28)SpeechBrain: A General-Purpose Speech Toolkit (2021) (27)On the interplay between noise and curvature and its effect on optimization and generalization (2020) (27)Probabilistic Planning with Sequential Monte Carlo methods (2019) (27)Joint Training of Deep Boltzmann Machines (2012) (27)Editorial introduction to the Neural Networks special issue on Deep Learning of Representations (2015) (27)Building Musically-relevant Audio Features through Multiple Timescale Representations (2012) (27)Towards Gene Expression Convolutions using Gene Interaction Graphs (2018) (26)Programmable execution of multi-layered networks for automatic speech recognition (1989) (26)Representation Mixing for TTS Synthesis (2019) (26)Fraternal Dropout (2018) (26)Global optimization of a neural network-hidden Markov model hybrid (1991) (26)Discriminative Non-negative Matrix Factorization for Multiple Pitch Estimation (2012) (26)On the Learning Dynamics of Deep Neural Networks (2018) (26)Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules (2020) (25)Statistical Learning Algorithms Applied to Automobile Insurance 
Ratemaking (2003) (25)On the search for new learning rules for ANNs (2005) (25)Inherent privacy limitations of decentralized contact tracing apps (2021) (25)Tractable Multivariate Binary Density Estimation and the Restricted Boltzmann Forest (2010) (25)Regularized Auto-Encoders Estimate Local Statistics (2013) (25)Speech and Speaker Recognition from Raw Waveform with SincNet (2018) (24)Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks (1992) (24)Small-GAN: Speeding Up GAN Training Using Core-sets (2020) (24)Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future (2019) (24)InfoMask: Masked Variational Latent Representation to Localize Chest Disease (2019) (24)The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions (2014) (23)Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning (2014) (23)Brain Inspired Reinforcement Learning (2004) (23)A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions (2009) (23)Finding Flatter Minima with SGD (2018) (23)Alternative time representation in dopamine models (2009) (22)Convolutional neural networks for mesh-based parcellation of the cerebral cortex (2018) (22)Word normalization for online handwritten word recognition (1994) (22)DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning (2020) (22)Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures (2019) (22)Gradient-based Learning Applied to Document Recognition (1998) (22)
Hybrid Models for Learning to Branch (2020) (22)Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference (2001) (22)HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery (2020) (22)Learning from Partial Labels with Minimum Entropy (2004) (22)Joint Training Deep Boltzmann Machines for Classification (2013) (22)A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies (2019) (22)Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives (2020) (21)Manifold Mixup: Learning Better Representations by Interpolating Hidden States (2018) (21)Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems (2020) (21)Coordination Among Neural Modules Through a Shared Global Workspace (2021) (21)Discriminative feature and model design for automatic speech recognition (1997) (21)h-detach: Modifying the LSTM Gradient Towards Better Optimization (2019) (21)Bounding the Test Log-Likelihood of Generative Models (2014) (21)Efficient EM Training of Gaussian Mixtures with Missing Data (2012) (21)On Tracking The Partition Function (2011) (21)Bidirectional Helmholtz Machines (2016) (21)Bias learning, knowledge sharing (2000) (21)Commonsense mining as knowledge base completion? 
A study on the impact of novelty (2018) (20)The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach (2020) (20)Variational Temporal Abstraction (2019) (20)Support vector machines for improving the classification of brain PET images (1998) (20)Diffusion of Credit in Markovian Models (1994) (20)Phonetically-based multi-layered neural networks for vowel classification (1990) (20)Learning the 2-D Topology of Images (2007) (20)Focused Hierarchical RNNs for Conditional Sequence Processing (2018) (20)Generalization of Equilibrium Propagation to Vector Field Dynamics (2018) (19)On Training Recurrent Neural Networks for Lifelong Learning (2018) (19)Meta-learning framework with applications to zero-shot time-series forecasting (2021) (19)Topic Segmentation : A First Stage to Dialog-Based Information Extraction (2001) (19)Robust Regression with Asymmetric Heavy-Tail Noise Distributions (2002) (19)Interactive Language Learning by Question Answering (2019) (19)Locally Linear Embedding for dimensionality reduction in QSAR (2004) (19)MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation (2018) (18)Training End-to-End Analog Neural Networks with Equilibrium Propagation (2020) (18)Improving First and Second-Order Methods by Modeling Uncertainty (2010) (18)On Training Deep Boltzmann Machines (2012) (18)On the challenge of learning complex functions. 
(2007) (18)Systematic generalisation with group invariant predictions (2021) (18)On the Iterative Refinement of Densely Connected Representation Levels for Semantic Segmentation (2018) (18)An EM approach to grammatical inference: input/output HMMs (1994) (18)Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks (2019) (17)Natural Gradient Revisited (2013) (17)Browsing through high quality document images with DjVu (1998) (17)A robust adaptive stochastic gradient method for deep learning (2017) (17)Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information (2021) (17)Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge (1989) (17)Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs (2013) (17)How to Initialize your Network? Robust Initialization for WeightNorm & ResNets (2019) (16)Attention Based Pruning for Shift Networks (2021) (16)Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network (1990) (16)Equivalence of Equilibrium Propagation and Recurrent Backpropagation (2019) (16)RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs (2021) (16)Augmented Functional Time Series Representation and Forecasting with Gaussian Processes (2007) (16)Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning (2018) (16)Unsupervised Learning of Semantics of Object Detections for Scene Categorization (2013) (16)Autotagging music with conditional restricted Boltzmann machines (2011) (16)Deep Self-Taught Learning for Handwritten Character Recognition (2010) (16)A Neural Support Vector Network architecture with adaptive kernels (2000) (16)BigBrain 3D atlas of cortical layers: cortical and laminar thickness gradients diverge in sensory and motor cortices (2019) (15)Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input (2019) 
(15)Diet Networks: Thin Parameters for Fat Genomic (2016) (15)Use of neural networks for the recognition of place of articulation (1988) (15)Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing Its Gradient Estimator Bias (2021) (15)Continuous optimization of hyper-parameters (2000) (15)Generalizable Features From Unsupervised Learning (2017) (15)Variational Bi-LSTMs (2017) (14)Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization (2021) (14)Neural Production Systems (2021) (14)Learning from unexpected events in the neocortical microcircuit (2021) (14)Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible (2016) (14)Saliency is a Possible Red Herring When Diagnosing Poor Generalization (2021) (14)A hybrid coder for hidden Markov models using a recurrent neural networks (1990) (14)The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget (2020) (14)PROC OF THE IEEE NOVEMBER Gradient Based Learning Applied to Document Recognition (2006) (14)Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies (2021) (14)Deep Directed Generative Autoencoders (2014) (14)Learning Causal Models Online (2020) (14)Learning semantic representations of objects and their parts (2013) (14)Rethinking Distributional Matching Based Domain Adaptation (2020) (14)A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition (2019) (14)Object-Centric Image Generation from Layouts (2021) (14)Generalization of a Parametric Learning Rule (1993) (13)Scaling Large Learning Problems with Hard Parallel Mixtures (2002) (13)GraphMix: Improved Training of GNNs for Semi-Supervised Learning (2021) (13)COVI White Paper (2020) (13)DETONATION CLASSIFICATION FROM ACOUSTIC SIGNATURE WITH THE RESTRICTED BOLTZMANN MACHINE (2012) (13)The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All) (2011) (13)Target Propagation (2015) 
(13)HNHN: Hypergraph Networks with Hyperedge Neurons (2020) (13)The Benefits of Over-parameterization at Initialization in Deep ReLU Networks (2019) (13)An EM Approach to Learning Sequential (1994) (13)ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient (2014) (13)Learning invariant features through local space contraction (2011) (13)Information matrices and generalization (2019) (13)Perceptual Generative Autoencoders (2019) (13)An objective function for STDP (2015) (13)Twin Networks: Using the Future as a Regularizer (2017) (12)Twin Regularization for online speech recognition (2018) (12)Modeling the Long Term Future in Model-Based Reinforcement Learning (2019) (12)Input decay: simple and effective soft variable selection (2001) (12)Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks (2017) (12)Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences (2018) (11)Keep Drawing It: Iterative language-based image generation and editing (2018) (11)Learned-norm pooling for deep neural networks (2013) (11)The effects of negative adaptation in Model-Agnostic Meta-Learning (2018) (11)Locally Weighted Full Covariance Gaussian Density Estimation (2004) (11)Scaling up deep learning (2014) (11)Deriving Differential Target Propagation from Iterating Approximate Inverses (2020) (11)GradMask: Reduce Overfitting by Regularizing Saliency (2019) (11)DEUP: Direct Epistemic Uncertainty Prediction (2021) (11)Missing Data with Recurrent Networks Handling Asynchronous or Missing Data with Recurrent Networks (1998) (11)Towards Standardization of Data Licenses: The Montreal Data License (2019) (11)Conditioning and time representation in long short-term memory networks (2013) (11)Equilibrium Propagation with Continual Weight Updates (2020) (11)The First Conversational Intelligence Challenge (2018) (11)Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay (2020) (11)On the Morality of 
Artificial Intelligence (2019) (10)Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders (2012) (10)Plan, Attend, Generate: Planning for Sequence-to-Sequence Models (2017) (10)Extending the Framework of Equilibrium Propagation to General Dynamics (2018) (10)Multimodal Transitions for Generative Stochastic Networks (2014) (10)Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2018) (10)Compositional Generalization by Factorizing Alignment and Translation (2020) (10)Statistical Machine Learning Algorithms for Target Classification from Acoustic Signature (2009) (10)GibbsNet: Iterative Adversarial Inference for Deep Graphical Models (2017) (10)The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis (2019) (10)Transformers with Competitive Ensembles of Independent Mechanisms (2021) (10)Unsupervised and Transfer Learning under Uncertainty - From Object Detections to Scene Categorization (2013) (10)Generalization in Machine Learning via Analytical Learning Theory (2018) (9)Universal Successor Features for Transfer Reinforcement Learning (2020) (9)On Catastrophic Interference in Atari 2600 Games (2020) (9)Machines Who Learn. 
(2016) (9)A Deep Reinforcement Learning Chatbot (Short Version) (2018) (9)Automated segmentation of cortical layers in BigBrain reveals divergent cortical and laminar thickness gradients in sensory and motor cortices (2019) (9)Reinforcement Learning for Sustainable Agriculture (2019) (9)Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio (2018) (9)Image Segmentation by Iterative Inference from Conditional Score Estimation (2017) (9)Boundary Seeking GANs (2018) (8)Deep learning and cultural evolution (2014) (8)RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design (2020) (8)Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation (1991) (8)Reinforced Imitation in Heterogeneous Action Space (2019) (8)Oracle Performance for Visual Captioning (2016) (8)A simple and general method for semi-supervised learning (2010) (8)Dynamic Frame Skipping for Fast Speech Recognition in Recurrent Neural Network Based Acoustic Models (2018) (8)TRAINING A NEURAL NETWORK WITH A FINANCIAL CRITERION RATHER THAN A PREDICTION CRITERION (2007) (8)On the Equivalence between Deep NADE and Generative Stochastic Networks (2014) (8)How to construct deep recurrent neural networks: Proceedings of the Second International Conference on Learning Representations (ICLR 2014) (2014) (8)A3T: Adversarially Augmented Adversarial Training (2018) (8)The Octopus Approach to the Alexa Competition : A Deep Ensemble-based Socialbot (2017) (7)Deep Learning for Automatic Summary Scoring (2012) (7)S2RMs: Spatially Structured Recurrent Modules (2020) (7)How Transferable Are Features in Convolutional Neural Network Acoustic Models across Languages? 
(2019) (7)Modularity Matters: Learning Invariant Relational Reasoning Tasks (2018) (7)Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach (2020) (7)Weakly-supervised Knowledge Graph Alignment with Adversarial Learning (2019) (7)An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming (2021) (7)The Causal-Neural Connection: Expressiveness, Learnability, and Inference (2021) (7)Multiscale sequence modeling with a learned dictionary (2017) (7)Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning (2021) (7)An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism (2009) (7)How does hemispheric specialization contribute to human-defining cognition? (2021) (7)HighRes-net: Multi-Frame Super-Resolution by Recursive Fusion (2019) (7)Iteratively unveiling new regions of interest in Deep Learning models (2018) (7)Non-parametric Regression between Riemannian Manifolds (2009) (7)A Generative Process for Contractive Auto-Encoders (2012) (7)Découpage thématique des conversations : un outil d'aide à l'extraction (2002) (7)Discovering Shared Structure in Manifold Learning (2004) (7)A Dataset of Topic-Oriented Human-to-Chatbot Dialogues (2018) (6)Conditional Computation for Continual Learning (2019) (6)How can deep learning advance computational modeling of sensory information processing? 
(2018) (6)Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy (2019) (6)Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation (2021) (6)An Analysis of the Adaptation Speed of Causal Models (2021) (6)Guest Introduction: Special Issue on New Methods for Model Selection and Model Combination (2004) (6)ACtuAL: Actor-Critic Under Adversarial Learning (2017) (6)Problems in the deployment of machine-learned models in health care (2021) (6)Large-Scale Algorithms (2006) (6)Variance Regularizing Adversarial Learning (2017) (6)BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization (2020) (6)Fast and Slow Learning of Recurrent Independent Mechanisms (2021) (6)Big Data: Theoretical Aspects [Scanning the Issue] (2016) (6)Suitability of V1 Energy Models for Object Classification (2011) (6)Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning (2017) (6)Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks (2020) (6)Discussion of "The Neural Autoregressive Distribution Estimator" (2011) (5)Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition (1988) (5)Binary pseudowavelets and applications to bilevel image processing (1999) (5)On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties (1989) (5)Deep Tempering (2014) (5)COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing (2020) (5)NYU-MILA Neural Machine Translation Systems for WMT’16 (2016) (5)Graph Neural Networks with Learnable Structural and Positional Representations (2021) (5)Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya (2020) (5)Use of Multi-Layered Networks for Coding Speech with Phonetic Features (1988) (5)Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings (2011) (5)On random weights for texture generation in one layer CNNS 
(2017) (5)On-line handwriting recognition with neural networks: Spatial representation versus temporal representation (1993) (5)Learning Powerful Policies by Using Consistent Dynamics Model (2019) (5)Supplementary material for : How transferable are features in deep neural networks ? (2014) (5)Generating Multiscale Amorphous Molecular Structures Using Deep Learning: A Study in 2D. (2020) (5)Equivalence of Equilibrium Propagation and Recurrent Backpropagation (2018) (5)A learning-based algorithm to quickly compute good primal solutions for Stochastic Integer Programs (2020) (5)Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (5)NU-GAN: High resolution neural upsampling with GAN (2020) (4)Underwhelming Generalization Improvements From Controlling Feature Attribution (2019) (4)Multi-Task Learning For Option Pricing (2002) (4)Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks (2020) (4)Trainable performance upper bounds for image and video captioning (2015) (4)Use of multilayer networks for the recognition of phonetic features and phonemes (1989) (4)Establishing an Evaluation Metric to Quantify Climate Change Image Realism (2020) (4)Empirical performance upper bounds for image and video captioning (2015) (4)Shared Context Probabilistic Transducers (1997) (4)Probabilistic neural network models for sequential data (2000) (4)Untangling tradeoffs between recurrence and self-attention in neural networks (2020) (4)On the Morality of Artificial Intelligence [Commentary] (2020) (4)Understanding deep architectures and the effect of unsupervised pre-training (2011) (4)Modeling Natural Image Covariance with a Spike and Slab Restricted Boltzmann Machine (2010) (4)Discrete-Valued Neural Communication (2021) (4)Combating False Negatives in Adversarial Imitation Learning (2021) (3)Reinforced Imitation Learning from Observations (2018) (3)State-Reification Networks: Improving Generalization by Modeling 
the Distribution of Hidden Representations (2019) (3)Forecasting and Trading Commodity Contract Spreads with Gaussian Processes (2007) (3)On Out-of-Sample Statistics for Time-Series (2002) (3)Predicting Infectiousness for Proactive Contact Tracing (2021) (3)Learning the Arrow of Time (2019) (3)Visual Concept Reasoning Networks (2021) (3)The Variational Walkback Algorithm (2016) (3)Systematicity in a Recurrent Neural Network by Factorizing Syntax and Semantics (2020) (3)Ghost Units Yield Biologically Plausible Backprop in Deep Neural Networks (2019) (3)Connectionist Models and their Application to Automatic Speech Recognition (1991) (3)Deep Architectures for Baby AI (2007) (3)Training Bidirectional Helmholtz Machines (2015) (3)Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization (2007) (3)Unsupervised one-to-many image translation (2018) (3)A Hybrid Pareto Model for Asymmetric Fat-Tail Data (2006) (3)Blocks and Fuel (2015) (3)Speech coding with multilayer networks (1989) (3)Valorisation d'Options par Optimisation du Sharpe Ratio (2002) (3)Structured Sparsity Inducing Adaptive Optimizers for Deep Learning (2021) (3)Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (3)Convergence Properties of the K-means Algorithms (1995) (3)Graph-Based Semi-Supervised Learning (2005) (3)A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM (2020) (3)Learning from Learning Machines: Optimisation, Rules, and Social Norms (2020) (3)Learning to rank for censored survival data (2018) (2)Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers (2021) (2)A Walk with SGD: How SGD Explores Regions of Deep Network Loss? 
(2018) (2)hBERT + BiasCorp - Fighting Racism on the Web (2021) (2)Task Loss Estimation for Structured Prediction (2016) (2)An Actor-Critic Algorithm for Structured Prediction (2016) (2)Document Analysis with Transducers (2015) (2)On Out-of-Sample Statistics for Financial Time-Series (2002) (2)Low-memory convolutional neural networks through incremental depth-first processing (2018) (2)Radial Basis Functions for Speech Recognition (1992) (2)The Journey is the Reward: Unsupervised Learning of Influential Trajectories (2019) (2)Training opposing directed models using geometric mean matching (2015) (2)GFlowNet Foundations (2021) (2)Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code (2017) (2)Extracting Hidden Sense Probabilities from Bitexts (2003) (2)Workshop summary: Workshop on learning feature hierarchies (2009) (2)Learning Simple Non Stationarities with Hyper Parameters (1999) (2)From STDP towards Biologically Plausible Deep Learning (2015) (2)Applying Knowledge Transfer for Water Body Segmentation in Peru (2019) (2)Estimators of Variance for K-Fold Cross-Validation (2003) (2)Mastering Rate based Curriculum Learning (2020) (2)Distributed Representation Prediction for Generalization to New Words (2006) (2)Using Artificial Intelligence to Visualize the Impacts of Climate Change (2021) (2)Generalizing to a zero-data task : a computational chemistry case study (2006) (2)Deep Learning. 
Das umfassende Handbuch (2018) (2)Combining Model-based and Model-free RL via Multi-step Control Variates (2018) (2)Sparse Attentive Backtracking : Towards Efficient Credit Assignment In Recurrent Networks (2017) (2)MEMORY-EFFICIENT ADAPTIVE HUFFMAN CODING (1998) (2)SGD Smooths The Sharpest Directions (2018) (2)Variational Causal Networks: Approximate Bayesian Inference over Causal Structures (2021) (2)Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments (2021) (2)USE OF NEURAL NETWORKS FOR THE RECOGNITION OF PLACE (1988) (2)Pattern Recognition (2019) (2)Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition (2020) (2)Régularisation du prix des options : Stacking (2002) (2)A Common GPU n-Dimensional Array for Python and C (2011) (2)On Random Weights for Texture Generation in One Layer Neural Networks (2016) (2)A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (2021) (2)Introduction to NIPS 2017 Competition Track (2018) (1)FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters (1)Avoidance Learning Using Observational Reinforcement Learning (2019) (1)Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks (2020) (1)Cross-Modal Information Maximization for Medical Imaging: CMIM (2020) (1)Deep Learning of Representations: Looking Forward (2013) (1)Convergence Properties of Deep Neural Networks on Separable Data (2018) (1)Approche statistique pour le repérage de mots informatifs dans les textes oraux (2004) (1)Evaluating Long-Term Dependency Benchmark Problems by Random Guessing (2001) (1)Untangling tradeoffs between recurrence and self-attention in artificial neural networks (2020) (1)Apprentissage machine efficace: theorie et pratique (2012) (1)Exploration-Driven Representation Learning in Reinforcement Learning (2021) (1)Extended Semantic Tagging for Entity Extraction 
(1)Statistical Language and Speech Processing (2013) (1)Automated curriculum generation for Policy Gradients from Demonstrations (2019) (1)A Neural Network to Detect Homologies in Proteins (1989) (1)Learning Neural Causal Models with Active Interventions (2021) (1)Étude du biais dans le prix des options (2002) (1)Markovian Models for Sequential (2004) (1)Using Simulated Data to Generate Images of Climate Change (2020) (1)Spatially Structured Recurrent Modules (2021) (1)Stochastic Learning of Strategic Equilibria for Auctions (1999) (1)Generative Flow Networks for Discrete Probabilistic Modeling (2022) (1)Deep Learning for NLP (without Magic) References (2012) (1)Towards the Latent Transcriptome (2018) (1)Incorporating complex cells into neural networks for pattern classification (2011) (1)Mode Regularized Generative Adversarial (2016) (1)Learning Latent Multiscale Structure Using Recurrent Neural Networks (2016) (1)Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning (2021) (1)Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models (2021) (1)Trajectory Balance: Improved Credit Assignment in GFlowNets (2022) (1)Segmentation en thèmes de conversations téléphoniques : traitement en amont pour l’extraction d’information (2002) (1)Chunked Autoregressive GAN for Conditional Waveform Synthesis (2021) (1)Codon arrangement modulates MHC-I peptides presentation (2020) (1)InfoBot: Structured Exploration in ReinforcementLearning Using Information Bottleneck (2019) (1)Sharp Minima Can Generalize For Deep Nets Supplementary Material (2017) (1)Predicting ice flow using machine learning (2019) (1)Dynamic Inference with Neural Interpreters (2021) (1)Metric-based model selection for time-series forecasting (2002) (1)Comment améliorer la capacité de généralisation des algorithmes d'apprentissage pour la prise de décisions financières (2003) (1)COVI White Paper-Version 1.1 (2020) 
(1)Predicting Unreliable Predictions by Shattering a Neural Network (2021) (1)Gaussian Mixtures with Missing Data: an Efficient EM Training Algorithm (1994) (1)Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007 (2008) (1)Gaussian Mixture Densities for Classification of Nuclear Power Plant Data (1998) (1)Unifying Likelihood-free Inference with Black-box Optimization and Beyond (2021) (0)Speech coding with multi-layer networks (1989) (0)Comparative Study of Learning Outcomes for Online Learning Platforms (2021) (0)Deep Learning (2021) (0)Estimating Car Insurance Premia: a Case Study in High-Dimensional (2013) (0)Learning powerful policies and better dynamics models by encouraging consistency (2018) (0)Model Selection for Small Sample (2000) (0)ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods (2021) (0)Compositional Attention: Disentangling Search and Retrieval (2021) (0)Graph Priors for Deep Neural Networks (2018) (0)CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation (2020) (0)SPECTRA: Sparse Entity-centric Transitions (2019) (0)Episodes Meta Sequence S 2 Fast Update Slow Update Fast Update Slow Update (2021) (0)How do We Train Deep Architectures ? 
(2009) (0)Deep Meditations : Controlled navigation of latent space (2018) (0)Generalization to a zero-data task: an empirical study (0)Proceedings of the 21st International Conference on Neural Information Processing Systems (2008) (0)Generalization (2020) (0)An Energy-Based Recurrent Neural Network for Multiple Fundamental Frequency Estimation (2011) (0)Supplemental Material for : Deep Generative Stochastic Networks Trainable by Backprop (2014) (0)Problèmes associés au déploiement des modèles fondés sur l’apprentissage machine en santé (2021) (0)Proceedings of the 22nd International Conference on Neural Information Processing Systems (2009) (0)The Challenge of Non-Linear Regression on Large Datasets with Asymmetric Heavy Tails (2002) (0)Repérage de mots informatifs dans les textes conversationnels (2004) (0)Joint Learning of Generative Translator and Classifier for Visually Similar Classes (2020) (0)»Deep Learning ist keine Religion« (2018) (0)Estimation de densité conditionnelle lorsque l'hypothèse de normalité est insatisfaisante (2004) (0)On summarized validation curves and generalization (2019) (0)Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (0)18 Large-Scale Algorithms (0)Pylearn2: a machine learning research library (2014) (0)Multi-scale Feature Learning Dynamics: Insights for Double Descent (2021) (0)Part I Feature Extraction Fundamentals 11 Ensembles of Regularized Least Squares Classifiers for High-dimensional Problems 15 Tree-based Ensembles with Dynamic Soft Feature Selection 18 Bayesian Support Vector Machines for Feature Ranking and Selection 21 Feature Selection via Sensitivity Analysis w (0)Forecasting Non-Stationary Volatility with Hyper-Parameters (2002) (0)Stacked calibration of off-policy policy evaluation for video game matchmaking (2013) (0)Proposed Algorithm : Algorithm (2007) (0)A Hybrid Pareto Model for Conditional Density Estimation of Asymmetric Fat-Tail Data (2007) (0)Les données 
au service du savoir (2017) (0)Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2019) (0)Extending Metric-Based Model Selection and Regularization in the Absence of Unlabeled Data (0)BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers (2018) (0)SGD Smooths the Sharpest Directions (2018) (0)Learning Neural Generative Dynamics for Molecular Conformation Generation (2021) (0)Towards more hardware-friendly deep learning (2017) (0)The K Best-Paths Approach to Approximate Dynamic Programming with Application to Portfolio Optimization (2006) (0)WARDS BETTER OPTIMIZATION (2019) (0)Marathi Handwritten Numeral Recognition using Zernike Moments and Fourier Descriptors (2020) (0)2 The Curse of Dimensionality for Classical Non-Parametric Models (0)FigureQA: An Annotated Figure Dataset for Visual Reasoning (2018) (0)Image-to-image Mapping with Many Domains by Sparse Attribute Transfer (2020) (0)FitNets: Hints for Thin Deep Nets (2015) (0)The AI Driving Olympics at NIPS 2018 (0)Combating False Negatives in Adversarial Imitation Learning (Student Abstract) (2020) (0)Learning Generative Models with Locally Disentangled Latent Factors (2018) (0)Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation (2020) (0)LATTER M INIMA WITH SGD (2018) (0)Establishing an evaluation metric to quantify climate change image realism (2020) (0)On the Use of an Ear Model and Multi-Layered Networks for Automatic Speech Recognition (1990) (0)Artificial Intelligence Cytometer in Blood (2019) (0)Recurrent Neural Networks for Adaptive Temporal Processing (1993) (0)On learning distributed representations of semantics (2011) (0)CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation (2021) (0)Stochastic Gradient Descent on a Portfolio Management Training Criterion Using the IPA Gradient 
Estimator (2003) (0)UOUS AND DISCRETE ADDRESSING SCHEMES (2016) (0)Exploring the Wasserstein metric for survival analysis (2021) (0)»Deep Learning ist keine Religion« (2018) (0)Artificial Intelligence Pioneers But making those quantum leaps from science fiction to reality required hard work from computer scientists like (0)BabyAI 1.1 (2020) (0)Aprendizaje profundo. Tras años de decepciones, la inteligencia artificial está empezando a cumplir lo que prometia en sus comienzos gracias a esta potente técnica (2016) (0)Learning of Sophisticated Curriculums by viewing them as Graphs over Tasks (2018) (0)Collaborative filtering techniques for drug discovery (2016) (0)Université de Montréal Balancing Signals for Semi-Supervised Sequence Learning (2020) (0)Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models (2021) (0)Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning (2020) (0)EnGAN: Latent Space MCMC and Maximum Entropy Generators for Energy-based Models (2018) (0)On the Optimization of a Synaptic Learning Rule (1997) (0)Meta Attention Networks: Meta Learning Attention To Modulate Information Between Sparsely Interacting Recurrent Modules (2020) (0)A comparative study on hybrid acoustic phonetic decoders based on artificial neural networks (1991) (0)Pen-based visitor registration system (PENGUIN) (1994) (0)Proposed Architectural and Representational Modifications (2021) (0)Depth with nonlinearity creates no bad local minima in ResNets (2019) (0)TRANSFER REINFORCEMENT LEARNING (2018) (0)GraphMix: Improved Training of Graph Neural Networks for Semi-Supervised Learning (2020) (0)SCANNING THE ISSUE Big Data : Theoretical Aspects (2015) (0)Former NASA chief unveils $100 million neural chip maker KnuEdge (2016) (0)IAPR keynote lecture IV: Deep learning (2015) (0)Markovian Models for Sequential Data (1996) (0)Unifying Likelihood-free Inference with Black-box 
Sequence Design and Beyond (2021) (0)Optimization of Artificial Neural Network Hyperparameters For Processing Retrospective Information (2021) (0)From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence (2021) (0)Information Fusion in Deep Convolutional Neural Networks for Biomedical Image Segmentation 1 (2018) (0)Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs (2020) (0)CMIM: Cross-Modal Information Maximization For Medical Imaging (2021) (0)O ct 2 01 9 S MALL-GAN : S PEEDING UP GAN T RAINING USING C ORES ETS (2019) (0)CACHE (Critical Assessment of Computational Hit-finding Experiments): A public-private partnership benchmarking initiative to enable the development of computational methods for hit-finding (2021) (0)CACHE (Critical Assessment of Computational Hit-finding Experiments): A public–private partnership benchmarking initiative to enable the development of computational methods for hit-finding (2022) (0)RNNLOGIC: LEARNING LOGIC RULES FOR REASON- (2020) (0)The Effect of Diversity in Meta-Learning (2022) (0)Pruning for efficient hardware implementations of deep neural networks (2020) (0)Université de Montréal Estimating the probability of a fleet vehicle accident: A deep learning approach using Conditional Variational Auto-Encoders (2020) (0)Towards Scaling Difference Target Propagation by Learning Backprop Targets (2022) (0)CACHE (Critical Assessment of Computational Hit-finding Experiments): A public-private partnership benchmarking initiative to enable the development of computational methods for hit-finding (2021) (0){COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (0)Artificial Intelligence Based Cloud Distributor (AI-CD): Probing Low Cloud Distribution with Generative Adversarial Neural Networks (2019) (0)Generalization of a Parametric LearningRule (1993) (0)RECOVER: sequential model optimization platform for combination 
drug repurposing identifies novel synergistic compounds in vitro (2022) (0)Interpolation consistency training for semi-supervised learning (2022) (0)Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization (2022) (0)SUPPLEMENTARY MATERIAL-LEARNING TO NAVIGATE THE SYNTHETICALLY ACCESSIBLE CHEMICAL SPACE USING REINFORCEMENT LEARNING (2020) (0)Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features (2021) (0)Rethinking Learning Dynamics in RL using Adversarial Networks (2022) (0)A Two-Stream Continual Learning System With Variational Domain-Agnostic Feature Replay. (2021) (0)Reassuring and Troubling Views on Graph-Based Semi-Supervised Learning (2005) (0)MIREX TAGGING CONTEST : A DEEP NEURAL NET APPROACH ( DRAFT ) (2008) (0)Learning Classical Planning Transition Functions by Deep Neural Networks (2020) (0)Machine Learning (2021) (0)More Papers This paper list is powered by the following services:
What Schools Are Affiliated With Yoshua Bengio?
Yoshua Bengio is affiliated with the Université de Montréal, where he is a professor in the Department of Computer Science and Operations Research.