Yoshua Bengio

Q: What Schools Are Affiliated With Yoshua Bengio?

Yoshua Bengio is affiliated with the following schools: Université de Montréal, McGill University, and Massachusetts Institute of Technology.

Yoshua Bengio's AcademicInfluence.com Rankings

Computer Science: World Rank #43, Historical Rank #45

Subfields: Machine Learning, Algorithms, Database

Yoshua Bengio's Degrees

PhD Computer Science McGill University

Why Is Yoshua Bengio Influential?

According to Wikipedia, Yoshua Bengio is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de Montréal and scientific director of the Montreal Institute for Learning Algorithms.

Yoshua Bengio's Published Works

Chart: number of citations in a given year to any of this author's works.

Chart: total number of citations to the author's works published in a given year, highlighting when the author's most important works were published.

Published Works

Deep Learning (2015) (62354)
Gradient-based learning applied to document recognition (1998) (41173)
Generative Adversarial Nets (2014) (36807)
Neural Machine Translation by Jointly Learning to Align and Translate (2014) (22615)
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014) (18074)
Understanding the difficulty of training deep feedforward neural networks (2010) (14588)
Representation Learning: A Review and New Perspectives (2012) (10080)
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (2014) (9179)
Graph Attention Networks (2017) (8666)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (2015) (8487)
Learning Deep Architectures for AI (2007) (8120)
Deep Sparse Rectifier Neural Networks (2011) (6968)
Learning long-term dependencies with gradient descent is difficult (1994) (6956)
How transferable are features in deep neural networks? (2014) (6778)
Random Search for Hyper-Parameter Optimization (2012) (6765)
A Neural Probabilistic Language Model (2003) (6725)
Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion (2010) (6316)
Extracting and composing robust features with denoising autoencoders (2008) (6262)
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (2014) (5181)
Convolutional networks for images, speech, and time series (1998) (4806)
On the difficulty of training recurrent neural networks (2012) (4412)
Pattern Recognition and Neural Networks (1995) (4258)
Greedy Layer-Wise Training of Deep Networks (2006) (4138)
Curriculum learning (2009) (3917)
Algorithms for Hyper-Parameter Optimization (2011) (3074)
FitNets: Hints for Thin Deep Nets (2014) (2595)
BinaryConnect: Training Deep Neural Networks with binary weights during propagations (2015) (2445)
Brain tumor segmentation with Deep Neural Networks (2015) (2344)
Word Representations: A Simple and General Method for Semi-Supervised Learning (2010) (2275)
Theano: A Python framework for fast computation of mathematical expressions (2016) (2219)
Attention-Based Models for Speech Recognition (2015) (2189)
Why Does Unsupervised Pre-training Help Deep Learning? (2010) (2106)
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (2022)
Maxout Networks (2013) (1961)
Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) (1961)
Practical Recommendations for Gradient-Based Training of Deep Architectures (2012) (1886)
Learning deep representations by mutual information estimation and maximization (2018) (1801)
A Structured Self-attentive Sentence Embedding (2017) (1761)
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies (2001) (1748)
Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach (2011) (1699)
Semi-supervised Learning by Entropy Minimization (2004) (1621)
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (2015) (1590)
NICE: Non-linear Independent Components Estimation (2014) (1540)
Binarized Neural Networks (2016) (1539)
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations (2016) (1502)
Theano: new features and speed improvements (2012) (1404)
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation (2016) (1385)
Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (2011) (1375)
Deep Learning of Representations for Unsupervised and Transfer Learning (2011) (1252)
Scaling learning algorithms towards AI (2007) (1225)
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization (2014) (1199)
Deep Graph Infomax (2018) (1186)
A Closer Look at Memorization in Deep Networks (2017) (1157)
Visualizing Higher-Layer Features of a Deep Network (2009) (1127)
Exploring Strategies for Training Deep Neural Networks (2009) (1115)
Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering (2003) (1103)
An empirical evaluation of deep architectures on problems with many factors of variation (2007) (1078)
On the Number of Linear Regions of Deep Neural Networks (2014) (1059)
Challenges in representation learning: A report on three machine learning contests (2013) (1050)
A Recurrent Latent Variable Model for Sequential Data (2015) (1023)
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (2018) (1002)
End-to-end attention-based large vocabulary speech recognition (2015) (997)
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues (2016) (991)
Hierarchical Probabilistic Neural Network Language Model (2005) (990)
On Using Very Large Target Vocabulary for Neural Machine Translation (2014) (927)
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks (2013) (922)
Inference for the Generalization Error (1999) (918)
How to Construct Deep Recurrent Neural Networks (2013) (885)
Generative adversarial networks (2020) (881)
No Unbiased Estimator of the Variance of K-Fold Cross-Validation (2003) (880)
Object Recognition with Gradient-Based Learning (1999) (854)
Classification using discriminative restricted Boltzmann machines (2008) (849)
Theano: A CPU and GPU Math Compiler in Python (2010) (847)
Learning Structured Embeddings of Knowledge Bases (2011) (828)
Mutual Information Neural Estimation (2018) (747)
Representational Power of Restricted Boltzmann Machines and Deep Belief Networks (2008) (742)
Manifold Mixup: Better Representations by Interpolating Hidden States (2018) (735)
Gated Feedback Recurrent Neural Networks (2015) (729)
Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon (2018) (682)
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription (2012) (659)
BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 (2016) (608)
Deep Learning of Representations: Looking Forward (2013) (604)
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space (2016) (595)
Unitary Evolution Recurrent Neural Networks (2015) (589)
A semantic matching energy function for learning with multi-relational data (2013) (582)
Neural Probabilistic Language Models (2006) (573)
Sharp Minima Can Generalize For Deep Nets (2017) (566)
On the Spectral Bias of Neural Networks (2018) (562)
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis (2019) (561)
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism (2016) (561)
An Actor-Critic Algorithm for Sequence Prediction (2016) (534)
Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding (2015) (530)
Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives (2012) (514)
Deep Complex Networks (2017) (510)
A deep learning framework for neuroscience (2019) (509)
Understanding the exploding gradient problem (2012) (509)
Training deep neural networks with low precision multiplications (2014) (509)
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (2016) (507)
Convergence Properties of the K-Means Algorithms (1994) (502)
Toward Causal Representation Learning (2021) (501)
On Using Monolingual Corpora in Neural Machine Translation (2015) (494)
Hierarchical Multiscale Recurrent Neural Networks (2016) (490)
Pointing the Unknown Words (2016) (488)
Understanding intermediate layers using linear classifier probes (2016) (487)
Advances in optimizing recurrent networks (2012) (485)
A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion (2015) (473)
Benchmarking Graph Neural Networks (2020) (468)
Speaker Recognition from Raw Waveform with SincNet (2018) (468)
Gradient-Based Optimization of Hyperparameters (2000) (467)
Mode Regularized Generative Adversarial Networks (2016) (465)
Professor Forcing: A New Algorithm for Training Recurrent Networks (2016) (463)
Generalized Denoising Auto-Encoders as Generative Models (2013) (446)
Interpolation Consistency Training for Semi-Supervised Learning (2019) (438)
What regularized auto-encoders learn from the data-generating distribution (2012) (437)
The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training (2009) (427)
End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results (2014) (426)
A Parallel Mixture of SVMs for Very Large Scale Problems (2001) (424)
Zero-data Learning of New Tasks (2008) (420)
Tackling Climate Change with Machine Learning (2019) (406)
Char2Wav: End-to-End Speech Synthesis (2017) (402)
Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding (2013) (397)
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments (2014) (389)
Bayesian Model-Agnostic Meta-Learning (2018) (388)
Deep Generative Stochastic Networks Trainable by Backprop (2013) (380)
An Input Output HMM Architecture (1994) (372)
Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks (2015) (367)
EmoNets: Multimodal deep learning approaches for emotion recognition in video (2015) (365)
Gradient based sample selection for online continual learning (2019) (363)
N-BEATS: Neural basis expansion analysis for interpretable time series forecasting (2019) (362)
Hierarchical Recurrent Neural Networks for Long-Term Dependencies (1995) (361)
Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing (2012) (357)
Three Factors Influencing Minima in SGD (2017) (355)
Incorporating Second-Order Functional Knowledge for Better Option Pricing (2000) (350)
Generalization in Deep Learning (2017) (346)
Input-output HMMs for sequence processing (1996) (346)
Combining modality specific deep neural networks for emotion recognition in video (2013) (345)
Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks (2016) (329)
Kernel Matching Pursuit (2002) (324)
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation (2016) (318)
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA (2004) (316)
Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation (2016) (314)
Revisiting Natural Gradient for Deep Networks (2013) (314)
An Empirical Study of Example Forgetting during Deep Neural Network Learning (2018) (312)
Shallow vs. Deep Sum-Product Networks (2011) (312)
Neural Networks with Few Multiplications (2015) (309)
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses (2017) (308)
Pylearn2: a machine learning research library (2013) (306)
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (2018) (303)
Better Mixing via Deep Representations (2012) (298)
Learning Algorithms for the Classification Restricted Boltzmann Machine (2012) (297)
Markovian Models for Sequential Data (2004) (293)
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations (2016) (293)
MetaGAN: An Adversarial Approach to Few-Shot Learning (2018) (293)
Learning a synaptic learning rule (1991) (292)
Boosting Neural Networks (2000) (289)
Towards Biologically Plausible Deep Learning (2015) (289)
High quality document image compression with "DjVu" (1998) (282)
Global optimization of a neural network-hidden Markov model hybrid (1991) (279)
Difference Target Propagation (2014) (276)
On the Expressive Power of Deep Architectures (2011) (275)
Equilibrated adaptive learning rates for non-convex optimization (2015) (270)
The Manifold Tangent Classifier (2011) (268)
Drawing and Recognizing Chinese Characters with Recurrent Neural Network (2016) (265)
A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms (2019) (259)
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus (2016) (253)
On the Optimization of a Synaptic Learning Rule (2007) (253)
ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks (2015) (252)
RMSProp and equilibrated adaptive learning rates for non-convex optimization (2015) (250)
Theano: Deep Learning on GPUs with Python (2012) (250)
Higher Order Contractive Auto-Encoder (2011) (249)
SpeechBrain: A General-Purpose Speech Toolkit (2021) (248)
Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark (2016) (237)
Learning deep physiological models of affect (2013) (233)
Noisy Activation Functions (2016) (232)
Recurrent Independent Mechanisms (2019) (231)
GMNN: Graph Markov Neural Networks (2019) (227)
Unsupervised and Transfer Learning Challenge: a Deep Learning Approach (2011) (227)
K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms (2001) (223)
Deep learning for AI (2021) (220)
Justifying and Generalizing Contrastive Divergence (2009) (217)
The problem of learning long-term dependencies in recurrent networks (1993) (216)
Speech Model Pre-training for End-to-End Spoken Language Understanding (2019) (216)
Dendritic cortical microcircuits approximate the backpropagation algorithm (2018) (214)
Measuring the tendency of CNNs to Learn Surface Statistical Regularities (2017) (214)
ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation (2015) (213)
Light Gated Recurrent Units for Speech Recognition (2018) (212)
Experience Grounds Language (2020) (210)
Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model (2008) (209)
Maximum-Likelihood Augmented Discrete Generative Adversarial Networks (2017) (209)
The Curse of Highly Variable Functions for Local Kernel Machines (2005) (207)
Efficient Non-Parametric Function Induction in Semi-Supervised Learning (2004) (206)
On the number of response regions of deep feed forward networks with piece-wise linear activations (2013) (206)
A Deep Reinforcement Learning Chatbot (2017) (200)
Unsupervised State Representation Learning in Atari (2019) (199)
Disentangling Factors of Variation for Facial Expression Recognition (2012) (197)
Image-to-image translation for cross-domain disentanglement (2018) (194)
Learning normalized inputs for iterative estimation in medical image segmentation (2017) (194)
Topmoumoute Online Natural Gradient Algorithm (2007) (193)
Batch normalized recurrent neural networks (2015) (191)
Multi-Task Self-Supervised Learning for Robust Speech Recognition (2020) (190)
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation (2016) (185)
Convex Neural Networks (2005) (183)
Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks (2019) (182)
Blocks and Fuel: Frameworks for deep learning (2015) (179)
The Pytorch-kaldi Speech Recognition Toolkit (2018) (178)
Improving Generative Adversarial Networks with Denoising Feature Matching (2016) (177)
Hierarchical Neural Network Generative Models for Movie Dialogues (2015) (177)
Predicting COVID-19 Pneumonia Severity on Chest X-ray With Deep Learning (2020) (176)
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims (2020) (175)
Towards End-to-end Spoken Language Understanding (2018) (175)
The Consciousness Prior (2017) (173)
Low precision arithmetic for deep learning (2014) (172)
Artificial Neural Networks Applied to Taxi Destination Prediction (2015) (168)
Z-Forcing: Training Stochastic Recurrent Networks (2017) (162)
Deep Belief Networks Are Compact Universal Approximators (2010) (162)
LeRec: A NN/HMM Hybrid for On-Line Handwriting Recognition (1995) (161)
HeMIS: Hetero-Modal Image Segmentation (2016) (160)
Deep Learning for NLP (without Magic) (2012) (158)
Learning to Understand Phrases by Embedding the Dictionary (2015) (157)
Audio Chord Recognition with Recurrent Neural Networks (2013) (153)
Knowledge Matters: Importance of Prior Information for Optimization (2013) (152)
How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation (2014) (151)
Variance Reduction in SGD by Distributed Importance Sampling (2015) (150)
Neural networks for speech and sequence recognition (1996) (149)
Manifold Parzen Windows (2002) (147)
Denoising Criterion for Variational Auto-Encoding Framework (2015) (147)
Architectural Complexity Measures of Recurrent Neural Networks (2016) (146)
Boundary-Seeking Generative Adversarial Networks (2017) (144)
Montreal Neural Machine Translation Systems for WMT’15 (2015) (144)
Reweighted Wake-Sleep (2014) (142)
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks (2013) (140)
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning (2018) (139)
FigureQA: An Annotated Figure Dataset for Visual Reasoning (2017) (139)
Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews (2014) (139)
Inductive biases for deep learning of higher-level cognition (2020) (137)
On Multiplicative Integration with Recurrent Neural Networks (2016) (136)
Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks (1999) (135)
Fine-grained attention mechanism for neural machine translation (2018) (135)
Multi-Prediction Deep Boltzmann Machines (2013) (134)
InfoBot: Transfer and Exploration via the Information Bottleneck (2019) (132)
Deep Learners Benefit More from Out-of-Distribution Examples (2011) (128)
Deep Directed Generative Models with Energy-Based Probability Estimation (2016) (126)
Count-ception: Counting by Fully Convolutional Redundant Counting (2017) (126)
Marginalized Denoising Auto-encoders for Nonlinear Representations (2014) (125)
Mining (2011) (123)
BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop (2018) (122)
Learning Neural Causal Models from Unknown Interventions (2019) (122)
Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines (2010) (119)
A Neural Knowledge Language Model (2016) (117)
Global training of document processing systems using graph transformer networks (1997) (117)
Model Selection for Small Sample Regression (2002) (116)
Iterative Alternating Neural Attention for Machine Reading (2016) (114)
Gated Orthogonal Recurrent Units: On Learning to Forget (2017) (113)
Neural net language models (2008) (113)
Revisiting Fundamentals of Experience Replay (2020) (109)
Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio (2011) (108)
Understanding Representations Learned in Deep Architectures (2010) (105)
Gradient Starvation: A Learning Proclivity in Neural Networks (2020) (103)
On integrating a language model into neural machine translation (2017) (98)
Taking on the curse of dimensionality in joint distributions using neural networks (2000) (97)
Non-Local Manifold Tangent Learning (2004) (97)
An empirical analysis of dropout in piecewise linear networks (2013) (97)
Deep Learning for Patient-Specific Kidney Graft Survival Analysis (2017) (97)
On the saddle point problem for non-convex optimization (2014) (97)
Collaborative Filtering on a Family of Biological Targets (2006) (96)
The need for privacy with public digital contact tracing during the COVID-19 pandemic (2020) (96)
Large-Scale Feature Learning With Spike-and-Slab Sparse Coding (2012) (96)
Residual Connections Encourage Iterative Inference (2017) (96)
Deconstructing the Ladder Network Architecture (2015) (95)
Feature-wise transformations (2018) (95)
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study (2019) (94)
Modeling term dependencies with quantum language models for IR (2013) (93)
A Spike and Slab Restricted Boltzmann Machine (2011) (93)
End-to-End Online Writer Identification With Recurrent Neural Network (2017) (92)
Globally Trained Handwritten Word Recognizer Using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models (1993) (92)
Recurrent Neural Networks for Missing or Asynchronous Data (1995) (91)
A hybrid Pareto model for asymmetric fat-tailed data: the univariate case (2009) (90)
Maximum Entropy Generators for Energy-Based Models (2019) (90)
Multi-Task Learning for Stock Selection (1996) (89)
Disentangling Factors of Variation via Generative Entangling (2012) (89)
The Curse of Dimensionality for Local Kernel Machines (2005) (88)
BigBrain 3D atlas of cortical layers: Cortical and laminar thickness gradients diverge in sensory and motor cortices (2019) (87)
Quaternion Recurrent Neural Networks (2018) (86)
On the number of inference regions of deep feed forward networks with piece-wise linear activations (2013) (86)
ObamaNet: Photo-realistic lip-sync from text (2017) (85)
Multi-way, multilingual neural machine translation (2017) (85)
A Walk with SGD (2018) (84)
Estimating or Propagating Gradients Through Stochastic Neurons (2013) (84)
Quickly Generating Representative Samples from an RBM-Derived Process (2011) (84)
ChatPainter: Improving Text to Image Generation using Dialogue (2018) (83)
BPS: a learning algorithm for capturing the dynamic nature of speech (1989) (83)
Context-dependent word representation for neural machine translation (2016) (82)
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction (2018) (82)
Unsupervised Models of Images by Spike-and-Slab RBMs (2011) (82)
Wasserstein Dependency Measure for Representation Learning (2019) (81)
Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization (2021) (81)
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning (2020) (80)
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance (2018) (80)
On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length (2018) (80)
Learning Independent Features with Adversarial Nets for Non-linear ICA (2017) (79)
Deep convolutional networks for quality assessment of protein folds (2018) (78)
A Generative Process for sampling Contractive Auto-Encoders (2012) (78)
Interpretable Convolutional Filters with SincNet (2018) (77)
Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation (2014) (76)
Parallel Tempering for Training of Restricted Boltzmann Machines (2010) (75)
Big Neural Networks Waste Capacity (2013) (75)
Compositional generalization in a deep seq2seq model by separating syntax and semantics (2019) (75)
Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling (2020) (75)
Combined Reinforcement Learning via Abstract Representations (2018) (75)
Word-level training of a handwritten word recognizer based on convolutional neural networks (1994) (73)
Learning Speaker Representations with Mutual Information (2018) (73)
Spectral Clustering and Kernel PCA are Learning Eigenfunctions (2003) (73)
Adding noise to the input of a model trained with a regularized objective (2011) (73)
Hyperbolic Discounting and Learning over Multiple Horizons (2019) (73)
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation (2021) (72)
Using a Financial Training Criterion Rather than a Prediction Criterion (1997) (72)
Slow, Decorrelated Features for Pretraining Complex Cell-like Networks (2009) (72)
STDP-Compatible Approximation of Backpropagation in an Energy-Based Model (2017) (72)
Training Methods for Adaptive Boosting of Neural Networks (1997) (71)
Entropy Regularization (2006) (71)
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition (2018) (71)
Graph Neural Networks with Learnable Structural and Positional Representations (2021) (71)
Recurrent Neural Networks With Limited Numerical Precision (2016) (70)
Learning to Compute Word Embeddings On the Fly (2017) (70)
Deep Learning of Representations (2013) (70)
Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer (2018) (70)
Toward Training Recurrent Neural Networks for Lifelong Learning (2018) (70)
Label Propagation and Quadratic Criterion (2006) (68)
Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding (2018) (67)
Independently Controllable Factors (2017) (67)
Hierarchical Memory Networks (2016) (67)
Greedy Spectral Embedding (2005) (66)
Decision Trees Do Not Generalize to New Variations (2010) (66)
Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy (2019) (66)
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning (2020) (65)
CLOSURE: Assessing Systematic Generalization of CLEVR Models (2019) (64)
HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery (2020) (63)
Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization (2019) (63)
High-dimensional sequence transduction (2012) (63)
Invariant Representations for Noisy Speech Recognition (2016) (62)
Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online (2012) (62)
Non-Local Manifold Parzen Windows (2005) (61)
Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery (2012) (61)
Hybrid Models for Learning to Branch (2020) (61)
Independently Controllable Features (2017) (61)
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs (2020) (60)
Torchmeta: A Meta-Learning library for PyTorch (2019) (60)
Embedding Word Similarity with Neural Machine Translation (2014) (60)
Incorporating Functional Knowledge in Neural Networks (2009) (59)
Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes (2016) (58)
Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition (2017) (58)
GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning (2019) (58)
Depth with Nonlinearity Creates No Bad Local Minima in ResNets (2018) (58)
Continuous Neural Networks (2007) (57)
The Z-coder adaptive binary coder (1998) (57)
On the Spectral Bias of Deep Neural Networks (2018) (57)
Systematic generalisation with group invariant predictions (2021) (57)
Learning Anonymized Representations with Adversarial Neural Networks (2018) (57)
Reading checks with multilayer graph transformer networks (1997) (56)
Recall Traces: Backtracking Models for Efficient Reinforcement Learning (2018) (56)
Use of genetic programming for the search of a new learning rule for neural networks (1994) (55)
Twin Networks: Matching the Future for Sequence Generation (2017) (54)
Credit Assignment through Time: Alternatives to Backpropagation (1993) (54)
Memory Augmented Neural Networks with Wormhole Connections (2017) (54)
Artificial neural networks and their application to sequence recognition (1991) (54)
A Connectionist Approach to Speech Recognition (1993) (54)
Bias learning, knowledge sharing (2003) (53)
Spectral Dimensionality Reduction (2006) (53)
Use machine learning to find energy materials (2017) (53)
Disentangling the independently controllable factors of variation by interacting with the world (2018) (53)
Cost functions and model combination for VaR-based asset allocation using neural networks (2001) (53)
Extensions to Metric-Based Model Selection (2003) (52)
GraphMix: Improved Training of GNNs for Semi-Supervised Learning (2020) (52)
STDP as presynaptic activity times rate of change of postsynaptic activity (2015) (51)
Diffusion of Context and Credit Information in Markovian Models (1995) (51)
Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization (2014) (50)
Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net (2017) (50)
Neural Production Systems (2021) (50)
On the Challenges of Physical Implementations of RBMs (2013) (50)
Coordination Among Neural Modules Through a Shared Global Workspace (2021) (49)
Adversarial Domain Adaptation for Stable Brain-Machine Interfaces (2018) (49)
AdaBoosting Neural Networks: Application to on-line Character Recognition (1997) (49)
Not All Neural Embeddings are Born Equal (2014) (49)
Low precision storage for deep learning (2014) (48)
On Adversarial Mixup Resynthesis (2019) (48)
Experiments on the application of IOHMMs to model financial returns series (2001) (47)
Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems (2020) (47)
Selective small molecule peptidomimetic ligands of TrkC and TrkA receptors afford discrete or complete neurotrophic activities (2005) (47)
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics (2019) (46)
Object-Centric Image Generation from Layouts (2020) (46)
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning (2020) (46)
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies (2020) (45)
Learning the dynamic nature of speech with back-propagation for sequences (1992) (45)
The representational geometry of word meanings acquired by neural machine translation models (2017) (45)
Iterative Neural Autoregressive Distribution Estimator NADE-k (2014) (45)
Improving Speech Recognition by Revising Gated Recurrent Units (2017) (44)
Diet Networks: Thin Parameters for Fat Genomics (2016) (44)
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules (2020) (44)
DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation (2018) (43)
Variational Temporal Abstraction (2019) (43)
Dendritic error backpropagation in deep cortical microcircuits (2017) (43)
Nonlocal Estimation of Manifold Structure (2006) (43)
Learning Tags that Vary Within a Song (2010) (42)
Scaling Large Learning Problems with Hard Parallel Mixtures (2002) (42)
On the interplay between noise and curvature and its effect on optimization and generalization (2019) (42)
Large-Scale Learning of Embeddings with Reconstruction Sampling (2011) (42)
Bias in Estimating the Variance of K-Fold Cross-Validation (2005) (41)
Towards a Biologically Plausible Backprop (2016) (41)
Training End-to-End Analog Neural Networks with Equilibrium Propagation (2020) (41)
Word normalization for on-line handwritten word recognition (1994) (41)
Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations (2018) (40)
GSNs: Generative Stochastic Networks (2015) (40)
Introduction to the special issue on neural networks for data mining and knowledge discovery (2000) (40)
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives (2019) (40)
Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies (2019) (40)
Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models (2019) (40)
A network of deep neural networks for Distant Speech Recognition (2017) (39)
Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes (2018) (39)
Early Inference in Energy-Based Models Approximates Back-Propagation (2015) (39)
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future (2019) (39)
Label Propagation and Quadratic Criterion (38)
ReSeg: A Recurrent Neural Network for Object Segmentation (2015) (38)
Bayesian Structure Learning with Generative Flow Networks (2022) (38)
Scaling Up Spike-and-Slab Models for Unsupervised Feature Learning (2013) (38)
Inherent privacy limitations of decentralized contact tracing apps (2020) (38)
Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask (2017) (37)
Quadratic Features and Deep Architectures for Chunking (2009) (37)
Use machine learning to find energy materials (2017) (37)
Mollifying Networks (2016) (36)
Small-GAN: Speeding Up GAN Training Using Core-sets (2019) (36)
Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing Its Gradient Estimator Bias (2020) (36)
Meta-learning framework with applications to zero-shot time-series forecasting (2020) (36)
Globally trained handwritten word recognizer using spatial representation, space displacement neural networks and hidden Markov models (1993) (36)
Probabilistic Planning with Sequential Monte Carlo methods (2018) (35)
Evolving Culture Versus Local Minima (2014) (35)
Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures (2019) (35)
Task Loss Estimation for Sequence Prediction (2015) (35)
Biological Sequence Design with GFlowNets (2022) (35)
Online continual learning with no task boundaries (2019) (35)
Interactive Language Learning by Question Answering (2019) (35)
HNHN: Hypergraph Networks with Hyperedge Neurons (2020) (34)
The Causal-Neural Connection: Expressiveness, Learnability, and Inference (2021) (34)
Editorial introduction to the Neural Networks special issue on Deep Learning of Representations (2015) (34)
Representation Mixing for TTS Synthesis (2018) (33)
Fraternal Dropout (2017) (33)
Adaptive Parallel Tempering for Stochastic Maximum Likelihood Learning of RBMs (2010) (33)
Generative Flow Networks for Discrete Probabilistic Modeling (2022) (33)
InfoMask: Masked Variational Latent Representation to Localize Chest Disease (2019) (33)
Convolutional neural networks for mesh-based parcellation of the cerebral cortex (2018) (33)
On the search for new learning rules for ANNs (1995) (33)
Batch-normalized joint training for DNN-based distant speech recognition (2016) (32)
An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming (2021) (32)
Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models (2004) (32)
DEUP: Direct Epistemic Uncertainty Prediction (2021) (32)
On the Learning Dynamics of Deep Neural Networks (2018) (32)
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning (2021) (32)
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets (2019) (32)
Learning Eigenfunctions of Similarity: Linking Spectral Clustering and Kernel PCA (2003) (32)
Contextual tag inference (2011) (31)
Evolving Culture vs Local Minima (2012) (31)
Universal Successor Representations for Transfer Reinforcement Learning (2018) (31)
An EM Algorithm for Asynchronous Input/Output Hidden Markov Models (1996) (31)
Equivalence of Equilibrium Propagation and Recurrent Backpropagation (2017) (31)
Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines (2013) (30)
Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information (2018) (30)
Manifold Mixup: Learning Better Representations by Interpolating Hidden States (2018) (30)
Towards Gene Expression Convolutions using Gene Interaction Graphs (2018) (30)
Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning (2014) (30)
Saliency is a Possible Red Herring When Diagnosing Poor Generalization (2021) (30)
Adaptive Drift-Diffusion Process to Learn Time Intervals (2011) (30)
Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions (2012) (29)
Trajectory Balance: Improved Credit Assignment in GFlowNets (2022) (28)
A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies (2018) (28)
Speech and Speaker Recognition from Raw Waveform with SincNet (2018) (28)
h-detach: Modifying the LSTM Gradient Towards Better Optimization (2018) (28)
Quick Training of Probabilistic Neural Nets by Importance Sampling (2003) (28)
Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks (2020) (27)
Regularized Auto-Encoders Estimate Local Statistics (2012) (27)
The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions (2014) (27)
Joint Training of Deep Boltzmann Machines (2012) (27)
On Tracking The Partition Function (2011) (27)
Rethinking Distributional Matching Based Domain Adaptation (2020) (27)
Statistical Learning Algorithms Applied to Automobile Insurance Ratemaking (2003) (27)
Finding Flatter Minima with SGD (2018) (27)
Discriminative Non-negative Matrix Factorization for Multiple Pitch Estimation (2012) (27)
Building Musically-relevant Audio Features through Multiple Timescale Representations (2012) (27)
Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input (2019) (27)
Tractable Multivariate Binary Density Estimation and the Restricted Boltzmann Forest (2010) (26)
Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference (2001) (26)
Brain Inspired Reinforcement Learning (2004) (26)
Programmable execution of multi-layered networks for automatic speech recognition (1989) (26)
Learning Neural Causal Models with Active Interventions (2021) (26)
Variational Causal Networks: Approximate Bayesian Inference over Causal Structures (2021) (26)
Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks (2019) (25)
Commonsense mining as knowledge base completion? A study on the impact of novelty (2018) (25)
A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions (2009) (24)
Joint Training Deep Boltzmann Machines for Classification (2013) (24)
Focused Hierarchical RNNs for Conditional Sequence Processing (2018) (24)
Learning from Partial Labels with Minimum Entropy (2004) (24)
Generalization of Equilibrium Propagation to Vector Field Dynamics (2018) (24)
Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks (1991) (24)
Alternative time representation in dopamine models (2010) (23)
Fast and Slow Learning of Recurrent Independent Mechanisms (2021) (23)
Gradient-based Learning Applied to Document Recognition (1998) (23)
Word normalization for online handwritten word recognition (1994) (23)
How does hemispheric specialization contribute to human-defining cognition? (2021) (23)
Learning Causal Models Online (2020) (23)
On the Iterative Refinement of Densely Connected Representation Levels for Semantic Segmentation (2018) (23)
Bounding the Test Log-Likelihood of Generative Models (2013) (22)
Bidirectional Helmholtz Machines (2015) (22)
Discrete-Valued Neural Communication (2021) (22)
Learning from unexpected events in the neocortical microcircuit (2021) (22)
Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution (2022) (22)
Augmented Functional Time Series Representation and Forecasting with Gaussian Processes (2007) (22)
Chunked Autoregressive GAN for Conditional Waveform Synthesis (2021) (22)
Discriminative feature and model design for automatic speech recognition (1997) (22)
A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition (2019) (22)
Robust Regression with Asymmetric Heavy-Tail Noise Distributions (2002) (22)
The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach (2018) (21)
Diffusion of Credit in Markovian Models (1994) (21)
Learning the 2-D Topology of Images (2007) (21)
Deriving Differential Target Propagation from Iterating Approximate Inverses (2020) (21)
Efficient EM Training of Gaussian Mixtures with Missing Data (2012) (21)
Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay (2020) (21)
Attention Based Pruning for Shift Networks (2019) (20)
Topic Segmentation: A First Stage to Dialog-Based Information Extraction (2001) (20)
Towards Standardization of Data Licenses: The Montreal Data License (2019) (20)
Locally Linear Embedding for dimensionality reduction in QSAR (2004) (20)
Gradient Based Learning Applied to Document Recognition (2006) (20)
Phonetically-based multi-layered neural networks for vowel classification (1990) (20)
A robust adaptive stochastic gradient method for deep learning (2017) (20)
On the challenge of learning complex functions (2007) (20)
Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs (2013) (20)
On Training Recurrent Neural Networks for Lifelong Learning (2018) (20)
Support vector machines for improving the classification of brain PET images (1998) (20)
Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning (2018) (19)
Deep Self-Taught Learning for Handwritten Character Recognition (2010) (19)
MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation (2018) (19)
Improving First and Second-Order Methods by Modeling Uncertainty (2010) (19)
An EM approach to grammatical inference: input/output HMMs (1994) (19)
Modeling the Long Term Future in Model-Based Reinforcement Learning (2018) (19)
Perceptual Generative Autoencoders (2019) (19)
Problems in the deployment of machine-learned models in health care (2021) (19)
Browsing through high quality document images with DjVu (1998) (18)
Natural Gradient Revisited (2013) (18)
GradMask: Reduce Overfitting by Regularizing Saliency (2019) (18)
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget (2020) (18)
Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible (2016) (18)
On Training Deep Boltzmann Machines (2012) (18)
Diet Networks: Thin Parameters for Fat Genomics (2016) (18)
Unsupervised Learning of Semantics of Object Detections for Scene Categorization (2013) (17)
Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge (1989) (17)
The effects of negative adaptation in Model-Agnostic Meta-Learning (2018) (17)
Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio (2018) (17)
How can deep learning advance computational modeling of sensory information processing? (2018) (17)
Equilibrium Propagation with Continual Weight Updates (2019) (17)
Keep Drawing It: Iterative language-based image generation and editing (2018) (16)
Dynamic Inference with Neural Interpreters (2021) (16)
Information matrices and generalization (2019) (16)
Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network (1990) (16)
Variational Bi-LSTMs (2017) (16)
A Neural Support Vector Network architecture with adaptive kernels (2000) (16)
Transformers with Competitive Ensembles of Independent Mechanisms (2021) (16)
COVI White Paper (2020) (16)
Autotagging music with conditional restricted Boltzmann machines (2011) (16)
Continuous optimization of hyper-parameters (2000) (16)
Reinforcement Learning for Sustainable Agriculture (2019) (16)
ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient (2014) (16)
The Benefits of Over-parameterization at Initialization in Deep ReLU Networks (2019) (16)
On the Morality of Artificial Intelligence (2019) (15)
Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning (2021) (15)
Deep Directed Generative Autoencoders (2014) (15)
Use of neural networks for the recognition of place of articulation (1988) (15)
Twin Regularization for online speech recognition (2018) (15)
Compositional Generalization by Factorizing Alignment and Translation (2020) (14)
Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks (2017) (14)
The First Conversational Intelligence Challenge (2018) (14)
An EM Approach to Learning Sequential (1994) (14)
The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All) (2011) (14)
RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design (2020) (14)
Learning invariant features through local space contraction (2011) (14)
Twin Networks: Using the Future as a Regularizer (2017) (14)
Generalizable Features From Unsupervised Learning (2016) (14)
Learning semantic representations of objects and their parts (2014) (14)
Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2018) (14)
Learned-norm pooling for deep neural networks (2013) (14)
Reinforced Imitation in Heterogeneous Action Space (2019) (13)
Input decay: simple and effective soft variable selection (2001) (13)
A memory-efficient adaptive Huffman coding algorithm for very large sets of symbols (1998) (13)
An objective function for STDP (2015) (13)
A hybrid coder for hidden Markov models using a recurrent neural network (1990) (13)
NU-GAN: High resolution neural upsampling with GAN (2020) (13)
Universal Successor Features for Transfer Reinforcement Learning (2018) (13)
Generalization of a Parametric Learning Rule (1993) (13)
BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization (2020) (13)
Detonation Classification from Acoustic Signature with the Restricted Boltzmann Machine (2012) (13)
Extending the Framework of Equilibrium Propagation to General Dynamics (2018) (13)
Unifying Generative Models with GFlowNets (2022) (12)
How Transferable Are Features in Convolutional Neural Network Acoustic Models across Languages? (2019) (12)
A Deep Reinforcement Learning Chatbot (Short Version) (2018) (12)
On Catastrophic Interference in Atari 2600 Games (2020) (12)
S2RMs: Spatially Structured Recurrent Modules (2020) (12)
GibbsNet: Iterative Adversarial Inference for Deep Graphical Models (2017) (12)
A learning-based algorithm to quickly compute good primal solutions for Stochastic Integer Programs (2019) (12)
FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters (2021) (12)
Boundary Seeking GANs (2018) (12)
Plan, Attend, Generate: Planning for Sequence-to-Sequence Models (2017) (12)
Conditioning and time representation in long short-term memory networks (2014) (12)
Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences (2018) (12)
Target Propagation (2015) (12)
Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders (2012) (11)
Statistical Machine Learning Algorithms for Target Classification from Acoustic Signature (2009) (11)
Handling Asynchronous or Missing Data with Recurrent Networks (1998) (11)
HighRes-net: Multi-Frame Super-Resolution by Recursive Fusion (2019) (11)
Predicting Infectiousness for Proactive Contact Tracing (2020) (11)
Image Segmentation by Iterative Inference from Conditional Score Estimation (2017) (11)
Weakly Supervised Representation Learning with Sparse Perturbations (2022) (11)
Scaling up deep learning (2014) (11)
The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis (2019) (11)
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (2021) (11)
Is a Modular Architecture Enough? (2022) (10)
Dynamic Frame Skipping for Fast Speech Recognition in Recurrent Neural Network Based Acoustic Models (2018) (10)
Unsupervised and Transfer Learning under Uncertainty - From Object Detections to Scene Categorization (2013) (10)
Weakly-supervised Knowledge Graph Alignment with Adversarial Learning (2019) (10)
Multimodal Transitions for Generative Stochastic Networks (2013) (10)
Untangling tradeoffs between recurrence and self-attention in artificial neural networks (2020) (10)
Locally Weighted Full Covariance Gaussian Density Estimation (2004) (10)
Automated segmentation of cortical layers in BigBrain reveals divergent cortical and laminar thickness gradients in sensory and motor cortices (2019) (10)
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach (2020) (10)
Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (9)
On the Equivalence between Deep NADE and Generative Stochastic Networks (2014) (9)
GFlowNet Foundations (2021) (9)
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation (1991) (9)
CACHE (Critical Assessment of Computational Hit-finding Experiments): A public–private partnership benchmarking initiative to enable the development of computational methods for hit-finding (2021) (9)
Learning GFlowNets from partial episodes for improved convergence and stability (2022) (9)
A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM (2020) (9)
Generalization in Machine Learning via Analytical Learning Theory (2018) (9)
Deep learning and cultural evolution (2014) (9)
Untangling tradeoffs between recurrence and self-attention in neural networks (2020) (8)
Compositional Attention: Disentangling Search and Retrieval (2021) (8)
Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes (2022) (8)
Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy (2019) (8)
Training a Neural Network with a Financial Criterion Rather than a Prediction Criterion (2007) (8)
An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism (2009) (8)
ACtuAL: Actor-Critic Under Adversarial Learning (2017) (8)
From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence (2021) (8)
Oracle Performance for Visual Captioning (2015) (8)
Machines Who Learn. (2016) (8)
How to construct deep recurrent neural networks: Proceedings of the Second International Conference on Learning Representations (ICLR 2014) (2014) (8)
Multiscale sequence modeling with a learned dictionary (2017) (8)
Iteratively unveiling new regions of interest in Deep Learning models (2018) (8)
A3T: Adversarially Augmented Adversarial Training (2018) (8)
The Octopus Approach to the Alexa Competition: A Deep Ensemble-based Socialbot (2017) (8)
Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning (2017) (7)
Structured Sparsity Inducing Adaptive Optimizers for Deep Learning (2021) (7)
Discovering Shared Structure in Manifold Learning (2004) (7)
Generative Augmented Flow Networks (2022) (7)
A Generative Process for Contractive Auto-Encoders (2012) (7)
Generating Multiscale Amorphous Molecular Structures Using Deep Learning: A Study in 2D (2020) (7)
Towards Scaling Difference Target Propagation by Learning Backprop Targets (2022) (7)
Multi-scale Feature Learning Dynamics: Insights for Double Descent (2021) (7)
Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks (2020) (7)
GFlowNets and variational inference (2022) (7)
COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing (2020) (7)
Non-parametric Regression between Riemannian Manifolds (2009) (7)
Conditional Computation for Continual Learning (2019) (7)
Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya (2020) (7)
Modularity Matters: Learning Invariant Relational Reasoning Tasks (2018) (7)
A simple and general method for semi-supervised learning (2010) (7)
Découpage thématique des conversations : un outil d'aide à l'extraction (2002) (7)
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments (2021) (7)
An Analysis of the Adaptation Speed of Causal Models (2020) (7)
Deep Learning for Automatic Summary Scoring (2012) (7)
Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition (1988) (6)
NYU-MILA Neural Machine Translation Systems for WMT’16 (2016) (6)
Guest Introduction: Special Issue on New Methods for Model Selection and Model Combination (2002) (6)
Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (6)
Suitability of V1 Energy Models for Object Classification (2011) (6)
On random weights for texture generation in one layer CNNs (2017) (6)
Catalyzing next-generation Artificial Intelligence through NeuroAI (2023) (6)
RetroGNN: Fast Estimation of Synthesizability for Virtual Screening and De Novo Design by Learning from Slow Retrosynthesis Software (2022) (6)
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro (2022) (6)
Unifying Likelihood-free Inference with Black-box Optimization and Beyond (2021) (6)
Establishing an evaluation metric to quantify climate change image realism (2019) (6)
A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions (2022) (6)
A Dataset of Topic-Oriented Human-to-Chatbot Dialogues (2018) (6)
Variance Regularizing Adversarial Learning (2017) (6)
Big Data: Theoretical Aspects [Scanning the Issue] (2016) (6)
Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL (2022) (6)
Discussion of "The Neural Autoregressive Distribution Estimator" (2011) (6)
Learning Powerful Policies by Using Consistent Dynamics Model (2019) (5)
Modeling Natural Image Covariance with a Spike and Slab Restricted Boltzmann Machine (2010) (5)
Towards Understanding Generalization via Analytical Learning Theory (2018) (5)
Using Artificial Intelligence to Visualize the Impacts of Climate Change (2021) (5)
Predicting Unreliable Predictions by Shattering a Neural Network (2021) (5)
On the Morality of Artificial Intelligence [Commentary] (2020) (5)
A semantic matching energy function for learning with multi-relational data (2013) (5)
ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods (2021) (5)
Understanding deep architectures and the effect of unsupervised pre-training (2011) (5)
Use of Multi-Layered Networks for Coding Speech with Phonetic Features (1988) (5)
Combining Model-based and Model-free RL via Multi-step Control Variates (2018) (5)
hBERT + BiasCorp - Fighting Racism on the Web (2021) (5)
Binary pseudowavelets and applications to bilevel image processing (1999) (5)
On-line handwriting recognition with neural networks: Spatial representation versus temporal representation (1993) (5)
Ghost Units Yield Biologically Plausible Backprop in Deep Neural Networks (2019) (5)
Deep Tempering (2014) (5)
Mastering Rate based Curriculum Learning (2020) (5)
Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings (2011) (5)
Combating False Negatives in Adversarial Imitation Learning (2020) (5)
Shared Context Probabilistic Transducers (1997) (5)
On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties (1989) (5)
Underwhelming Generalization Improvements From Controlling Feature Attribution (2019) (5)
Systematicity in a Recurrent Neural Network by Factorizing Syntax and Semantics (2020) (5)
A Two-Stream Continual Learning System With Variational Domain-Agnostic Feature Replay (2021) (5)
On the Generalization and Adaption Performance of Causal Models (2022) (5)
Predictive Inference with Feature Conformal Prediction (2022) (5)
Unifying Likelihood-free Inference with Black-box Sequence Design and Beyond (2021) (5)
Probabilistic neural network models for sequential data (2000) (4)
The Variational Walkback Algorithm (2016) (4)
MAgNet: Mesh Agnostic Neural PDE Solver (2022) (4)
Learning the Arrow of Time (2019) (4)
Exploration-Driven Representation Learning in Reinforcement Learning (2021) (4)
Empirical performance upper bounds for image and video captioning (2015) (4)
Building Robust Ensembles via Margin Boosting (2022) (4)
A theory of continuous generative flow networks (2023) (4)
Forecasting and Trading Commodity Contract Spreads with Gaussian Processes (2007) (4)
Trainable performance upper bounds for image and video captioning (2015) (4)
Cross-Modal Information Maximization for Medical Imaging: CMIM (2020) (4)
On Out-of-Sample Statistics for Time-Series (2002) (4)
Use of multilayer networks for the recognition of phonetic features and phonemes (1989) (4)
A Hybrid Pareto Model for Conditional Density Estimation of Asymmetric Fat-Tail Data (2007) (4)
Learning from Learning Machines: Optimisation, Rules, and Social Norms (2019) (4)
Latent State Marginalization as a Low-cost Approach for Improving Exploration (2022) (4)
Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers (2020) (4)
Discrete Key-Value Bottleneck (2022) (4)
Using Simulated Data to Generate Images of Climate Change (2020) (4)
CMIM: Cross-Modal Information Maximization For Medical Imaging (2021) (3)
Supplementary material for: How transferable are features in deep neural networks? (2014) (3)
Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features (2021) (3)
The Journey is the Reward: Unsupervised Learning of Influential Trajectories (2019) (3)
Graph-Based Semi-Supervised Learning (2005) (3)
Unifying Generative Models with GFlowNets and Beyond (2022) (3)
Visual Concept Reasoning Networks (2020) (3)
Learning to rank for censored survival data (2018) (3)
Reinforced Imitation Learning from Observations (2018) (3)
Deep Architectures for Baby AI (2007) (3)
An Actor-Critic Algorithm for Structured Prediction (2016) (3)
Connectionist Models and their Application to Automatic Speech Recognition (1991) (3)
State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations (2019) (3)
Conditional Flow Matching: Simulation-Free Dynamic Optimal Transport (2023) (3)
COVI White Paper-Version 1.1 (2020) (3)
A Walk with SGD: How SGD Explores Regions of Deep Network Loss? (2018) (3)
Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints (2022) (3)
Valorisation d'Options par Optimisation du Sharpe Ratio (2002) (3)
From STDP towards Biologically Plausible Deep Learning (2015) (3)
Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization (2007) (3)
On Random Weights for Texture Generation in One Layer Neural Networks (2016) (3)
FL Games: A federated learning framework for distribution shifts (2022) (3)
Blocks and Fuel (2015) (3)
Deep Learning. Das umfassende Handbuch (2018) (3)
Stochastic Learning of Strategic Equilibria for Auctions (1999) (3)
Joint Learning of Generative Translator and Classifier for Visually Similar Classes (2019) (3)
Large-Scale Algorithms (2006) (3)
Interventional Causal Representation Learning (2022) (3)
Better Training of GFlowNets with Local Credit and Incomplete Trajectories (2023) (3)
Training Bidirectional Helmholtz Machines (2015) (3)
Applying Knowledge Transfer for Water Body Segmentation in Peru (2019) (3)
GFlowOut: Dropout with Generative Flow Networks (2022) (3)
Continuous-Time Meta-Learning with Forward Mode Differentiation (2022) (3)
Unsupervised one-to-many image translation (2018) (3)
On Neural Architecture Inductive Biases for Relational Tasks (2022) (3)
Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition (2020) (3)
A Hybrid Pareto Model for Asymmetric Fat-Tail Data (2006) (3)
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization (2022) (3)
Régularisation du prix des options : Stacking (2002) (2)
Low-memory convolutional neural networks through incremental depth-first processing (2018) (2)
BabyAI 1.1 (2020) (2)
Latent Bottlenecked Attentive Neural Processes (2022) (2)
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning (2022) (2)
AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N (2022) (2)
Radial Basis Functions for Speech Recognition (1992) (2)
Extracting Hidden Sense Probabilities from Bitexts (2003) (2)
Task Loss Estimation for Structured Prediction (2016) (2)
Learning Simple Non Stationarities with Hyper Parameters (1999) (2)
Use of Neural Networks for the Recognition of Place (1988) (2)
The Effect of Diversity in Meta-Learning (2022) (2)
Marathi Handwritten Numeral Recognition using Fourier Descriptors and Normalized Chain Code (2017) (2)
Memory-Efficient Adaptive Huffman Coding (1998) (2)
Multi-Task Learning For Option Pricing (2002) (2)
Multi-Objective GFlowNets (2022) (2)
Document Analysis with Transducers (2015) (2)
The representational geometry of word meanings acquired by neural machine translation models (2017) (2)
Training opposing directed models using geometric mean matching (2015) (2)
Spatially Structured Recurrent Modules (2021) (2)
Distributed Representation Prediction for Generalization to New Words (2006) (2)
Sparse Attentive Backtracking: Towards Efficient Credit Assignment In Recurrent Networks (2017) (2)
Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models (2021) (2)
Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks (2020) (2)
Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization (2022) (2)
Agnostic Physics-Driven Deep Learning (2022) (2)
Workshop summary: Workshop on learning feature hierarchies (2009) (2)
Apprentissage machine efficace: theorie et pratique (2012) (2)
A Common GPU n-Dimensional Array for Python and C (2011) (2)
Generalizing to a zero-data task: a computational chemistry case study (2006) (2)
Gaussian Mixture Densities for Classification of Nuclear Power Plant Data (1998) (2)
Speech coding with multilayer networks (1989) (2)
SGD Smooths The Sharpest Directions (2018) (2)
Comparative Study of Learning Outcomes for Online Learning Platforms (2021) (2)
Lookback for Learning to Branch (2022) (2)
Estimators of Variance for K-Fold Cross-Validation (2003) (2)
Étude du biais dans le prix des options (2002) (1)
GraphMix: Improved Training of Graph Neural Networks for Semi-Supervised Learning (2020) (1)
Codon arrangement modulates MHC-I peptides presentation (2020) (1)
Towards the Latent Transcriptome (2018) (1)
Segmentation en thèmes de conversations téléphoniques : traitement en amont pour l’extraction d’information (2002) (1)
Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007 (2008) (1)
Markovian Models for Sequential (2004) (1)
Proactive Contact Tracing (2023) (1)
Gaussian Mixtures with Missing Data: an Efficient EM Training Algorithm (1994) (1)
MixupE: Understanding and Improving Mixup from Directional Derivative Perspective (2022) (1)
Posterior samples of source galaxies in strong gravitational lenses with score-based priors (2022) (1)
Automated curriculum generation for Policy Gradients from Demonstrations (2019) (1)
Convergence Properties of Deep Neural Networks on Separable Data (2018) (1)
Comment améliorer la capacité de généralisation des algorithmes d'apprentissage pour la prise de décisions financières (2003) (1)
Incorporating complex cells into neural networks for pattern classification (2011) (1)
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning (2022) (1)
Avoidance Learning Using Observational Reinforcement Learning (2019) (1)
Mode Regularized Generative Adversarial (2016) (1)
InfoBot: Structured Exploration in Reinforcement Learning Using Information Bottleneck (2019) (1)
Approche statistique pour le repérage de mots informatifs dans les textes oraux (2004) (1)
Pattern Recognition (1998) (1)
Regeneration Learning: A Learning Paradigm for Data Generation (2023) (1)
Sharp Minima Can Generalize For Deep Nets Supplementary Material (2017) (1)
Exploring the Wasserstein metric for time-to-event analysis (2021) (1)
Convergence Properties of the K-means Algorithms (1995) (1)
Deep Learning for NLP (without Magic) References (2012) (1)
Equivariance with Learned Canonicalization Functions (2022) (1)
Sources of Richness and Ineffability for Phenomenally Conscious States (2023) (1)
Introduction to NIPS 2017 Competition Track (2018) (1)
Extended Semantic Tagging for Entity Extraction (1)
Synergies Between Disentanglement and Sparsity: a Multi-Task Learning Perspective (2022) (1)
Distributional GFlowNets with Quantile Flows (2023) (1)
A Neural Network to Detect Homologies in Proteins (1989) (1)
Rethinking Learning Dynamics in RL using Adversarial Networks (2022) (1)
Robust and Controllable Object-Centric Learning through Energy-based Models (2022) (1)
Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning (2022) (1)
Metric-based model selection for time-series forecasting (2002) (1)
Bayesian Dynamic Causal Discovery (2022) (1)
Speech coding with multi-layer networks (1989) (1)
Predicting ice flow using machine learning (2019) (1)
On Out-of-Sample Statistics for Financial Time-Series (2002) (1)
Statistical Language and Speech Processing (2013) (1)
Deep Learning of Representations: Looking Forward (2013) (1)
The Challenge of Non-Linear Regression on Large Datasets with Asymmetric Heavy Tails (2002) (0)
Information Fusion in Deep Convolutional Neural Networks for Biomedical Image Segmentation (2018) (0)
Learning Generative Models with Locally Disentangled Latent Factors (2018) (0)
18 Large-Scale Algorithms (0)
Learning Latent Multiscale Structure Using Recurrent Neural Networks (2016) (0)
PAST DSAA KEYNOTE SPEAKERS (2020) (0)
How do We Train Deep Architectures? (2009) (0)
Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One (2022) (0)
Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models (2021) (0)
Repérage de mots informatifs dans les textes conversationnels (2004) (0)
»Deep Learning ist keine Religion« (2018) (0)
Graph Priors for Deep Neural Networks (2018) (0)
Automated Detection of Anatomical Landmarks During Colonoscopy Using a Deep Learning Model (2023) (0)
UOUS AND DISCRETE ADDRESSING SCHEMES (2016) (0)
IAPR keynote lecture IV: Deep learning (2015) (0)
On learning distributed representations of semantics (2011) (0)
Recurrent Neural Networks for Adaptive Temporal Processing (1993) (0)
FitNets: Hints for Thin Deep Nets (2015) (0)
On summarized validation curves and generalization (2019) (0)
Marathi Handwritten Numeral Recognition using Zernike Moments and Fourier Descriptors (2020) (0)
Optimization of Artificial Neural Network Hyperparameters For Processing Retrospective Information (2021) (0)
Forecasting Non-Stationary Volatility with Hyper-Parameters (2002) (0)
Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning (2020) (0)
Artificial Intelligence Cytometer in Blood (2019) (0)
CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation (2020) (0)
Deep Meditations: Controlled navigation of latent space (2018) (0)
Les données au service du savoir (2017) (0)
TRANSFER REINFORCEMENT LEARNING (2018) (0)
Learning powerful policies and better dynamics models by encouraging consistency (2018) (0)
SCANNING THE ISSUE Big Data: Theoretical Aspects (2015) (0)
Learning Neural Generative Dynamics for Molecular Conformation Generation (2021) (0)
The K Best-Paths Approach to Approximate Dynamic Programming with Application to Portfolio Optimization (2006) (0)
On the Optimization of a Synaptic Learning Rule (1997) (0)
Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks (2019) (0)
EnGAN: Latent Space MCMC and Maximum Entropy Generators for Energy-based Models (2018) (0)
Proposed Architectural and Representational Modifications (2021) (0)
Markovian Models for Sequential Data (1996) (0)
2 The Curse of Dimensionality for Classical Non-Parametric Models (0)
WARDS BETTER OPTIMIZATION (2019) (0)
Artificial Intelligence Pioneers But making those quantum leaps from science fiction to reality required hard work from computer scientists like (0)
Pylearn2: a machine learning research library (2014) (0)
Learning of Sophisticated Curriculums by viewing them as Graphs over Tasks (2018) (0)
SGD Smooths the Sharpest Directions (2018) (0)
Meta Attention Networks: Meta Learning Attention To Modulate Information Between Sparsely Interacting Recurrent Modules (2020) (0)
Generalization of a Parametric Learning Rule (1993) (0)
Depth with nonlinearity creates no bad local minima in ResNets (2019) (0)
Combating False Negatives in Adversarial Imitation Learning (Student Abstract) (2020) (0)
LATTER MINIMA WITH SGD (2018) (0)
Pen-based visitor registration system (PENGUIN) (1994) (0)
Part I Feature Extraction Fundamentals 11 Ensembles of Regularized Least Squares Classifiers for High-dimensional Problems 15 Tree-based Ensembles with Dynamic Soft Feature Selection 18 Bayesian Support Vector Machines for Feature Ranking and Selection 21 Feature Selection via Sensitivity Analysis w (0)
Pruning for efficient hardware implementations of deep neural networks (2020) (0)
Image-to-image Mapping with Many Domains by Sparse Attribute Transfer (2020) (0)
Proposed Algorithm : Algorithm (2007) (0)
Model Selection for Small Sample (2000) (0)
Generalization (2020) (0)
{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery (2019) (0)
Aprendizaje profundo. Tras años de decepciones, la inteligencia artificial está empezando a cumplir lo que prometía en sus comienzos gracias a esta potente técnica (2016) (0)
Extended Abstract Track Object-Centric Causal Representation Learning (2022) (0)
Generalization to a zero-data task: an empirical study (0)
Proceedings of the 22nd International Conference on Neural Information Processing Systems (2009) (0)
DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks (2023) (0)
Artificial Intelligence Based Cloud Distributor (AI-CD): Probing Low Cloud Distribution with Generative Adversarial Neural Networks (2019) (0)
Extending Metric-Based Model Selection and Regularization in the Absence of Unlabeled Data (0)
Conditioning and time representation in long short-term memory networks (2013) (0)
VIM: Variational Independent Modules for Video Prediction (2022) (0)
Université de Montréal Estimating the probability of a fleet vehicle accident: A deep learning approach using Conditional Variational Auto-Encoders (2020) (0)
SUPPLEMENTARY MATERIAL-LEARNING TO NAVIGATE THE SYNTHETICALLY ACCESSIBLE CHEMICAL SPACE USING REINFORCEMENT LEARNING (2020) (0)
PhAST: Physics-Aware, Scalable, and Task-specific GNNs for Accelerated Catalyst Design (2022) (0)
RNNLOGIC: LEARNING LOGIC RULES FOR REASON- (2020) (0)
On the Use of an Ear Model and Multi-Layered Networks for Automatic Speech Recognition (1990) (0)
Combining Parameter-efficient Modules for Task-level Generalisation (2023) (0)
Object-Centric Compositional Imagination for Visual Abstract Reasoning (2022) (0)
Neural Production Systems: Learning Rule-Governed Visual Dynamics (2021) (0)
Learning Classical Planning Transition Functions by Deep Neural Networks (2020) (0)
Stochastic Generative Flow Networks (2023) (0)
Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel (2022) (0)
Reusable Slotwise Mechanisms (2023) (0)
Consistent Training via Energy-Based GFlowNets for Modeling Discrete Joint Distributions (2022) (0)
Contrastive introspection (ConSpec) to rapidly identify invariant steps for success (2022) (0)
Leveraging the Third Dimension in Contrastive Learning (2023) (0)
MIREX TAGGING CONTEST: A DEEP NEURAL NET APPROACH (DRAFT) (2008) (0)
FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for Federated Learning on Non-IID Data (2022) (0)
Problèmes associés au déploiement des modèles fondés sur l’apprentissage machine en santé (2021) (0)
Learning semantic representations of objects and their parts (2013) (0)
Estimating Car Insurance Premia: a Case Study in High-Dimensional (2013) (0)
Inductive Biases for Relational Tasks (2022) (0)
Supplemental Material for: Deep Generative Stochastic Networks Trainable by Backprop (2014) (0)
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning (2022) (0)
A comparative study on hybrid acoustic phonetic decoders based on artificial neural networks (1991) (0)
Learning the Arrow of Time for Problems in Reinforcement Learning (2020) (0)
Université de Montréal Balancing Signals for Semi-Supervised Sequence Learning (2020) (0)
Former NASA chief unveils $100 million neural chip maker KnuEdge (2016) (0)
Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation (2020) (0)
FigureQA: An Annotated Figure Dataset for Visual Reasoning (2018) (0)
A General Purpose Neural Architecture for Geospatial Systems (2022) (0)
GFlowNet-EM for learning compositional latent variable models (2023) (0)
BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers (2018) (0)
Reassuring and Troubling Views on Graph-Based Semi-Supervised Learning (2005) (0)
The AI Driving Olympics at NIPS 2018 (0)
Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection (2022) (0)
Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs (2020) (0)
Towards more hardware-friendly deep learning (2017) (0)
An Energy-Based Recurrent Neural Network for Multiple Fundamental Frequency Estimation (2011) (0)
Hyena Hierarchy: Towards Larger Convolutional Language Models (2023) (0)
GFlowNets for AI-Driven Scientific Discovery (2023) (0)
Exploring the Wasserstein metric for survival analysis (2021) (0)
A Neuronal Least-Action Principle for Real-Time Learning in Cortical Circuits (2023) (0)
Machine Learning (2021) (0)
Episodes Meta Sequence S 2 Fast Update Slow Update Fast Update Slow Update (2021) (0)
Estimation de densité conditionnelle lorsque l'hypothèse de normalité est insatisfaisante (2004) (0)
Scalable Neural Network Algorithms for High Dimensional Data (2023) (0)
Bayesian Structure Learning with Generative Flow Networks (Supplementary material) (2022) (0)
Collaborative filtering techniques for drug discovery (2016) (0)
Neural Attentive Circuits (2022) (0)
Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning (2022) (0)
(Private)-Retroactive Carbon Pricing [(P)ReCaP]: A Market-based Approach for Climate Finance and Risk Assessment (2022) (0)
Contrastive introspection to identify critical steps in reinforcement learning (2022) (0)
Small-GAN: Speeding Up GAN Training Using Core-sets (2019) (0)
Evaluating Generalization in GFlowNets for Molecule Design (2022) (0)
Boosting Exploration in Multi-Task Reinforcement Learning using Adversarial Networks (2022) (0)
Proceedings of the 21st International Conference on Neural Information Processing Systems (2008) (0)
Stochastic Gradient Descent on a Portfolio Management Training Criterion Using the IPA Gradient Estimator (2003) (0)
EVALUATING LONG-TERM DEPENDENCY BENCHMARK PROBLEMS BY RANDOM GUESSING (2001) (0)
SPECTRA: Sparse Entity-centric Transitions (2019) (0)
Stacked calibration of off-policy policy evaluation for video game matchmaking (2013) (0)
