Jonathan Robert Glass

Jonathan Robert Glass's AcademicInfluence.com Rankings

Engineering: World Rank #3167, Historical Rank #4199

Electrical Engineering: World Rank #659, Historical Rank #721

Engineering Degrees

Computer Science: World Rank #4150, Historical Rank #4366

Computational Linguistics: World Rank #346, Historical Rank #351

Database: World Rank #1384, Historical Rank #1456

Computer Science Degrees

Why Is Jonathan Robert Glass Influential?

Jonathan Robert Glass's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Speech database development at MIT: TIMIT and beyond (1990) (569)
JUPITER: a telephone-based conversational interface for weather information (2000) (562)
Analysis Methods in Neural Language Processing: A Survey (2018) (379)
Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams (2009) (365)
A probabilistic framework for segment-based speech recognition (2003) (335)
Unsupervised Pattern Discovery in Speech (2008) (332)
What do Neural Machine Translation Models Learn about Morphology? (2017) (331)
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data (2017) (297)
An Unsupervised Autoregressive Model for Speech Representation Learning (2019) (287)
Robust Speaker Recognition in Noisy Conditions (2007) (283)
Highway long short-term memory RNNs for distant speech recognition (2015) (279)
A Nonparametric Bayesian Approach to Acoustic Model Discovery (2012) (224)
Unsupervised Learning of Spoken Language with Visual Context (2016) (219)
AST: Audio Spectrogram Transformer (2021) (212)
Cosine Similarity Scoring without Score Normalization Techniques (2010) (209)
Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition (2014) (207)
Modelling out-of-vocabulary words for robust speech recognition (2002) (204)
Developments and directions in speech recognition and understanding, Part 1 [DSP Education] (2009) (192)
SemEval-2016 Task 3: Community Question Answering (2019) (191)
Recent progress in the MIT spoken lecture processing project (2007) (185)
A probabilistic framework for feature-based speech recognition (1996) (174)
Predicting Factuality of Reporting and Bias of News Media Sources (2018) (170)
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input (2018) (163)
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech (2018) (162)
Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach (2013) (155)
SemEval-2015 Task 3: Answer Selection in Community Question Answering (2015) (151)
Research Developments and Directions in Speech Recognition and Understanding, Part 1 (2009) (146)
Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks (2017) (138)
Learning Latent Representations for Speech Generation and Transformation (2017) (138)
Detecting Depression with Audio/Text Sequence Modeling of Interviews (2018) (136)
Heterogeneous measurements and multiple classifiers for speech recognition (1998) (132)
Modeling out-of-vocabulary words for robust speech recognition (2000) (130)
Deep multimodal semantic embeddings for speech and images (2015) (127)
Generative Pre-Training for Speech with Autoregressive Predictive Coding (2019) (125)
Identifying and Controlling Important Neurons in Neural Machine Translation (2018) (122)
Multilingual spoken-language understanding in the MIT Voyager system (1995) (121)
Analysis and Processing of Lecture Audio Data: Preliminary Investigations (2004) (120)
A segment-based audio-visual speech recognizer: data collection, development, and initial experiments (2004) (120)
What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models (2018) (119)
Exploiting Intra-Conversation Variability for Speaker Diarization (2011) (115)
Integrating Stance Detection and Fact Checking in a Unified Corpus (2018) (114)
Automatic Dialect Detection in Arabic Broadcast Speech (2015) (113)
Automatic Stance Detection Using End-to-End Memory Networks (2018) (108)
Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign (2018) (108)
GALAXY: a human-language interface to on-line travel information (1994) (107)
Heterogeneous Acoustic Measurements for Phonetic Classification (1997) (106)
The MIT SUMMIT Speech Recognition System: A Progress Report (1989) (102)
Spoken Content Retrieval—Beyond Cascading Speech Recognition with Text Retrieval (2015) (102)
Learning Word-Like Units from Joint Audio-Visual Analysis (2017) (97)
Acoustic segmentation and phonetic classification in the SUMMIT system (1988) (92)
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation (2017) (90)
Visual speech recognition with loosely synchronized feature streams (2005) (89)
Heterogeneous acoustic measurements for phonetic classification (1997) (89)
Automatic processing of audio lectures for information retrieval: vocabulary selection and language modeling (2005) (89)
A complete KALDI recipe for building Arabic speech recognition systems (2014) (88)
Iterative language model estimation: efficient data structure & algorithms (2008) (88)
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos (2020) (88)
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization (2019) (87)
Multi-level acoustic segmentation of continuous speech (1988) (86)
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces (2018) (86)
Towards multi-speaker unsupervised speech pattern discovery (2010) (86)
Natural-sounding speech synthesis using variable-length units (1998) (86)
Open-Vocabulary Spoken Utterance Retrieval using Confusion Networks (2007) (85)
Unsupervised Lexicon Discovery from Acoustic Input (2015) (85)
Style & Topic Language Model Adaptation Using HMM-LDA (2006) (83)
Segmentation for English-to-Arabic Statistical Machine Translation (2008) (81)
Towards unsupervised pattern discovery in speech (2005) (81)
Real-time telephone-based speech recognition in the Jupiter domain (1999) (79)
Hidden feature models for speech recognition using dynamic Bayesian networks (2003) (78)
Lexical modeling of non-native speech for automatic speech recognition (2000) (77)
The SUMMIT speech recognition system: phonological modelling and lexical access (1990) (77)
The MGB-2 challenge: Arabic multi-dialect broadcast media recognition (2016) (75)
Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model (2018) (74)
Speechbuilder: facilitating spoken dialogue system development (2001) (74)
Statistical trajectory models for phonetic recognition (1994) (74)
Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification (2010) (73)
Resource configurable spoken query detection using Deep Boltzmann Machines (2012) (73)
Supervised and Unsupervised Transfer Learning for Question Answering (2017) (73)
Finding acoustic regularities in speech: applications to phonetic recognition (1988) (71)
SSAST: Self-Supervised Audio Spectrogram Transformer (2021) (70)
Data collection and performance evaluation of spoken dialogue systems: the MIT experience (2000) (70)
Fact Checking in Community Forums (2018) (70)
Vector-Quantized Autoregressive Predictive Coding (2020) (70)
Updated MINDS report on speech recognition and understanding, Part 2 [DSP Education] (2009) (69)
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech (2019) (66)
A voice-commandable robotic forklift working alongside humans in minimally-prepared outdoor environments (2010) (66)
14.4 A scalable speech recognizer with deep-neural-network acoustic models and voice-activated power gating (2017) (66)
PEGASUS: A spoken dialogue interface for on-line air travel planning (1994) (66)
Collecting Voices from the Cloud (2010) (65)
Integration of speech recognition and natural language processing in the MIT VOYAGER system (1991) (64)
Asgard: A portable architecture for multilingual dialogue systems (2013) (64)
Extracting deep neural network bottleneck features using low-rank matrix factorization (2014) (62)
A Low-Power Speech Recognizer and Voice Activity Detector Using Deep Neural Networks (2018) (62)
A 1020-Node Modular Microphone Array and Beamformer for Intelligent Computing Spaces (2004) (62)
Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems (2017) (62)
Query understanding enhanced by hierarchical parsing structures (2013) (62)
Arabic Diacritization with Recurrent Neural Networks (2015) (62)
VoiceID Loss: Speech Enhancement for Speaker Verification (2019) (58)
On the Use of Spectral and Iterative Methods for Speaker Diarization (2012) (57)
Towards unsupervised speech processing (2012) (56)
Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input (2007) (55)
Convolutional Neural Networks and Language Embeddings for End-to-End Dialect Recognition (2018) (55)
From interface to content: translingual access and delivery of on-line information (1997) (54)
Automatic Fact-Checking Using Context and Discourse Information (2019) (53)
An inner-product lower-bound estimate for dynamic time warping (2011) (53)
Unsupervised Word Acquisition from Speech using Pattern Discovery (2006) (52)
Learning Lexicons From Speech Using a Pronunciation Mixture Model (2013) (52)
Fast spoken query detection using lower-bound Dynamic Time Warping on Graphical Processing Units (2012) (51)
PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation (2021) (51)
Multimodal interaction with an autonomous forklift (2010) (51)
Developments and Directions in Speech Recognition and Understanding, Part 1 (49)
Detection of nasalized vowels in American English (1985) (49)
Feature-based pronunciation modeling with trainable asynchrony probabilities (2004) (49)
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media (2019) (49)
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding (2020) (48)
A comparison of novel techniques for instantaneous speaker adaptation (1997) (48)
Multilingual language generation across multiple domains (1994) (48)
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings (2022) (48)
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies (2020) (47)
Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech (2018) (47)
Articulatory features for robust visual speech recognition (2004) (47)
A Transcription Task for Crowdsourcing with Automatic Quality Control (2011) (46)
One-shot learning of generative speech concepts (2014) (46)
Mispronunciation detection via dynamic time warping on deep belief network-based posteriorgrams (2013) (45)
Segmentation and modeling in segment-based recognition (1997) (45)
Tangled and drowned: a global review of penguin bycatch in fisheries (2017) (44)
A comparison-based approach to mispronunciation detection (2012) (44)
Confidence scoring for speech understanding systems (1998) (44)
VectorSLU: A Continuous Word Vector Approach to Answer Selection in Community Question Answering Systems (2015) (44)
A channel-blind system for speaker verification (2011) (43)
A Deep Residual Network for Large-Scale Acoustic Scene Analysis (2019) (43)
A 6 mW, 5,000-Word Real-Time Speech Recognizer Using WFST Models (2015) (43)
Feature-based Pronunciation Modeling for Speech Recognition (2004) (42)
A multi-class approach for modelling out-of-vocabulary words (2002) (42)
Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition (2014) (42)
A Conversational Movie Search System Based on Conditional Random Fields (2012) (42)
WHEELS: a conversational system in the automobile classifieds domain (1996) (42)
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos (2021) (42)
Joint Learning of Phonetic Units and Word Pronunciations for ASR (2013) (41)
Evaluation methodology for a telephone-based conversational system (1998) (41)
A Framework for Developing Conversational User Interfaces (2004) (41)
Telephone-based conversational speech recognition in the JUPITER domain (1998) (41)
Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation (2009) (40)
NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks (2018) (40)
We Can Detect Your Bias: Predicting the Political Ideology of News Articles (2020) (40)
On the Linguistic Representational Power of Neural Machine Translation Models (2019) (40)
The VOYAGER speech understanding system: preliminary development and evaluation (1990) (40)
A Vector Space Approach for Aspect Based Sentiment Analysis (2015) (39)
Negative Training for Neural Dialogue Response Generation (2019) (39)
Similarity Analysis of Contextual Word Representation Models (2020) (39)
Language ID-based training of multilingual stacked bottleneck features (2014) (38)
Learning units for domain-independent out-of-vocabulary word modelling (2001) (38)
Noise Robust Phonetic Classification with Linear Regularized Least Squares and Second-Order Features (2007) (38)
FAKTA: An Automatic End-to-End Fact Checking System (2019) (38)
An Implementation of Rational Wavelets and Filter Design for Phonetic Classification (2007) (38)
Hierarchical large-margin Gaussian mixture models for phonetic classification (2007) (37)
Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition (2018) (37)
A Situationally Aware Voice‐commandable Robotic Forklift Working Alongside People in Unstructured Outdoor Environments (2015) (36)
Speech rhythm guided syllable nuclei detection (2009) (36)
A comparative study of signal representations and classification techniques for speech recognition (1993) (36)
Contrastive Language Adaptation for Cross-Lingual Stance Detection (2019) (35)
Neural Attention for Learning to Rank Questions in Community Question Answering (2016) (35)
Language processing and learning models for community question answering in Arabic (2017) (35)
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering (2016) (35)
Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data (2018) (35)
Wait-Learning: Leveraging Wait Time for Second Language Education (2015) (34)
Adversarial Domain Adaptation for Stance Detection (2019) (34)
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining (2020) (34)
Exploiting Convolutional Neural Networks for Phonotactic Based Dialect Identification (2018) (33)
PEGASUS: A Spoken Language Interface for On-Line Air Travel Planning (1994) (33)
Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation (2006) (33)
A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis (2000) (33)
On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference (2018) (33)
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition (2016) (33)
Towards Visually Grounded Sub-word Speech Unit Discovery (2019) (32)
A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects (2016) (32)
A Piecewise Aggregate Approximation Lower-Bound Estimate for Posteriorgram-Based Dynamic Time Warping (2011) (32)
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context (2020) (32)
LOUD: A 1020 node Microphone Array and Acoustic Beamformer (2007) (31)
Spoken Language Understanding for a Nutrition Dialogue System (2017) (31)
A prioritized grid long short-term memory RNN for speech recognition (2016) (31)
Towards Unsupervised Speech-to-text Translation (2018) (30)
The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech (2019) (30)
A* word network search for continuous speech recognition (1993) (30)
Multistream Articulatory Feature-Based Models for Visual Speech Recognition (2009) (30)
Spoken language biomarkers for detecting cognitive impairment (2017) (29)
Quantifying Exposure Bias for Neural Language Generation (2019) (29)
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units (2020) (29)
MIT Computer Science and Artificial Intelligence Laboratory (2015) (28)
Multimodal speech recognition with ultrasonic sensors (2007) (28)
Look, listen, and decode: Multimodal speech recognition with images (2016) (28)
Pronunciation assessment via a comparison-based system (2013) (27)
Real-time probabilistic segmentation for segment-based speech recognition (1998) (27)
Zero resource spoken audio corpus analysis (2013) (27)
Robust detection of sonorant landmarks (2005) (27)
YINHE: a Mandarin Chinese version of the GALAXY system (1997) (27)
Speaker adaptation using the i-vector technique for bottleneck features (2015) (26)
Mispronunciation detection without nonnative training data (2015) (26)
Production domain modeling of pronunciation for visual speech recognition (2005) (26)
Robust Voice Activity Detector for Real World Applications Using Harmonicity and Modulation Frequency (2011) (26)
City browser: developing a conversational automotive HMI (2009) (26)
Noise-tolerant Audio-visual Online Person Verification Using an Attention-based Neural Network Fusion (2018) (26)
Empirical acquisition of word and phrase classes in the ATIS domain (1993) (26)
The MIT Spoken Lecture Processing Project (2005) (25)
Everything at Once – Multi-modal Fusion Transformer for Video Retrieval (2021) (25)
Automating Crowd-supervised Learning for Spoken Language Systems (2012) (24)
A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition (2018) (24)
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning (2020) (24)
Sound Event Localization and Detection Using CRNN on Pairs of Microphones (2019) (24)
Learning modality-invariant representations for speech and images (2017) (24)
Classifying Alzheimer's Disease Using Audio and Text-Based Representations of Speech (2021) (24)
The Collection and Preliminary Analysis of a Spontaneous Speech Database (1989) (23)
Analysis of Audio-Visual Features for Unsupervised Speech Recognition (2017) (23)
Semantic mapping of natural language input to database entries via convolutional neural networks (2017) (23)
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition (2018) (22)
Recent Progress on the VOYAGER System (1990) (22)
Exploiting Context Information in Spoken Dialogue Interaction with Mobile Devices (22)
Mokusei: a telephone-based Japanese conversational system in the weather domain (2001) (22)
Scalable Factorized Hierarchical Variational Autoencoder Training (2018) (21)
A Bilingual VOYAGER System (1993) (21)
Flexible and Personalizable Mixed-Initiative Dialogue Systems (2003) (21)
Transfer Learning from Audio-Visual Grounding to Speech Recognition (2019) (21)
Subword Regularization and Beam Search Decoding for End-to-end Automatic Speech Recognition (2019) (21)
Detecting egregious responses in neural sequence-to-sequence models (2018) (21)
Data collection and language understanding of food descriptions (2014) (21)
The VOYAGER Speech Understanding System: A Progress Report (1989) (20)
Combining End-to-End and Adversarial Training for Low-Resource Speech Recognition (2018) (20)
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages (2014) (20)
Fundamental frequency modeling for corpus-based speech synthesis based on a statistical learning technique (2003) (20)
Learning Word Embeddings from Speech (2017) (20)
Similarity Analysis of Self-Supervised Speech Representations (2020) (20)
Grounding Spoken Words in Unlabeled Video (2019) (20)
ADI17: A Fine-Grained Arabic Dialect Identification Dataset (2020) (20)
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition (2021) (20)
Multilingual human-computer interactions: from information access to language learning (1996) (19)
Historical Development and Future Directions in Speech Recognition and Understanding (2007) (19)
Recent advances in ASR applied to an Arabic transcription system for Al-Jazeera (2014) (19)
Learning Semantic Relatedness in Community Question Answering Using Neural Models (2016) (19)
Development and Preliminary Evaluation of the MIT ATIS System (1991) (19)
Heterogeneous lexical units for automatic speech recognition: preliminary investigations (2000) (19)
A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation (2020) (19)
Trilingual Semantic Embeddings of Visually Grounded Speech with Self-Attention Mechanisms (2020) (19)
MCE 2018: The 1st Multi-target Speaker Detection and Identification Challenge Evaluation (MCE) Plan, Dataset and Baseline System (2018) (19)
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models (2019) (19)
Vowel classification based on analysis-by-synthesis (1992) (19)
Morphing spectral envelopes using audio flow (2005) (18)
Pronunciation Learning from Continuous Speech (2011) (18)
Automatic speech recognition of Arabic multi-genre broadcast media (2017) (18)
The MIT ATIS System: February 1992 Progress Report (1992) (18)
Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition (2019) (18)
MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge (2017) (18)
SVD-PHAT: A Fast Sound Source Localization Method (2018) (18)
Information-theoretic criteria for unit selection synthesis (2002) (17)
A wavelet and filter bank framework for phonetic classification (2005) (17)
Distributional semantics for understanding spoken meal descriptions (2016) (17)
Multilingual data selection for training stacked bottleneck features (2016) (17)
Cross-Modal Discrete Representation Learning (2021) (16)
Context-dependent pronunciation error pattern discovery with limited annotations (2014) (16)
Recent Progress on the SUMMIT System (1990) (16)
Wait-learning: leveraging conversational dead time for second language education (2014) (16)
Personalized mispronunciation detection and diagnosis based on unsupervised error pattern discovery (2016) (16)
Empirical acquisition of language models for speech recognition (1994) (16)
Baum-Welch training for segment-based speech recognition (2003) (16)
PSLA: Improving Audio Event Classification with Pretraining, Sampling, Labeling, and Aggregation (2021) (16)
Growing a Spoken Language Interface on Amazon Mechanical Turk (2011) (16)
Discriminative training of hierarchical acoustic models for large vocabulary continuous speech recognition (2009) (15)
On the phonetic information in ultrasonic microphone signals (2009) (15)
Modelling spectral dynamics for vowel classification (1993) (15)
Tanbih: Get To Know What You Are Reading (2019) (15)
Detection and recognition of nasal consonants in American English (1986) (15)
QMDIS: QCRI-MIT Advanced Dialect Identification System (2017) (15)
Neural Multi-Task Learning for Stance Prediction (2019) (15)
Telephone data collection using the World Wide Web (1996) (15)
On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition (2018) (15)
Development of the MIT ASR system for the 2016 Arabic Multi-genre Broadcast Challenge (2016) (15)
Segment-based recognition on the phonebook task: initial results and observations on duration modeling (2001) (15)
Spoken command of large mobile robots in outdoor environments (2010) (15)
Team QCRI-MIT at SemEval-2019 Task 4: Propaganda Analysis Meets Hyperpartisan News Detection (2019) (14)
A Multimodal Home Entertainment Interface via a Mobile Device (2008) (14)
Recent progress on the MIT VOYAGER spoken language system (1990) (14)
Speech recognition with localized time-frequency pattern detectors (2007) (14)
On using heterogeneous data for vehicle-based speech recognition: A DNN-based approach (2015) (14)
Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio (2019) (14)
Recurrent Neural Network Encoder with Attention for Community Question Answering (2016) (13)
A Factorial Deep Markov Model for Unsupervised Disentangled Representation Learning from Speech (2019) (13)
Automatic learning of lexical representations for sub-word unit based speech recognition systems (1991) (13)
N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation (2008) (12)
Large-Scale Machine Translation between Arabic and Hebrew: Available Corpora and Initial Results (2016) (12)
Learning new word pronunciations from spoken examples (2010) (12)
Interpretable Propaganda Detection in News Articles (2021) (12)
Detection and classification of phonemes using context-independent error back-propagation (1990) (12)
Unsupervised Representation Learning of Speech for Dialect Identification (2018) (11)
A Novel DTW-Based Distance Measure for Speaker Segmentation (2006) (11)
On the Linguistic Representational Power of Neural Machine Translation Models (2020) (11)
New word acquisition using subword modeling (2007) (11)
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification (2017) (11)
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning (2020) (11)
Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions (2017) (11)
Identification of digital voice biomarkers for cognitive health (2020) (11)
Preliminary Evaluation of the VOYAGER Spoken Language System (1989) (11)
An Efferent-Inspired Auditory Model Front-End for Speech Recognition (2011) (11)
Deep Learning for Database Mapping and Asking Clarification Questions in Dialogue Systems (2019) (10)
A back-off discriminative acoustic model for automatic speech recognition (2009) (10)
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification (2022) (10)
Modelling Context Dependency in Acoustic-Phonetic and Lexical Representations (1991) (10)
Flexible Multi-Stream Framework for Speech Recognition using Multi-Tape Finite-State Transducers (2006) (10)
Speech recognition without a lexicon - bridging the gap between graphemic and phonetic systems (2014) (10)
Sentence Detection Using Multiple Annotations (2012) (10)
Prediction-adaptation-correction recurrent neural networks for low-resource language speech recognition (2015) (10)
Speaker Verification Over Handheld Devices with Realistic Noisy Speech Data (2006) (10)
A Comparison of Deep Learning Methods for Language Understanding (2019) (10)
Probabilistic Dialogue Modeling for Speech-Enabled Assistive Technology (2013) (10)
A Comparative Study of Methods for Handheld Speaker Verification in Realistic Noisy Conditions (2006) (10)
Multiple Sound Source Localization with SVD-PHAT (2019) (9)
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models (2019) (9)
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment (2022) (9)
Bayesian distance metric learning on i-vector for speaker verification (2013) (9)
Learning Words by Drawing Images (2019) (9)
Prototypical Q Networks for Automatic Conversational Diagnosis and Few-Shot New Disease Adaption (2020) (8)
Spoken language systems for human/machine interfaces (1991) (8)
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-Level Cross-Lingual Speech Representation (2022) (8)
Convolutional Neural Networks and Multitask Strategies for Semantic Mapping of Natural Language Input to a Structured Database (2018) (8)
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future (2019) (8)
Bidirectional Backpropagation: Towards Biologically Plausible Error Signal Transmission in Neural Networks (2017) (8)
Limited labels for unlimited data: active learning for speaker recognition (2014) (8)
On the Use of Acoustic Unit Discovery for Language Recognition (2016) (8)
Language model parameter estimation using user transcriptions (2009) (7)
Learning a Subword Inventory Jointly with End-to-End Automatic Speech Recognition (2020) (7)
Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders (2016) (7)
Magic Dust for Cross-Lingual Adaptation of Monolingual Wav2vec-2.0 (2021) (7)
DARTS: Dialectal Arabic Transcription System (2019) (7)
27.2 A 6mW 5K-Word real-time speech recognizer using WFST models (2014) (7)
An Environmental Feature Representation for Robust Speech Recognition and for Environment Identification (2017) (7)
Preliminary ATIS Development at MIT (1990) (7)
Convolutional Neural Networks for Dialogue State Tracking without Pre-Trained Word Vectors or Semantic Dictionaries (2018) (7)
Lexical modeling for Arabic ASR: a systematic approach (2014) (6)
Pair Expansion for Learning Multilingual Semantic Embeddings Using Disjoint Visually-Grounded Speech Audio Datasets (2020) (6)
Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? (2019) (6)
Learning Word Representations with Cross-Sentence Dependency for End-to-End Co-reference Resolution (2018) (6)
A collective data generation method for speech language models (2010) (6)
Contrastive Audio-Visual Masked Autoencoder (2022) (6)
Mix-review: Alleviate Forgetting in the Pretrain-Finetune Framework for Neural Language Generation Models (2019) (6)
Dialogue State Tracking with Convolutional Semantic Taggers (2019) (6)
Mitigating Biases in Toxic Language Detection through Invariant Rationalization (2021) (6)
Simple and Effective Unsupervised Speech Synthesis (2022) (6)
Multimodal Association for Speaker Verification (2020) (5)
Fast and Robust 3-D Sound Source Localization with DSVD-PHAT (2019) (5)
What Does an End-to-End Dialect Identification Model Learn About Non-Dialectal Information? (2020) (5)
A Study of using Syntactic and Semantic Structures for Concept Segmentation and Labeling (2014) (5)
Cascaded Multilingual Audio-Visual Learning from Videos (2021) (5)
Automatic lexical pronunciations generation and update (2007) (5)
Using linguistic knowledge in statistical machine translation (2010) (5)
A Food Logging System for iOS with Natural Spoken Language Meal Descriptions (P21-009-19) (2019) (5)
A turbo-style algorithm for lexical baseforms estimation (2008) (5)
The MIT ATIS system; preliminary development, spontaneous speech data collection, and performance evaluation (1991) (5)
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition (2022) (5)
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis (2021) (5)
Segment-based apparatus and method for speech recognition by analyzing multiple speech unit frames and modeling both temporal and spatial correlation (1997) (5)
An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models (2021) (5)
Audio-Visual Calibration with Polynomial Regression for 2-D Projection Using SVD-PHAT (2020) (5)
Porting the bilingual VOYAGER system to Italian (1994) (5)
A Study of the Complexity and Accuracy of Direction of Arrival Estimation Methods Based on GCC-PHAT for a Pair of Close Microphones (2018) (4)
Collection and Analyses of WSJ-CSR Data at MIT (1992) (4)
Cooperative Learning of Zero-Shot Machine Reading Comprehension (2021) (4)
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation (2022) (4)
Statistical trajectory models for phonetic classification (1994) (4)
Collection and analyses of WSJ-CSR corpus at MIT (1992) (4)
Domain Attentive Fusion for End-to-end Dialect Identification with Unknown Target Domain (2018) (4)
Constructing a Knowledge Graph from Unstructured Documents without External Alignment (2020) (4)
Integrating Video Retrieval and Moment Detection in a Unified Corpus for Video Question Answering (2019) (4)
Knowledge Grounded Conversational Symptom Detection with Graph Memory Networks (2020) (4)
An Evaluation of Age, Gender, and Technology Experience in User Performance and Impressions of a Multimodal Human-Machine Interface (2011) (4)
Phonetic transition modeling for continuous speech recognition (1994) (4)
CLAC: A Speech Corpus of Healthy English Speakers (2021) (3)
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset (2021) (3)
on Speech Recognition and Understanding, Part 2 (2009) (3)
Extending the galaxy communicator architecture for multimodal interaction research (2002) (3)
Role-specific Language Models for Processing Recorded Neuropsychological Exams (2018) (3)
Explicit Alignment of Text and Speech Encodings for Attention-Based End-to-End Speech Recognition (2019) (3)
Multi-level context-dependent acoustic modeling for automatic speech recognition (2011) (3)
MOKUSEI: A Japanese Spoken Dialogue System in the Weather Domain (Special Feature Paper: NTT-MIT Joint Research) (2000) (3)
Guest editorial introduction to the special issue on language modeling and dialogue systems (2000) (3)
Routing with Self-Attention for Multimodal Capsule Networks (2021) (2)
NOVEL DIGITAL VOICE BIOMARKERS OF DEMENTIA FROM THE FRAMINGHAM STUDY (2018) (2)
AutoKG: Constructing Virtual Knowledge Graphs from Unstructured Documents for Question Answering (2021) (2)
Robust Speaker Recognition in Unknown Noisy Conditions (2005) (2)
Energy-Efficient Speaker Identification with Low-Precision Networks (2018) (2)
NUTRITION SYSTEM DEMONSTRATION (2014) (2)
From Speech Recognition to Spoken Language Understanding (1990) (2)
Language Modeling with Graph Temporal Convolutional Networks (2018) (2)
Phonetic Classification and Recognition Using the Multi-Layer Perceptron (1990) (2)
Crossmodal Search using Visually Grounded Multilingual Speech Signal (2019) (1)
Unsupervised Methods for Evaluating Speech Representations (2020) (1)
From Speech Recognition to Spoken Language Understanding: The Development of the MIT SUMMIT and VOYAGER Systems (1990) (1)
Multistream Articulatory Feature-Based Models for Visual (2009) (1)
Detecting Dementia from Long Neuropsychological Interviews (2022) (1)
MUSE: a scripting language for the development of interactive speech analysis and recognition tools (1997) (1)
Testing the Validity of a Natural Spoken Language Application for the Self-Monitoring of Daily Dietary Intake (P13-035-19). (2019) (1)
VOWEL CLASSIFICATION BASED ON ANALYSIS-BY-SYNTHESIS (1992) (1)
A systematic review of the characteristics of adolescents with major depressive disorder in randomised controlled treatment trials (2021) (1)
A FRAMEWORK FOR DEVELOPING CONVERSATIONAL USER INTERFACES (2005) (1)
Acoustic characteristics of nasal consonants in American English (1984) (1)
A Noise-Robust Self-Adaptive Multitarget Speaker Detection System (2018) (1)
Introduction to the Issue on Speech Processing for Natural Interaction With Intelligent Environments (2010) (1)
Handling uncertain observations in unsupervised topic-mixture language model adaptation (2012) (1)
Spoken Correction for Chinese Text Entry (2006) (1)
A Conversational System in the Automobile Classifieds Domain (1996) (1)
AUTOMATED VISCOMETER SAVES LABORATORY MANPOWER (1976) (0)
CONVOLUTIONAL NEURAL NETWORKS AND MULTITASK STRATEGIES FOR SEMANTIC MAPPING OF NATURAL LANGUAGE INPUT TO A STRUCTURED DATABASE - 0006174 (2018) (0)
Building Japanese Conversational Systems based on the Galaxy Architecture (2004) (0)
Shape Recognizer Gesture Recognizer Robot Canvas Information Shape Gesture Stroke (2010) (0)
The MGB-2 Challenge: Arabic Multi-Device Broadcast Media Recognition (2016) (0)
Defensive biting by Tetragonisca angustula is dangerous but not suicidal (2020) (0)
Repetition Assessment for Speech and Language Disorders: A Study of the Logopenic Variant of Primary Progressive Aphasia (2022) (0)
Pulse synchronous analysis of speech using an auditory representation (1987) (0)
Comparison of Energy Intake Determined by a Natural Spoken Language Application with 24-h Recall (2020) (0)
A Study of Speech Interfaces for the Vehicle Environment (2013) (0)
Unrecognised performances (2021) (0)
Short Paper: Use of Natural Spoken Language with Automated Mapping of Self-Reported Food Intake to Food Composition Data for Low-Burden Real-Time Dietary Assessment (Preprint) (2021) (0)
PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation (2022) (0)
Spoken Moments: A Large Scale Dataset of Audio Descriptions of Dynamic Events in Video (2020) (0)
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts (2006) (0)
Spoken Dialog Planning to Reduce User Distraction in Mobile Environments (2015) (0)
The MGB-2 Challenge: Arabic Multi-Device Broadcast Media (2018) (0)
Talking To Your Database: Interactive Spoken Language Interfaces (1991) (0)
Evaluation of a Natural Speech Based Informational Inquiry System as a Potential Means to Increase Transit Utilization (2013) (0)
Method for extracting metals from ores (1980) (0)
Crowd-supervised Learning for Spoken Language Systems (2012) (0)
On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration (2022) (0)
Modelling Graph-Based Observation Spaces for Segment-Based Speech Recognition (2004) (0)
Reincarnation; Or, How Bertolt Brecht Recreated Frank Wedekind (2015) (0)
Winter resolve (2022) (0)
Mixtures, methods and compositions relating to conductive materials (2013) (0)
Evaluation of multi-level context-dependent acoustic model for large vocabulary speaker adaptation tasks (2012) (0)
SEGMENTATION AND MODELING IN SEGMENT-BASED RECOGNITION (2021) (0)
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval (2022) (0)
DATA COLLECTION AND PERFORMANCE EVALUATION OF SPOKEN DIALOGUE SYSTEMS: THE MIT EXPERIENCE (2021) (0)
Interpretable Unified Language Checking (2023) (0)
Method and device for inserting sheet piles into very resistant soils (2005) (0)
On The Inductive Bias of Words in Acoustics-to-Word Models (2018) (0)
Controlling the Focus of Pretrained Language Generation Models (2022) (0)
Speak: A Toolkit Using Amazon Mechanical Turk to Collect and Validate Speech Audio Recordings (2022) (0)
Joint Retrieval-Extraction Training for Evidence-Aware Dialog Response Selection (2021) (0)
Two New Corpora for Audio-Visual Speech Processing (2004) (0)
LANGUAGE MODEL PARAMETER ESTIMATION USING USER (2009) (0)
JUPITER Data Collection and Analysis (1999) (0)
Association Between Acoustic Features and Neuropsychological Test Performance in the Framingham Heart Study: Observational Study (2022) (0)
Automatic Dialect Detection in Arabic Broadcast Speech (2018) (0)
Cooperative Self-training of Machine Reading Comprehension (2021) (0)
Use of Natural Spoken Language With Automated Mapping of Self-reported Food Intake to Food Composition Data for Low-Burden Real-time Dietary Assessment: Method Comparison Study. (2021) (0)
Growing ObjectNet: Adding speech, VQA, occlusion, and measuring dataset difficulty (2022) (0)
Wait-Learning: Leveraging Wait Time for Second Language Education (2022) (0)

What Schools Are Affiliated With Jonathan Robert Glass?

Jonathan Robert Glass is affiliated with the following schools:

Massachusetts Institute of Technology
