Satoshi Nakamura

Q: What Schools Are Affiliated With Satoshi Nakamura?

Satoshi Nakamura is affiliated with the following schools: Nara Institute of Science and Technology, Meiji University, Toyohashi University of Technology, Riken, Kyoto University, and Ritsumeikan University.

Satoshi Nakamura's AcademicInfluence.com Rankings

Computer Science: World Rank #5898, Historical Rank #6222

Database: World Rank #3028, Historical Rank #3157


Why Is Satoshi Nakamura Influential?


According to Wikipedia, Satoshi Nakamura is a professor at the Graduate School of Information Science, Nara Institute of Science and Technology, Japan. He is also an honorary professor at Karlsruhe Institute of Technology, Germany. Nakamura's current research interests include speech-to-speech translation, speech recognition, speech synthesis, spoken dialog systems, multi-modal communication, and brain activity sensing in linguistics.


Satoshi Nakamura's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works (title, publication year, citation count)

Voice conversion through vector quantization (1988) (633)
Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition (2000) (262)
Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T) (2015) (229)
Can social bookmarking enhance search in the web? (2007) (221)
Incorporating Discrete Translation Lexicons into Neural Machine Translation (2016) (195)
Listening while speaking: Speech chain by deep learning (2017) (139)
Lip movement synthesis from speech based on hidden Markov models (1998) (136)
Speech enhancement based on the subspace method (2000) (120)
Localization of multiple sound sources based on a CSP analysis with a microphone array (2000) (113)
The ATR Multilingual Speech-to-Speech Translation System (2006) (108)
Guiding Neural Machine Translation with Retrieved Translation Pieces (2018) (103)
Compressing recurrent neural network with tensor train (2017) (94)
Embodied conversational agents for multimodal automated social skills training in people with autism spectrum disorders (2017) (88)
Automatic pronunciation scoring of words and sentences independent from the non-native's first language (2009) (82)
Make Skeleton-based Action Recognition Model Smaller, Faster and Better (2019) (80)
Trustworthiness Analysis of Web Search Results (2007) (78)
Statistical singing voice conversion with direct waveform modification based on the spectrum differential (2014) (76)
A postfilter to modify the modulation spectrum in HMM-based speech synthesis (2014) (76)
AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition (2005) (76)
Optimizing Segmentation Strategies for Simultaneous Speech Translation (2014) (70)
Eliciting Positive Emotion through Affect-Sensitive Dialogue Response Generation: A Neural Network Approach (2018) (69)
Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015 (2015) (66)
Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis (2016) (62)
Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics (2007) (59)
Cepstrum derived from differentiated power spectrum for robust speech recognition (2003) (58)
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019 (2019) (58)
Machine Speech Chain with One-shot Speaker Adaptation (2018) (54)
Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework (2006) (53)
Automatic Generation of Non-uniform HMM Topologies Based on the MDL Criterion (2004) (52)
Detecting Dementia Through Interactive Computer Avatars (2017) (50)
Development of Indonesian Large Vocabulary Continuous Speech Recognition System within A-STAR Project (2008) (49)
Statistical multimodal integration for audio-visual speech processing (2002) (48)
Stream weight optimization of speech and lip image sequence for audio-visual speech recognition (2000) (47)
Feature optimized DPGMM clustering for unsupervised subword modeling: A contribution to zerospeech 2017 (2017) (47)
Multi-Source Neural Machine Translation with Missing Data (2018) (47)
Robust speech recognition with speaker localization by a microphone array (1996) (47)
Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system (2012) (44)
Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents (2004) (43)
Automated Social Skills Trainer (2015) (43)
Efficient representation of short-time phase based on group delay (1998) (42)
ReMOT: A model-agnostic refinement for multiple object tracking (2020) (42)
Robust fundamental frequency estimation using instantaneous frequencies of harmonic components (2000) (41)
Simple, lexicalized choice of translation timing for simultaneous speech translation (2013) (41)
Never-ending learning system for on-line speaker diarization (2007) (41)
Statistical dialog management applied to WFST-based dialog systems (2009) (41)
Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration (2015) (40)
Learning, Generation and Recognition of Motions by Reference-Point-Dependent Probabilistic Models (2011) (40)
Local Monotonic Attention Mechanism for End-to-End Speech And Language Processing (2017) (40)
Sound scene data collection in real acoustical environments (1999) (40)
Tensor Decomposition for Compressing Recurrent Neural Network (2018) (39)
Generation of views of TV content using TV viewers' perspectives expressed in live chats on the web (2005) (39)
On the Importance of Pivot Language Selection for Statistical Machine Translation (2009) (39)
Developing Non-goal Dialog System Based on Examples of Drama Television (2012) (38)
HMM-separation-based speech recognition for a distant moving speaker (2001) (37)
ATR HMM-LR continuous speech recognition system (1990) (36)
CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments (2009) (36)
Adaptive Wavenet Vocoder for Residual Compensation in GAN-Based Voice Conversion (2018) (36)
An HMM-based Vietnamese speech synthesis system (2009) (36)
Modulation spectrum-constrained trajectory training algorithm for GMM-based Voice Conversion (2015) (35)
End-to-end Feedback Loss in Speech Chain Framework via Straight-through Estimator (2018) (35)
Unsupervised Linear Discriminant Analysis for Supporting DPGMM Clustering in the Zero Resource Scenario (2016) (35)
DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks (2019) (35)
A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation (2014) (34)
Structured-Based Curriculum Learning for End-to-End English-Japanese Speech Translation (2017) (34)
A Robust Speech Recognition System for Communication Robots in Noisy Environments (2008) (34)
CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition (2010) (34)
Speaker adaptation applied to HMM and neural networks (1989) (32)
Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition (1999) (32)
The NAIST Text-to-Speech System for the Blizzard Challenge 2015 (2015) (31)
DATA COLLECTION AND EVALUATION OF AURORA-2 JAPANESE CORPUS (2003) (31)
Parameter Generation Methods With Rich Context Models for High-Quality and Flexible Text-To-Speech Synthesis (2014) (31)
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents (2015) (31)
Preserving Word-Level Emphasis in Speech-to-Speech Translation (2017) (31)
Multichannel Bin-Wise Robust Frequency-Domain Adaptive Filtering and Its Application to Adaptive Beamforming (2007) (30)
Multilingual Speech-to-Speech Translation System: VoiceTra (2013) (30)
Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential (2018) (30)
Statistical singing voice conversion based on direct waveform modification with global variance (2015) (30)
Collection of a Simultaneous Translation Corpus for Comparative Analysis (2014) (29)
Utilizing Human-to-Human Conversation Examples for a Multi Domain Chat-Oriented Dialog System (2014) (29)
ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles (2006) (29)
Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance (2007) (29)
CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments (2006) (28)
CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments (2008) (28)
Reinforcement Learning of Cooperative Persuasive Dialogue Policies using Framing (2014) (28)
Speech-to-Speech Translation Between Untranscribed Unknown Languages (2019) (28)
Noise adaptive speech recognition based on sequential noise parameter estimation (2004) (28)
Fusion of Audio-Visual Information for Integrated Speech Processing (2001) (28)
Indonesian speech recognition for hearing and speaking impaired people (2004) (28)
A comparative study of spectral mapping for speaker adaptation (1990) (27)
Joint optimization of LCMV beamforming and acoustic echo cancellation for automatic speech recognition (2005) (27)
Positive Emotion Elicitation in Chat-Based Dialogue Systems (2019) (27)
Can social annotation support users in evaluating the trustworthiness of video clips? (2008) (27)
Transformer-Based Direct Speech-To-Speech Translation with Transcoder (2021) (26)
Towards Improving Web Search by Utilizing Social Bookmarks (2007) (26)
Speech to lip movement synthesis by HMM (1997) (26)
Dialog management using weighted finite-state transducers (2008) (26)
Acquiring a Dictionary of Emotion-Provoking Events (2014) (25)
Sub-band based additive noise removal for robust speech recognition (2001) (25)
Sequential Non-Stationary Noise Tracking Using Particle Filtering with Switching Dynamical System (2006) (25)
Efficient speech transcription through respeaking (2013) (25)
ReMOTS: Self-Supervised Refining Multi-Object Tracking and Segmentation (2020) (24)
Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge (2020) (24)
Robust Speech Recognition System for Communication Robots in Real Environments (2006) (24)
Machine Speech Chain (2020) (24)
Joint optimization of LCMV beamforming and acoustic echo cancellation (2004) (24)
The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016 (2016) (24)
EEG signal enhancement using multi-channel wiener filter with a spatial correlation prior (2015) (24)
RWCP Sound Scene Database in Real Acoustic Environment (2002) (24)
Missing Feature Theory Applied to Robust Speech Recognition over IP Network (2003) (24)
CENSREC2: corpus and evaluation environments for in car continuous digit speech recognition (2005) (23)
Improved novelty detection for online GMM based speaker diarization (2008) (23)
NICT/ATR Chinese-Japanese-English Speech-to-Speech Translation System (2008) (23)
Optimal acoustic and language model weights for minimizing word verification errors (2004) (23)
Teaching Social Communication Skills Through Human-Agent Interaction (2016) (23)
Distant-talking speech recognition based on a 3-D Viterbi search using a microphone array (2002) (23)
Robust speech recognition in car environments (1998) (22)
Development of HMM-based Indonesian Speech Synthesis (2008) (22)
An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation (2017) (21)
Automatic generation of non-uniform context-dependent HMM topologies based on the MDL criterion (2003) (21)
Particle filter based non-stationary noise tracking for robust speech recognition (2005) (21)
A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion (2013) (21)
Classification of alkaloids according to the starting substances of their biosynthetic pathways using graph convolutional neural networks (2019) (21)
Another Diversity-Promoting Objective Function for Neural Dialogue Generation (2018) (21)
Journey to the past: proposal of a framework for past web browser (2006) (20)
Temporal filtering system to reduce the risk of spoiling a user's enjoyment (2007) (20)
Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model (2002) (20)
Linguistic and Acoustic Features for Automatic Identification of Autism Spectrum Disorders in Children’s Narrative (2014) (20)
Improving Neural Machine Translation through Phrase-based Forced Decoding (2017) (20)
Rerank-by-Example: Efficient Browsing of Web Search Results (2007) (20)
Multilingual Mobile-Phone Translation Services for World Travelers (2008) (19)
Cultural Communication Idiosyncrasies in Human-Computer Interaction (2016) (19)
Microphone array design measures for hands-free speech recognition (1997) (19)
Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering (2016) (19)
Attention-based Wav2Text with feature transfer learning (2017) (19)
Noise and room acoustics distorted speech recognition by HMM composition (1996) (19)
An evaluation of sound source identification with RWCP sound scene database in real acoustic environments (2002) (19)
A method for translation of paralinguistic information (2012) (18)
Sequence-to-Sequence ASR Optimization Via Reinforcement Learning (2017) (18)
Voice Timbre Control Based on Perceived Age in Singing Voice Conversion (2014) (18)
Application of a double-talk resilient DFT domain adaptive filter for bin-wise stepsize controls to adaptive beamforming (2005) (18)
Hands-free speech recognition based on 3-D Viterbi search using a microphone array (1998) (18)
Modulation spectrum-based post-filter for GMM-based Voice Conversion (2014) (18)
Speech Chain for Semi-Supervised Learning of Japanese-English Code-Switching ASR and TTS (2018) (18)
A Simplified Õ(nm) Time Edge-Splitting Algorithm in Undirected Graphs (2000) (18)
A statistical lexicon for non-native speech recognition (2004) (18)
A Non-stationary Noise Suppression Method Based on Particle Filtering and Polyak Averaging (2006) (17)
Gated Recurrent Neural Tensor Network (2016) (17)
End-to-End Speech Translation With Transcoding by Multi-Task Learning for Distant Language Pairs (2020) (17)
Annotating Dialogue Acts to Construct Dialogue Systems for Consulting (2009) (17)
ATRECSS — ATR ENGLISH SPEECH CORPUS FOR SPEECH SYNTHESIS (2007) (17)
Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: newest Part of the CENSREC Series - (2008) (17)
Neural Machine Translation via Binary Code Prediction (2017) (17)
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR (2015) (17)
Situated Spoken Dialogue with Robots Using Active Learning (2011) (17)
Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching (2005) (17)
F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential (2016) (17)
Recent advances in WFST-based dialog system (2009) (16)
Hybrid HMM/BN ASR system integrating spectrum and articulatory features (2003) (16)
Multi-Source Neural Machine Translation with Data Augmentation (2018) (16)
Generalizing continuous-space translation of paralinguistic information (2013) (16)
Search intent estimation from user's eye movements for supporting information seeking (2012) (16)
Ckylark: A More Robust PCFG-LA Parser (2015) (16)
Musical-noise-free blind speech extraction integrating microphone array and iterative spectral subtraction (2014) (16)
Multi-modal temporal asynchronicity modeling by product HMMs for robust audio-visual speech recognition (2002) (15)
Rapid Development of Initial Indonesian Phoneme-based Speech Recognition Using The Cross-Language Approach (2005) (15)
Neural iTTS: Toward Synthesizing Speech in Real-time with End-to-end Neural Text-to-Speech Framework (2019) (15)
The Asian network-based speech-to-speech translation system (2009) (15)
Sub-band temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments (2011) (15)
Face-to-Talk: Audio-Visual Speech Detection for Robust Speech Recognition in Noisy Environment (2003) (15)
Speed or accuracy? a study in evaluation of simultaneous speech translation (2015) (15)
Analyzing the Effect of Entrainment on Dialogue Acts (2016) (15)
Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy (2010) (15)
Never-ending learning with dynamic hidden Markov network (2007) (15)
Active learning of confidence measure function in robot language acquisition framework (2010) (15)
Selecting Syntactic, Non-redundant Segments in Active Learning for Machine Translation (2016) (15)
Pseudogen: A Tool to Automatically Generate Pseudo-Code from Source Code (2015) (15)
Environmental sound source identification based on hidden Markov model for robust speech recognition (2003) (15)
Neural Network Approaches to Dialog Response Retrieval and Generation (2016) (15)
Constructing a speech translation system using simultaneous interpretation data (2013) (14)
Parameter generation algorithm considering Modulation Spectrum for HMM-based speech synthesis (2015) (14)
Synchronization between overt speech envelope and EEG oscillations during imagined speech (2020) (14)
Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis (2014) (14)
Out-of-domain detection based on confidence measures from multiple topic classification (2004) (14)
Simultaneous recognition of multiple sound sources based on 3-d n-best search using microphone array (1999) (14)
Unsupervised Phoneme Segmentation of Previously Unseen Languages (2016) (14)
Maximum likelihood sub-band adaptation for robust speech recognition (2005) (14)
Learning a Lexicon and Translation Model from Phoneme Lattices (2016) (14)
Detection of Dementia from Responses to Atypical Questions Asked by Embodied Conversational Agents (2018) (14)
Topic classification and verification modeling for out-of-domain utterance detection (2004) (14)
Weighted finite state transducer based statistical dialog management (2009) (14)
Interactive Image Manipulation with Natural Language Instruction Commands (2018) (13)
Talker localization in a real acoustic environment based on DOA estimation and statistical sound source identification (2002) (13)
Annotating communicative function and semantic content in dialogue act for construction of consulting dialogue systems (2009) (13)
Model adaptation based on HMM decomposition for reverberant speech recognition (1997) (13)
Generalized posterior probability for minimizing verification errors at subword, word and sentence levels (2004) (13)
Detection and separation of speech segment using audio and video information fusion (2003) (13)
Temporal contrast normalization and edge-preserved smoothing of temporal modulation structures of speech for robust speech recognition (2010) (13)
A browser for browsing the past web (2006) (13)
A-STAR: Toward translating Asian spoken languages (2013) (13)
Collaborative ambient systems by blow displays (2007) (13)
Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model (2020) (13)
Iterative training of a DPGMM-HMM acoustic unit recognizer in a zero resource scenario (2016) (13)
Improved bimodal speech recognition using tied-mixture HMMs and 5000 word audio-visual synchronous database (1997) (13)
Linguistic Individuality Transformation for Spoken Language (2015) (13)
Multi-Scale Alignment and Contextual History for Attention Mechanism in Sequence-to-Sequence Model (2018) (12)
A neural speaker model for speaker clustering (1991) (12)
Toward Construction of Spoken Dialogue System that Evokes Users’ Spontaneous Backchannels (2011) (12)
An Incremental Turn-Taking Model For Task-Oriented Dialog Systems (2019) (12)
Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding (2002) (12)
Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition (2008) (12)
Building a free, general-domain paraphrase database for Japanese (2014) (12)
Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition (2019) (12)
Speech-to-Lip Movement Synthesis by Maximizing Audio-Visual Joint Probability Based on the EM Algorithm (1998) (12)
Sequential noise compensation by a sequential kullback proximal algorithm (2001) (12)
HMM-based noise-robust feature compensation (2006) (12)
Spoken Dialogue Systems Technology and Design - International Workshop on Spoken Dialogue Systems Technology, IWSDS 2009, Kloster Irsee, Germany, December 9-11, 2009 (2014) (12)
Bayesian learning of confidence measure function for generation of utterances and motions in object manipulation dialogue task (2009) (12)
A bootstrapping approach for SLU portability to a new language by inducting unannotated user queries (2012) (12)
Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS (2020) (11)
Transcribing against time (2017) (11)
Improving Search and Information Credibility Analysis from Interaction between Web1.0 and Web2.0 Content (2010) (11)
NICT-ATR Speech-to-Speech Translation System (2007) (11)
Post-Filters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis (2016) (11)
Non-verbal cognitive skills and autistic conditions: An analysis and training tool (2012) (11)
Modeling spoken decision support dialogue and optimization of its dialogue strategy (2011) (11)
Automatic generation of non-uniform HMM structures based on variational Bayesian approach (2004) (11)
Analysis on Effects of Text-to-Speech and Avatar Agent in Evoking Users' Spontaneous Listener's Reactions (2011) (11)
Discriminative training of HMM using maximum normalized likelihood algorithm (2001) (11)
A non-iterative model-adaptive e-CMN/PMC approach for speech recognition in car environments (1997) (11)
Development of client-server speech translation system on a multi-lingual speech communication platform (2006) (11)
Electroencephalogram-Based Single-Trial Detection of Language Expectation Violations in Listening to Speech (2019) (11)
Emotion recognition on Indonesian television talk shows (2014) (11)
A Framework for Knowing Who is Doing What in Aerial Surveillance Videos (2019) (10)
Incorporating Knowledge Sources into Statistical Speech Recognition (2009) (10)
Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags (2018) (10)
Compressing End-to-end ASR Networks by Tensor-Train Decomposition (2018) (10)
Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing (2018) (10)
Sound Scene Database in Real Acoustical Environments, Proc. First International Workshop on East-Asian Language Resource and Evaluation (1998) (10)
Supporting Sharing of Browsing Information and Search Results in Mobile Collaborative Searches (2011) (10)
Analysis of conversational listening skills toward agent-based social skills training (2019) (10)
Study of environmental sound source identification based on hidden Markov model for robust speech recognition (2003) (10)
HMM composition-based rapid model adaptation using a priori noise GMM adaptation evaluation on Aurora2 corpus (2002) (10)
Active Learning for Example-Based Dialog Systems (2016) (10)
CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework (2005) (10)
NOCOA+: Multimodal Computer-Based Training for Social and Communication Skills (2015) (10)
Regression approaches to perceptual age control in singing voice conversion (2014) (10)
Dialogue State Tracking using Long Short Term Memory Neural Networks (2015) (10)
Emotional Speech Corpus for Persuasive Dialogue System (2020) (10)
Automatic detection of very early stage of dementia through multimodal interaction with computer avatars (2016) (10)
Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system (2001) (10)
Dialogue management for leading the conversation in persuasive dialogue systems (2013) (10)
HMM-based feature compensation method: an evaluation using the AURORA2 (2004) (10)
Optimizing DPGMM Clustering in Zero Resource Setting Based on Functional Load (2018) (10)
Modified post-filter to recover modulation spectrum for HMM-based speech synthesis (2014) (10)
Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis (2020) (10)
Temporal modulation normalization for robust speech feature extraction and recognition (2009) (10)
Integration of noise reduction algorithms for Aurora2 task (2003) (10)
Tone nucleus-based multi-level robust acoustic tonal modeling of sentential F0 variations for Chinese continuous speech tone recognition (2005) (10)
Assessing the Quality of Wikipedia Editors through Crowdsourcing (2016) (10)
An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement (2014) (9)
Improving spontaneous English ASR using a joint-sequence pronunciation model (2010) (9)
Optimizing Computer-Assisted Transcription Quality with Iterative User Interfaces (2016) (9)
Toward Expressive Speech Translation: A Unified Sequence-to-Sequence LSTMs Approach for Translating Words and Emphasis (2017) (9)
Using Spoken Word Posterior Features in Neural Machine Translation (2018) (9)
Grapheme-to-phoneme conversion based on adaptive regularization of weight vectors (2013) (9)
Combination of Example-based and SMT-based Approaches in a Chat-oriented Dialog System (2013) (9)
Room acoustics and reverberation: impact on hands-free recognition (1997) (9)
Unsupervised determination of efficient Korean LVCSR units using a Bayesian Dirichlet process model (2011) (9)
Training Neural Machine Translation using Word Embedding-based Loss (2018) (9)
Speaking rate compensation based on likelihood criterion in acoustic model training and decoding (2002) (9)
Reinforcement Learning in Multi-Party Trading Dialog (2015) (9)
AmbientBrowser: Web Browser for Everyday Enrichment (2005) (9)
Instance-Level Heterogeneous Domain Adaptation for Limited-Labeled Sketch-to-Photo Retrieval (2020) (9)
Construction and Analysis of a Persuasive Dialogue Corpus (2014) (9)
Sequence-to-Sequence Models for Emphasis Speech Translation (2018) (9)
Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems (2008) (9)
NICT-NAIST System for WMT17 Multimodal Translation Task (2017) (9)
Collection and analysis of a Japanese-English emphasized speech corpora (2014) (9)
Stochastic Gradient Variational Bayes for deep learning-based ASR (2015) (9)
Statistical sound source identification in a real acoustic environment for robust speech recognition using a microphone array (2001) (8)
Listening Skills Assessment through Computer Agents (2018) (8)
Recent progress in developing grapheme-based speech recognition for Indonesian ethnic languages: Javanese, Sundanese, Balinese and Bataks (2014) (8)
Non-audible murmur enhancement based on statistical conversion using air- and body-conductive microphones in noisy environments (2015) (8)
An Adaptive Integration Method Based on Product HMM for Bi-Modal Speech Recognition (2001) (8)
Automatic steering of microphone array and video camera toward multi-lingual tele-conference through speech-to-speech translation (2001) (8)
Learning cooperative persuasive dialogue policies using framing (2016) (8)
The 2012 KIT and KIT-NAIST English ASR systems for the IWSLT evaluation (2012) (8)
Video Search by Impression Extracted from Social Annotation (2009) (8)
Implementation of F0 transformation for statistical singing voice conversion based on direct waveform modification (2016) (8)
Spoken Dialogue Systems for Ambient Environments (2010) (8)
Model-based talking face synthesis for anthropomorphic spoken dialog agent system (2003) (8)
Graph Regularized Tensor Factorization for Single-Trial EEG Analysis (2018) (8)
Learning Supervised Feature Transformations on Zero Resources for Improved Acoustic Unit Discovery (2018) (8)
Integration of articulatory dynamic parameters in HMM/BN based speech recognition system (2004) (8)
Hybrid HMM/BN LVCSR system integrating multiple acoustic features (2003) (8)
An investigation of acoustic features for singing voice conversion based on perceptual age (2013) (8)
A Bayesian Model of Transliteration and Its Human Evaluation When Integrated into a Machine Translation System (2011) (8)
Graph matching based anime colorization with multiple references (2019) (8)
Plus One or Minus One: A Method to Browse from an Object to Another Object by Adding or Deleting an Element (2010) (8)
Subjective Evaluation for HMM-Based Speech-To-Lip Movement Synthesis (1998) (8)
Residual noise compensation by a sequential EM algorithm (2000) (8)
Use of Poisson Processes to Generate Fundamental Frequency Contours (2007) (8)
Noise adaptive speech recognition in time-varying noise based on sequential kullback proximal algorithm (2002) (8)
Modeling successive frame dependencies with hybrid HMM/BN acoustic model (2005) (8)
Speech recognition for a distant moving speaker based on HMM composition and separation (2000) (8)
Semantic Parsing of Ambiguous Input through Paraphrasing and Verification (2015) (8)
Construction and analysis of Indonesian Emotional Speech Corpus (2014) (8)
Transferring Emphasis in Speech Translation Using Hard-Attentional Neural Network Models (2016) (7)
A latent variable model for joint pause prediction and dependency parsing (2015) (7)
A study on soft margin estimation of linear regression parameters for speaker adaptation (2009) (7)
Recognizing Emotionally Coloured Dialogue Speech Using Speaker-Adapted DNN-CNN Bottleneck Features (2017) (7)
Incremental TTS for Japanese Language (2018) (7)
A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task (2017) (7)
Conversation dialog corpora from television and movie scripts (2014) (7)
Recognition and translation of code-switching speech utterances (2019) (7)
Energy browser: to make exercise enjoyable and interesting (2005) (7)
Evaluating credibility of web information (2010) (7)
Quality prediction of synthesized speech based on tensor structured EEG signals (2018) (7)
Adaptive Regularization Framework for Robust Voice Activity Detection (2011) (7)
The use of semantic and acoustic features for open-domain TED talk summarization (2014) (7)
An empirical comparison of joint optimization techniques for speech translation (2013) (7)
Spoken Dialog System on Plasma Display Panel Estimating Users' Interest by Image Processing (2010) (7)
Robust bi-modal speech recognition based on state synchronous modeling and stream weight optimization (2002) (7)
Improving translation of emphasis with pause prediction in speech-to-speech translation systems (2015) (7)
Music signal separation based on Bayesian spectral amplitude estimator with automatic target prior adaptation (2014) (7)
Improving the robustness of example-based dialog retrieval using recursive neural network paraphrase identification (2014) (7)
Incremental sentence compression using LSTM recurrent networks (2015) (7)
Data-driven efficient production of cartoon character animation (2007) (7)
Towards High-Reliability Speech Translation in the Medical Domain (2013) (7)
Cluster-based language model for spoken document retrieval using NMF-based document clustering (2010) (7)
Detection of Speech Events in Real Environments through Fusion of Audio and Video Information Using Bayesian Networks (2003) (6)
Real time face detection for multimodal speech recognition (2002) (6)
Blind noise suppression for Non-Audible Murmur recognition with stereo signal processing (2011) (6)
Local Monotonic Attention Mechanism for End-to-End Speech Recognition (2017) (6)
SlothLib: A Programming Library for Researches on the Web (2007) (6)
Semantically readable distributed representation learning for social media mining (2017) (6)
The KIT-NAIST (contrastive) English ASR system for IWSLT 2012 (2012) (6)
Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition (2016) (6)
An effect of adaptive beamforming on hands-free speech recognition based on 3-D Viterbi search (1998) (6)
Is This Translation Error Critical?: Classification-Based Human and Automatic Machine Translation Evaluation Focusing on Critical Errors (2021) (6)
Class-Dependent Modeling for Dialog Translation (2009) (6)
Learning local word reorderings for hierarchical phrase-based statistical machine translation (2016) (6)
End-to-End Speech Recognition Sequence Training With Reinforcement Learning (2019) (6)
Discriminative Language Models as a Tool for Machine Translation Error Analysis (2014) (6)
An adaptive integration based on product HMM for audio-visual speech recognition (2001) (6)
A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversion (2013) (6)
WeBrowSearch: Toward Web Browser with Autonomous Search (2007) (6)
Initial response time measurement in eye movement for dementia screening test (2017) (6)
Segmentation for Efficient Supervised Language Annotation with an Explicit Cost-Utility Tradeoff (2014) (6)
An end-to-end model for cross-lingual transformation of paralinguistic information (2018) (6)
Analysis of Mood Changes and Facial Expressions during Cognitive Behavior Therapy through a Virtual Agent (2020) (6)
A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training (2016) (6)
A study of social-affective communication: Automatic prediction of emotion triggers and responses in television talk shows (2015) (6)
A Hybrid HMM/BN Acoustic Model Utilizing Pentaphone-Context Dependency (2006) (6)
Direct F0 control of an electrolarynx based on statistical excitation feature prediction and its evaluation through simulation (2014) (6)
Japanese-English Code-Switching Speech Data Construction (2018) (6)
Analyzing Self-Efficacy and Summary Feedback in Automated Social Skills Training (2021) (6)
Supplementation of HMM for articulatory variation in speaker adaptation (1990) (6)
Eliciting Positive Emotional Impact in Dialogue Response Selection (2018) (6)
Tracking liking state in brain activity while watching multiple movies (2017) (6)
Model based noisy speech recognition with environment parameters estimated by noise adaptive speech recognition with prior (2003) (6)
Design of Software Toolkit for Anthropomorphic Spoken Dialog Agent Software with Customization-oriented Features (2002) (6)
Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models (2004) (6)
Speech-to-face movement synthesis based on HMMs (2000) (6)
The NAIST English speech recognition system for IWSLT 2015 (2015) (6)
Adaptation of Model Parameters by HMM Decomposition in Noisy Reverberant Environments (1997) (6)
Galatea: An Anthropomorphic Spoken Dialogue Agent Toolkit (2003) (6)
Construction and analysis of social-affective interaction corpus in English and Indonesian (2015) (6)
Theoretical analysis of biased MMSE short-time spectral amplitude estimator and its extension to musical-noise-free speech enhancement (2014) (6)
Zero-Shot Code-Switching ASR and TTS with Multilingual Machine Speech Chain (2019) (5)
Structured soft margin confidence weighted learning for grapheme-to-phoneme conversion (2014) (5)
Wave Field Cancellation Using Wave-Domain Adaptive Filtering (2004) (5)
Objective Prediction of Social Skills Level for Automated Social Skills Training Using Audio and Text Information (2020) (5)
Improving Pivot Translation by Remembering the Pivot (2015) (5)
Non-native speech synthesis preserving speaker individuality based on partial correction of prosodic and phonetic characteristics (2015) (5)
Towards Improving Web Search: A Large-Scale Exploratory Study of Selected Aspects of User Search Behavior (2009) (5)
Multimodal interaction data between clinical psychologists and students for attentive listening modeling (2016) (5)
NAIST at the CLEF 2013 QA4MRE Pilot Task (2013) (5)
Detecting suppression of negative emotion by time series change of cerebral blood flow using fNIRS (2018) (5)
Deep bottleneck features and sound-dependent i-vectors for simultaneous recognition of speech and environmental sounds (2016) (5)
A Study on Cross Transformation of Mongolian Language (2008) (5)
Instant Movie Casting with Personality: Dive into the Movie System (2011) (5)
Investigation of ASR Systems for Resource-deficient Languages (2010) (5)
Large-Scale English-Japanese Simultaneous Interpretation Corpus: Construction and Analyses with Sentence-Aligned Data (2021) (5)
Speech-to-lip movement synthesis based on the EM algorithm using audio-visual HMMs (1998) (5)
Dialogue Acts Annotation for NICT Kyoto Tour Dialogue Corpus to Construct Statistical Dialogue Systems (2010) (5)
Speech Quality Evaluation of Synthesized Japanese Speech Using EEG (2019) (5)
Recent Progress in Developing Indonesian Large-Vocabulary Corpora and LVCSR System (2008) (5)
Dialogue strategy optimization to assist user's decision for spoken consulting dialogue systems (2010) (5)
Multi-lingual speech recognition system for speech-to-speech translation (2004) (5)
End-to-End Image-to-Speech Generation for Untranscribed Unknown Languages (2021) (5)
Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues (2006) (5)
Articulatory controllable speech modification based on statistical feature mapping with Gaussian mixture models (2014) (5)
Two Types of Disagreement in Group Discussions of Japanese Undergraduates (2009) (5)
Minimum mean square error filtering of noisy cepstral coefficients with applications to ASR (2004) (5)
A Visual Analytics Tool for System Logs Adopting Variable Recommendation and Feature-Based Filtering (2013) (5)
Robust verification of recognized words in noise (2004) (5)
Automatic Face Tracking And Model Match-Move In Video Sequence Using 3D Face Model (2001) (5)
An Analysis Towards Dialogue-Based Deception Detection (2015) (5)
Noise adaptive speech recognition with acoustic models trained from noisy speech evaluated on Aurora-2 database (2002) (5)
Weighted graph based decision tree optimization for high accuracy acoustic modeling (2002) (5)
An Improved Greedy Search Algorithm for the Development of a Phonetically Rich Speech Corpus (2008) (5)
Feature extraction and model-based noise compensation for noisy speech recognition evaluated on AURORA 2 task (2001) (5)
AmbientBrowser: Web Browser in Everyday Life (2006) (5)
A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis (2009) (5)
An Investigation of Machine Translation Evaluation Metrics in Cross-lingual Question Answering (2015) (5)
Personalized unknown word detection in non-native language reading using eye gaze (2016) (5)
Construction and Experiment of a Spoken Consulting Dialogue System (2010) (5)
Introduction to the Special Issue on Spontaneous Speech Processing (2004) (5)
Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model (2008) (5)
Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings (2020) (5)
Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments (2020) (5)
Speech-to-lip movement synthesis maximizing audio-visual joint probability based on EM algorithm (1998) (5)
Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential (2015) (4)
Progress Report of SLP Noisy Speech Recognition Evaluation WG: Individual evaluation framework for each factor affecting recognition performance (2) (2006) (4)
Discriminating Chinese lexical tones by anchoring F0 features (2000) (4)
Conversational Response Re-ranking Based on Event Causality and Role Factored Tensor Event Embedding (2019) (4)
Modeling HMM state distributions with Bayesian networks (2002) (4)
Additional Operations of Simple HITs on Microtask Crowdsourcing for Worker Quality Prediction (2019) (4)
Tree as a Pivot: Syntactic Matching Methods in Pivot Translation (2017) (4)
Suitable design of adaptive beamformer based on average speech spectrum for noisy speech recognition (2002) (4)
Pre- and post-processes for automatic colorization using a fully convolutional network (2018) (4)
Statistical Speech Recognition (2009) (4)
Simultaneous Neural Machine Translation using Connectionist Temporal Classification (2019) (4)
The NAIST Simultaneous Translation Corpus (2018) (4)
Ensembles of Multi-Scale VGG Acoustic Models (2017) (4)
Neural Speech Completion (2020) (4)
Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling (2018) (4)
Temporal contrast normalization and edge-preserved smoothing on temporal modulation structure for robust speech recognition (2009) (4)
An Evaluation of Speech Enhancement Approach E-CMN/CSS for Speech Recognition in Car Environments (1998) (4)
Information Navigation System with Discovering User Interests (2017) (4)
Design of robust subtractive beamformer for noisy speech recognition (2000) (4)
Suppression of noise and late reverberation based on blind signal extraction and Wiener filtering (2015) (4)
Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics (2016) (4)
Speech Parameter Generation Algorithm Considering Modulation Spectrum for Statistical Parametric Speech Synthesis (2015) (4)
Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications (2004) (4)
3D N-best search for simultaneous recognition of distant-talking speech of multiple talkers (2002) (4)
An Enhanced Electrolarynx with Automatic Fundamental Frequency Control based on Statistical Prediction (2015) (4)
Towards the creation of acoustic models for stressed Japanese speech (2001) (4)
Robust word spotting in adverse car environments (1993) (4)
Hands-free Speech Recognition by a microphone array and HMM composition (1996) (4)
Online cepstral filtering using a sequential EM approach with Polyak averaging and feedback [speech recognition applications] (2005) (4)
Compression algorithm of trigram language models based on maximum likelihood estimation (1998) (4)
Music Generation and Emotion Estimation from EEG Signals for Inducing Affective States (2020) (4)
Extracting adjective facets from community Q&A corpus (2011) (4)
Improving Acoustic Model Precision by Incorporating a Wide Phonetic Context Based on a Bayesian Framework (2006) (4)
Distributed speech translation technologies for multiparty multilingual communication (2012) (4)
Predicting query reformulation type from user behavior (2013) (4)
Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments (2009) (4)
Multiple beamforming with source localization based on CSP analysis (2003) (4)
Automatic indexing of broadcast content using its live chat on the Web (2005) (4)
NICT/ATR Asian Spoken Language Translation System for Multi-Party Travel Conversation (2009) (4)
Automatic Derivation of a Phoneme Set with Tone Information for Chinese Speech Recognition Based on Mutual Information Criterion (2006) (4)
Processing negative emotions through social communication: Multimodal database construction and analysis (2017) (4)
Emotional Triggers and Responses in Spontaneous Affective Interaction: Recognition, Prediction, and Analysis (2018) (4)
Recognition of Distant-Talking Speech based on 3-D Trellis Search using a Microphone Array and Adaptive Beamforming (1999) (4)
The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 (2008) (4)
An HMM acoustic model incorporating various additional knowledge sources (2007) (4)
An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis (2012) (4)
Frequency Modulation Technique for Prosodic Modification (2008) (4)
VRMixer: mixing video and real world with video segmentation (2014) (4)
Combining Audio and Brain Activity for Predicting Speech Quality (2020) (4)
Towards language preservation: Design and collection of graphemically balanced and parallel speech corpora of Indonesian ethnic languages (2013) (4)
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System (2014) (4)
Rapid environment adaptation method based on HMM composition with prior noise GMM and multi‐SNR models for noisy speech recognition (2004) (4)
Using Panoramic Videos for Multi-Person Localization and Tracking In A 3D Panoramic Coordinate (2019) (4)
Dialogue Model and Response Generation for Emotion Improvement Elicitation (2019) (4)
An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition (2010) (4)
CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments (2011) (4)
Construction of Chinese conversational corpora for spontaneous speech recognition and comparative study on the trilingual parallel corpora (2009) (4)
Spoken Dialogue Robot for Watching Daily Life of Elderly People (2019) (4)
Probabilistic Pronunciation Variation Model Based on Bayesian Network for Conversational Speech Recognition (2008) (4)
Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm (2004) (4)
Incorporating Knowledge Sources Into a Statistical Acoustic Model for Spoken Language Communication Systems (2007) (4)
Iterative Estimation and Compensation of Signal Direction for Moving Sound Source by Mobile Microphone Array (2004) (4)
Incorporation of Pentaphone-Context Dependency Based on Hybrid HMM/BN Acoustic Modeling Framework (2006) (3)
Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition (2010) (3)
Divergence optimization in nonnegative matrix factorization with spectrogram restoration for multichannel signal separation (2014) (3)
Face expression synthesis based on a facial motion distribution chart (2004) (3)
MAP estimation of online mapping parameters in ensemble speaker and speaking environment modeling (2009) (3)
Fast text anonymization using k-anonymity (2016) (3)
Statistical Imitation Learning in Sequential Object Manipulation Tasks (2010) (3)
Statistical modeling of binaural signal and its application to binaural source separation (2015) (3)
Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition (2016) (3)
Dialog Management of Healthcare Consulting System by Utilizing Deceptive Information (2020) (3)
Construction of Chinese Segmented and POS-tagged Conversational Corpora and Their Evaluations on Spontaneous Speech Recognitions (2009) (3)
The NAIST machine translation system for IWSLT2012 (2012) (3)
Unsupervised Counselor Dialogue Clustering for Positive Emotion Elicitation in Neural Dialogue System (2018) (3)
Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain (2019) (3)
Memorable spoken quote corpora of TED public speaking (2014) (3)
Utilizing a noisy-channel approach for Korean LVCSR (2010) (3)
Associative knowledge feature vector inferred on external knowledge base for dialog state tracking (2019) (3)
Automated social skills training with audiovisual information (2016) (3)
Deja-vu: Double Feature Presentation in Deep Transformer Networks (2019) (3)
The NAIST ASR system for the 2015 Multi-Genre Broadcast challenge: On combination of deep learning systems using a rank-score function (2015) (3)
Quality and Intelligibility Assessment of Indonesian HMM-Based Speech Synthesis System (2010) (3)
Conditional Random Fields for Modeling Korean Pronunciation Variation (2011) (3)
Dialogue Acts Annotation to Construct Dialogue Systems for Consulting (2011) (3)
The Present Status of Speech Database in Japan: Development, Management, and Application to Speech Research (2002) (3)
An inter-speaker evaluation through simulation of electrolarynx control based on statistical F0 prediction (2014) (3)
Soft margin estimation on improving environment structures for ensemble speaker and speaking environment modeling (2009) (3)
Noise and Channel Distortion Robust ASR System for DARPA SPINE2 Task (2003) (3)
Improvement of Speech Recognition Method Using Speech Production Mechanism (2003) (3)
Narrow Adaptive Regularization of weights for grapheme-to-phoneme conversion (2014) (3)
Noise suppression method for body-conducted soft speech enhancement based on external noise monitoring (2016) (3)
Construction of Audio-Visual Speech Corpus Using Motion-Capture System and Corpus Based Facial Animation (2005) (3)
AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition (2005) (3)
Detecting Dementia from Face in Human-Agent Interaction (2019) (3)
Neural Conversation Model Controllable by Given Dialogue Act Based on Adversarial Learning and Label-aware Objective (2019) (3)
Gender-dependent spectrum differential models for perceived age control based on direct waveform modification in singing voice conversion (2014) (3)
An Introduction to the Chinese Speech Recognition Front-End of the NICT/ATR Multi-Lingual Speech Translation System (2008) (3)
Efficient lip-synch tool for 3D cartoon animation (2008) (3)
Evaluation and Interpretation of 9 Body Constitution Scores of CCMQ-J by Seven Independent Questionnaires (2019) (3)
A block cosine transform and its application in speech recognition (2000) (3)
On-the-fly user modeling for cost-sensitive correction of speech transcripts (2014) (3)
Forward-backwards training of hybrid HMM/BN acoustic models (2006) (3)
Query Relaxation Based on Users' Unconfidences on Query Terms and Web Knowledge Extraction (2008) (3)
A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition (2001) (3)
Automatic voice assignment tool for Instant Casting movie System (2009) (3)
Evaluation of a Fully Automatic Cooperative Persuasive Dialogue System (2015) (3)
Toward translating Indonesian spoken utterances to/from other languages (2009) (3)
Spoken dialog system and its evaluation of geographic information system for elderly persons' mobility support (2005) (3)
Anime Character Colorization using Few-shot Learning (2021) (3)
Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion (2013) (3)
Data collection and evaluation of AURORA-2 Japanese corpus [speech recognition applications] (2003) (3)
Whitening Processing for Blind Separation of Speech Signals (2003) (3)
Adaptation of acoustic model using the gain-adapted HMM decomposition method (2003) (3)
Automatic Generation of Non-uniform and Context-Dependent HMMs Based on the Variational Bayesian Approach (2005) (3)
Query Transformation by Visualizing and Utilizing Information about What Users Are or Are Not Searching (2008) (3)
Evaluation of Facial Direction Estimation from Cameras for Multi-modal Spoken Dialog System (2010) (3)
Structured Adaptive Regularization of Weight Vectors for a Robust Grapheme-to-Phoneme Conversion Model (2014) (3)
NAIST English-to-Japanese Simultaneous Translation System for IWSLT 2021 Simultaneous Text-to-text Task (2021) (3)
Multimodal Dataset of Social Skills Training in Natural Conversational Setting (2021) (3)
Modality and contextual differences in computer based non-verbal communication training (2013) (3)
Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features (2004) (3)
Korean pronunciation variation modeling with probabilistic Bayesian networks (2010) (3)
AURORA-2J/AURORA-3J Corpus and Evaluation Baseline (2003) (3)
A noise‐robust speech input interface for information kiosk terminals (2004) (3)
Developing a Test Bed of English Text-to-Speech System XIMERA for the Blizzard Challenge 2006 (2006) (3)
Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments (2009) (3)
ARTA: Collection and Classification of Ambiguous Requests and Thoughtful Actions (2021) (3)
NAIST’s Machine Translation Systems for IWSLT 2020 Conversational Speech Translation Task (2020) (3)
Towards Robust Speech Recognition in Real Acoustic Environments (2002) (3)
Optimization of Information-Seeking Dialogue Strategy for Argumentation-Based Dialogue System (2018) (3)
Adaptive selection from multiple response candidates in example-based dialogue (2015) (3)
A Binarized Neural Network Joint Model for Machine Translation (2015) (3)
Sequential Attention-based Detection of Semantic Incongruities from EEG While Listening to Speech (2020) (2)
An evaluation of adaptive beamformer based on average speech spectrum for noisy speech recognition (2003) (2)
Multiple Sound Sources Recognition by a Microphone Array-Based 3-D N-Best Search with Likelihood (2001) (2)
Semantically Readable Distributed Representation Learning and Its Expandability Using a Word Semantic Vector Dictionary (2018) (2)
Neural Oscillation-Based Classification of Japanese Spoken Sentences During Speech Perception (2019) (2)
Speaker weighted training of HMM using multiple reference speakers (1990) (2)
ReMOTS: Refining Multi-Object Tracking and Segmentation (2020) (2)
Augmenting Images for ASR and TTS Through Single-Loop and Dual-Loop Multimodal Chain Framework (2020) (2)
Regularization in a reproducing kernel Hilbert space for robust voice activity detection (2010) (2)
The present status, progress, and usage of speech databases in Japan (2005) (2)
Toward Multi-Features Emphasis Speech Translation: Assessment of Human Emphasis Production and Perception with Speech and Text Clues (2018) (2)
Incorporating a Bayesian wide phonetic context model for acoustic rescoring (2005) (2)
Multi-modal translation system and its evaluation (2002) (2)
Active Learning for Generating Motion and Utterances in Object Manipulation Dialogue Tasks (2010) (2)
Hierarchical Tensor Fusion Network for Deception Handling Negotiation Dialog Model (2019) (2)
Brazilian portuguese acoustic model training based on data borrowing from other language (2010) (2)
Calendar for everything: browse and search for personal archive on calendar (2008) (2)
Social Bookmarking and Web Search (2010) (2)
Speech to talking heads system based on hidden Markov models (2005) (2)
A trade-off between estimation accuracy of worker quality and task complexity (2017) (2)
Increasing discriminative capability on MAP-based mapping function estimation for acoustic model adaptation (2011) (2)
Personal TV viewing by using live chat as metadata (2005) (2)
Development of the "VoiceTra" Multi-Lingual Speech Translation System (2017) (2)
RerankEverything: a reranking interface for browsing search results (2010) (2)
Multi-Modal Multi-Task Deep Learning For Speaker And Emotion Recognition Of TV-Series Data (2018) (2)
A Study Toward an Evaluation Method for Spoken Dialogue Systems Considering User Criteria (2010) (2)
Spoken document retrieval using topic models (2009) (2)
A semi-blind source separation method for hands-free speech recognition of multiple talkers (2003) (2)
Real-time vibration control of an electrolarynx based on statistical F0 contour prediction (2016) (2)
Statistical F0 prediction for electrolaryngeal speech enhancement considering generative process of F0 contours within product of experts framework (2016) (2)
TermCloud for Enhancing Web Search (2009) (2)
Speech detection by facial image for multimodal speech recognition (2001) (2)
Evaluation of an HMM-based feature-compensation method using the AURORA2J [speech recognition] (2005) (2)
Rule-based Syntactic Preprocessing for Syntax-based Machine Translation (2014) (2)
Prosody-Controllable HMM-Based Speech Synthesis Using Speech Input (2015) (2)
Inter-Sentence Features and Thresholded Minimum Error Rate Training: NAIST at CLEF 2013 QA4MRE (2013) (2)
Improving neural machine translation through phrase-based soft forced decoding (2020) (2)
Minimum Bayes-risk decoding extended with similar examples: NAIST-NICT at IWSLT 2012 (2012) (2)
Transcribing Paralinguistic Acoustic Cues to Target Language Text in Transformer-Based Speech-to-Text Translation (2021) (2)
Voice Conversion Algorithm Based on Gaussian Mixture Model Applied to STRAIGHT (2000) (2)
Communicative speech synthesis with XIMERA: a first step (2007) (2)
Cellular-phone based speech-to-speech translation system ATR-MATRIX (2000) (2)
SESLA Transcriber: A Speech Transcription Tool That Adapts to Your Skill and Time Budget (2014) (2)
Evaluation of electrolarynx controlled by real-time statistical F0 prediction (2016) (2)
A Vibration Control Method of an Electrolarynx Based on Statistical F0 Pattern Prediction (2017) (2)
EEG Analysis Towards Evaluating Synthesized Speech Quality (2019) (2)
An event-related brain potential study on the impact of speech recognition errors (2014) (2)
Data-driven generation of text balloons based on linguistic and acoustic features of a comics-anime corpus (2014) (2)
Life-Like Characters. Tools, Affective Functions, and Applications (2003) (2)
Acoustic Modeling of Accented English Speech for Large-Vocabulary Speech Recognition (2006) (2)
Investigation of intra-speaker spectral parameter variation and its prediction towards improvement of spectral conversion metric (2013) (2)
The NICT Entry for the Blizzard Challenge 2009: an Enhanced HMM-based Speech Synthesis System with Trajectory Training Considering Global Variance and State-Dependent Mixed Excitation (2009) (2)
Corpus Construction and Semantic Analysis of Indonesian Image Description (2018) (2)
On the Importance of Pivot Language Selection for Asian Language Translation (2009) (2)
Model training using parallel data with mismatched pause positions in statistical esophageal speech enhancement (2012) (2)
Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition (2019) (2)
Phoneme-level speaking rate variation on waveform generation using GAN-TTS (2019) (2)
Network-based speech-to-speech translation (2009) (2)
Simultaneous Neural Machine Translation with Constituent Label Prediction (2021) (2)
Voice activity detection in a regularized reproducing kernel Hilbert space (2010) (2)
SyncRerank: Reranking Multi Search Results Based on Vertical and Horizontal Propagation of User Intention (2008) (2)
Optimized joint noise suppression and dereverberation based on blind signal extraction for hands-free speech recognition system (2014) (2)
Confidence-aware Practical Anime-style Colorization (2020) (2)
Cross-lingual Speech-based ToBI Label Generation Using Bidirectional LSTM (2019) (2)
Developing Client-Server Speech Translation Platform (2006) (2)
A multilevel framework to model the inherently confounding nature of sentential F0 contours for recognizing Chinese lexical tones (2003) (2)
The use of Bayesian network for incorporating accent, gender and wide-context dependency information (2006) (2)
Unsupervised Neural-Based Graph Clustering for Variable-Length Speech Representation Discovery of Zero-Resource Languages (2021) (2)
Subject-Independent Classification of Japanese Spoken Sentences by Multiple Frequency Bands Phase Pattern of EEG Response During Speech Perception (2017) (2)
LVCSR Robust to Noise and Speaking Styles (2004) (2)
Reranking and Classifying Search Results Exhaustively Based on Edit-and-Propagate Operations (2009) (2)
Measuring Affective Sharing between Two People by EEG Hyperscanning (2019) (2)
Tackling multiple object tracking with complicated motions - Re-designing the integration of motion and appearance (2022) (2)
A microphone array-based 3-D N-best search algorithm for the simultaneous recognition of multiple sound sources in real environments (2001) (2)
Information Filtering Method for Twitter Streaming Data Using Human-in-the-Loop Machine Learning (2018) (2)
Detecting Syntactic Violations from Single-trial EEG using Recurrent Neural Networks (2019) (2)
Optimizing Neural Response Generator with Emotional Impact Information (2018) (2)
Variable Selection Linear Regression for Robust Speech Recognition (2014) (1)
Efficient lip‐synch tool for 3D cartoon animation (2008) (1)
Subjective evaluation of a synthetic talking face in an acoustically noisy environment (2006) (1)
Scenario speech assignment technique for instant casting movie system (2009) (1)
Integrating lip-synch into game production workflow: "Sengoku BASARA 3" (2010) (1)
Creating speaker independent HMM models for restricted database using STRAIGHT-TEMPO morphing (1998) (1)
User Study of Spoken Decision Support System (2011) (1)
Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach (2004) (1)
Toward musical-noise-free blind speech extraction: Concept and its applications (2013) (1)
NICT Blizzard Challenge 2010 Entry (2010) (1)
Incremental Machine Speech Chain Towards Enabling Listening While Speaking in Real-Time (2020) (1)
Digital watermarks for audio signal based on psychoacoustic masking model (2003) (1)
An Interactive Image Editing System Using an Uncertainty-Based Confirmation Strategy (2020) (1)
Normalization on Temporal Modulation Transfer Function for Robust Speech Recognition (2008) (1)
Speech recognition based on HMM decomposition and composition method with a microphone array in noisy reverberant environments (2002) (1)
An evaluation of adaptive beamformer based on average speech spectrum for noisy speech recognition (2003) (1)
Consolidation-Based Speech Translation and Evaluation Approach (2009) (1)
Verifying LVCSR Output at Different Levels with Generalized Posterior Probability (2004) (1)
Efficient representation of short-time phase based on time-domain smoothed group delay (2003) (1)
Towards Standardization and Evaluation Framework for Noisy Speech Recognition (2004) (1)
Speech Recognition of a Moving Talker Based on 3-D Viterbi Search Using a Microphone Array (1997) (1)
Study on Word-Level Emphasis Across English and Japanese (2015) (1)
Robust Acoustic Modeling of Contextual Tonal F0 Variations on the Basis of Tone Nucleus Framework (2004) (1)
Entrainable Neural Conversation Model Based on Reinforcement Learning (2020) (1)
Optimal learning of P-Layer additive F0 models with cross-validation (2009) (1)
A design of adaptive beamformer based on average speech spectrum for noisy speech recognition (2002) (1)
Single-Trial Detection of Semantic Anomalies From EEG During Listening to Spoken Sentences (2018) (1)
Reversible display: novel interaction techniques for digital contents (2005) (1)
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement (2017) (1)
Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas (2018) (1)
Expansion of WFST-Based Dialog Management for Handling Multiple ASR Hypotheses (2010) (1)
A Statistical Approach to Expandable Spoken Dialog Systems using WFSTs (2008) (1)
Hyperbolic structure of fundamental frequency contour (2009) (1)
Semantic Parsing of Ambiguous Input using Multi Synchronous Grammars (2016) (1)
Speech-to-Speech Translation without Text (2020) (1)
Image Captioning with Visual Object Representations Grounded in the Textual Modality (2020) (1)
Hierarchical topic classification for dialog speech recognition based on language model switching (2003) (1)
Proceedings of the 17th International Conference on Spoken Language Translation (2020) (1)
Using Perturbed Length-aware Positional Encoding for Non-autoregressive Neural Machine Translation (2021) (1)
Developing Robust Baseline Acoustic Models for Noisy Speech Recognition in SPINE2 Project (2002) (1)
Dialogue Act Annotation for Statistically Managed Spoken Dialogue Systems (2008) (1)
Analysis of selective attention processing on experienced simultaneous interpreters using EEG phase synchronization (2020) (1)
An estimation method of voice timbre evaluation values using feature extraction with Gaussian mixture model based on reference singer (2016) (1)
Large Vocabulary ASR System based on the Hybrid HMM/BN model (2002) (1)
Improvements to HMM-based speech synthesis based on parameter generation with rich context models (2013) (1)
Automatic Speech Recognition (2011) (1)
Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach (2012) (1)
Dialogue Act Classification in Reference Interview Using Convolutional Neural Network with Byte Pair Encoding (2018) (1)
Length-constrained Neural Machine Translation using Length Prediction and Perturbation into Length-aware Positional Encoding (2021) (1)
Divergence optimization based on trade-off between separation and extrapolation abilities in superresolution-based nonnegative matrix factorization (2013) (1)
Spoken Language Technologies for Universal Communication (2007) (1)
Object Manipulation Dialogue by Estimating Utterance Understanding Probability in a Robot Language Acquisition Framework (2010) (1)
NOCOA: A Computer-Based Training Tool for Social and Communication Skills That Exploits Non-verbal Behaviors (2013) (1)
Facial movement synthesis by HMM from audio speech (2002) (1)
Development and application of multilingual speech translation (2009) (1)
CART-based modeling of Chinese tonal patterns with a functional model tracing the fundamental frequency trajectories (2009) (1)
Noise reduction using paired-microphones for both far-field and near-field sound sources (2001) (1)
Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR (2022) (1)
Evaluation of Sound Source Discrimination Based on HMMs Using a Microphone Array (2000) (1)
Example Based Dialogue System Based on Satisfaction Prediction (2016) (1)
Removing noise from event-related potentials using a probabilistic generative model with grouped covariance matrices (2016) (1)
An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model (Signal Processing) (2016) (1)
Deception Detection and Analysis in Spoken Dialogues based on FastText (2018) (1)
An evaluation of EEG ocular artifact removal with a multi-channel wiener filter based on probabilistic generative model (2015) (1)
VQ-based speaker adaptation applied to HMM phoneme recognition (1989) (1)
Anaphora Resolution for Transforming Regular Expressions into Honorifics in Japanese (2014) (1)
Development of "VoiceTra" Multi-Lingual Speech Translation System for Practical Use (2013) (1)
Dialogue Structure Parsing on Multi-Floor Dialogue Based on Multi-Task Learning (2021) (1)
Multi-paraphrase Augmentation to Leverage Neural Caption Translation (2018) (1)
Evaluation of 3-D N-best Search Using Path Distance-based Clustering for Recognizing Multiple Sound Sources (2000) (1)
Noise Speech Recognition based on Robust Features and A Model-Based Noise Compensation evaluated on Aurora-2 Task (2001) (1)
SUBTRACTION OF ADDITIVE NOISE FROM CORRUPTED SPEECH FOR ROBUST SPEECH RECOGNITION (2001) (1)
Low delay statistical singing voice conversion with direct waveform modification based on spectral differential considering global variance (2016) (1)
Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model (2016) (1)
Modeling varying pauses to develop robust acoustic models for recognizing noisy conversational speech (2002) (1)
Multimodal corpora for human-machine interaction research (2000) (1)
Reflection-based Word Attribute Transfer (2020) (1)
Human-Machine Communication by Audio-Visual Integration (2005) (1)
Impact of Deception Information on Negotiation Dialog Management: A Case Study on Doctor-Patient Conversations (2018) (1)
Social Affective Multimodal Interaction for Health (2020) (1)
Acquisition and Assessment of Semantic Content for the Generation of Elaborateness and Indirectness in Spoken Dialogue Systems (2017) (1)
Dialogue act annotation for consulting dialogue corpus (2009) (1)
Reinforcement Learning of Multi-Party Trading Dialog Policies (2015) (1)
Spoken Dialog System for Next Generation Knowledge Access (2008) (1)
Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis (2015) (1)
Speech recognition of foreign out-of-vocabulary words using a hierarchical language model (2006) (1)
Theoretical Analysis of Musical Noise Generation for Blind Speech Extraction with Generalized MMSE Short-Time Spectral Amplitude Estimator (2013) (1)
Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration (2014) (1)
A SUCCESSIVE STATE SPLITTING ALGORITHM BASED ON THE MDL CRITERION BY DATA-DRIVEN AND DECISION TREE CLUSTERING (2003) (1)
Creation of a multi-paraphrase corpus based on various elementary operations (2017) (1)
A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models (2016) (1)
A Robust Bimodal Speech Section Detection (2004) (1)
Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition (2018) (1)
Parser self-training for syntax-based machine translation (2015) (1)
Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation (2020) (1)
Using Local Phrase Dependency Structure Information in Neural Sequence-to-Sequence Speech Synthesis (2021) (1)
An Evaluation through Simulation of Electrolarynx Control based on Statistical F0 Prediction for Multiple Speakers (2014) (1)
Evaluation of a noise adaptive speech recognition system on the Aurora 3 database (2002) (1)
Noise reduction using paired-microphones on non-equally-spaced microphone arrangement (2003) (1)
Enhancing Event-Related Potentials Based on Maximum a Posteriori Estimation with a Spatial Correlation Prior (2016) (1)
A hearing impairment simulation method using audiogram-based approximation of auditory characteristics (2014) (1)
A Word-spotting Hypothesis Testing for Accepting/Rejecting Continuous Speech Recognition Output (2003) (1)
Non-verbal Communication Training with an Interactive Multimedia Application (2014) (1)
Dynamically Adaptive Machine Speech Chain Inference for TTS in Noisy Environment: Listen and Speak Louder (2021) (1)
Neural Machine Translation with Acoustic Embedding (2019) (1)
Oriental COCOSDA: Past, Present and Future (2006) (1)
Speech enhancement as a functional approximation and generalization (2010) (1)
A method to integrate additional knowledge sources into HMM based on junction tree decomposition (2007) (1)
Emotion Estimation from EEG Signals and Expected Subjective Evaluation (2021) (1)
Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion (2016) (1)
Recurrent Neural Network Compression Based on Low-Rank Tensor Representation (2020) (1)
Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot (2021) (1)
An evaluation of target speech for a nonaudible murmur enhancement system in noisy environments (2014) (1)
Content Browsing by Walking in Real and Cyber Spaces (2005) (1)
Toward editable web browser: edit-and-propagate operation for web browsing (2007) (1)
Evaluation of Speech Interface and Dialog Experiments for Elderly People (2005) (0)
Development of Indonesian Spoken Language Technologies for Multilingual Speech-to-Speech Translation System (2009) (0)
The Use of Bayesian Networ Accent , Gender and Wide-Context (2006) (0)
End-to-End Speech Recognition with Local Monotonic Attention (2017) (0)
Non-native ASR Utilizing Acoustic Data-driven Pronunciation Learning with Zero Knowledge of Non-native Pronunciation (2015) (0)
A Joint Model for Pause Prediction and Dependency Parsing using Latent Variables (2016) (0)
Context awareness and priority control for ITS based on automatic speech recognition (2015) (0)
Controlled Neural Response Generation by Given Dialogue Acts Based on Label-aware Adversarial Learning (2021) (0)
Predicting Emotional Responses From Spontaneous Social-Affective Interaction Data (2016) (0)
Rapid Model Adaptation with a Prior Noise GMM and Multi-SNR Models for Noisy Speech Recognition (2001) (0)
Towards real-time multilingual multimodal speech-to-speech translation (2014) (0)
Spectral analysis of esophageal speech (1999) (0)
Incorporating Dialectal Features in Synthesized Speech using Voice Conversion Techniques (2018) (0)
Actor-identified Spatiotemporal Action Detection - Detecting Who Is Doing What in Videos (2022) (0)
Towards Machine Speech-to-speech Translation (2020) (0)
Finding intermediate entity between two examples on the web (2009) (0)
MAXIMUM LIKELIHOOD SUCCESSIVE STATE SPLITTING ALGORITHM FOR TIED-MIXTURE HMNET (2012) (0)
Positive Emotion Elicitation in an Example-Based Dialogue System (2018) (0)
Analysis of acoustic models trained on a large-scale Japanese speech database (2000) (0)
On Knowledge Distillation for Translating Erroneous Speech Transcriptions (2021) (0)
Speaker and Emotion Recognition of TV-Series Data Using Multimodal and Multitask Deep Learning (2019) (0)
Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution (2017) (0)
Response Selection of Emotional Expressions for Persuasive Dialog Systems (2016) (0)
English-Read-By-Japanese Speech Synthesis Preserving Speaker Individuality Based on Partial Correction of Prosody and Phonetic Sounds and Effects of English Proficiency Level on Its Performance (2015) (0)
Incremental unsupervised training for university lecture recognition (2013) (0)
Robust digital watermarks for audio signal. (1999) (0)
R-STEINER: Generation Method of 5'UTR for Increasing the Amount of Translated Proteins (2018) (0)
Speech Recognition Using GFIKS (2009) (0)
MODEL ADAPTATION APPROACH E-CMN/PMC BASED ON CEPSTRUM MEAN NORMALIZATION AND HMM COMPOSITION FOR SPEECH RECOGNITION IN ADVERSE CAR ENVIRONMENTS (1997) (0)
A machine that learned to listen, speak, and listen while speaking (2020) (0)
Analysis of Emphasis on Japanese-English Bilingual Corpora (2014) (0)
Japanese Neural Incremental Text-to-Speech Synthesis Framework With an Accent Phrase Input (2023) (0)
A k-anonymized Text Generation Method (2017) (0)
Large Vocabulary Continuous Speech Recognition Performance in Car Environments for Various Phoneme Models (2000) (0)
Comparison of Effective Features and Analysis of Questions Towards Dialogue-based Deception Detection (2014) (0)
Hands-Free Speech Recognition in Real Environments (1999) (0)
Speech recognition features based on deep latent Gaussian models (2017) (0)
A Machine Speech Chain Approach for Dynamically Adaptive Lombard TTS in Static and Dynamic Noise Environments (2022) (0)
Entropy-based Adaptation of N-gram Language Models (1999) (0)
Effect of Dialogue Context and Topic Clustering on Out-of-Domain Detection (2005) (0)
Prediction of Depressive Tendency from Multidimensional Health Data Collected through Crowdsourcing (2018) (0)
Tackling Perception Bias in Unsupervised Phoneme Discovery Using DPGMM-RNN Hybrid Model and Functional Load (2021) (0)
Quality Improvement Approaches Based on the Modulation Spectrum to Statistical Parametric Speech Synthesis (2015) (0)
Temporal Asynchronicity Modeling by Product HMMS for Audio-Visual Speech Recognition (2002) (0)
Oriental-COCOSDA 2016: Japan country report (2016) (0)
Practical Lip-synch Tools for 3D Cartoon Animation (2011) (0)
Iterative mapping function estimation and environment structure refinement in the online phase of the ESSEM approach (Speech) (2011) (0)
Improving Spoken Language Understanding by Wisdom of Crowds (2020) (0)
An Evaluation of Articulatory Controllable Speech Modification based on Gaussian Mixture Models with Direct Waveform Modification (2015) (0)
Persuasive Dialog System Using Emotional Expressions (2018) (0)
Maximum likelihood sub-band weighting for robust speech recognition (2003) (0)
A Cross-sectional Study on the Association between the Constitution in Chinese Medicine and the risk factors of Lifestyle Diseases (2019) (0)
Relationship Between Coherence of Sequential Events and Dialogue Continuity in Conversational Response (2021) (0)
Visual Description Paraphrase Corpus Creation with Various Elementary Operations (2018) (0)
Advanced Acoustic Modeling with the Hybrid HMM/BN Framework (2004) (0)
Project Aiming to Be the Global Center for Speech and Language Research (0)
A-16-18 Key-frame Removal Method for Blendshapes-based Lip-Sync Animation (2007) (0)
Voice activity detection in a regularized reproducing kernel Hilbert space (2010) (0)
Keynote speech 3: Toward simultaneous, natural and multimodal speech-to-speech translation (2015) (0)
Key-frame removal method for blendshape-based cartoon lip-sync animation (2006) (0)
Error Selection Methods for Machine Translation Error Analysis (2016) (0)
Personalized Voice Assignment Techniques for Synchronized Scenario Speech Output in Entertainment Systems (2011) (0)
Outlier detection for acoustic model training using robust statistics (2005) (0)
Removing noise from event-related potentials using a probabilistic generative model with grouped covariance matrices. (2016) (0)
Automated social skills training with audiovisual information. (2016) (0)
Exploring CNN and DNN Bottleneck Features for Emotional Speech Recognition (2015) (0)
LONG-TERM EFFECT REMOVAL FOR NOISY SPEECH RECOGNITION (2000) (0)
Efficient Representation of Short-time Phase using Time-domain Smoothed Group Delay and its Evaluation (1997) (0)
Multi-Encoder Sequential Attention Network for Context-Aware Speech Recognition in Japanese Dialog Conversation (2021) (0)
A Noise Robust Speech Detection Method by Audio Visual Information (2001) (0)
ASR Posterior-Based Loss for Multi-Task End-to-End Speech Translation (2021) (0)
Prosody reconstruction by rescaling fundamental frequency contours in order to synthesize communicative speech (Speech) -- (International Workshop "Asian workshop on speech science and technology") (2008) (0)
Trends of Learning Technology Standard (2001) (0)
A Continuous Space Rule Selection Model for Syntax-based Statistical Machine Translation (2016) (0)
Adaptation of model parameter by HMM composition and decomposition in reverberant environments speech recognition (1996) (0)
Robust Speech Recognition by Multiple Beamforming with Reflection Signal Equalization (2001) (0)
Integrated Simultaneous Localization and Recognition of Multiple Sound Sources based on 3-D N-best Search Method (2001) (0)
Pronunciation Variation Modeling in the Literature (2011) (0)
Online Learning of Bayes Risk-Based Optimization of Dialogue Management for Document Retrieval Systems with Speech Interface (2011) (0)
Semi-Automatic Colorization Pipeline for Anime Characters and its Evaluation in Production (2022) (0)
COMPACT RECURRENT NEURAL NETWORK BASED ON TENSOR-TRAIN FOR POLYPHONIC MUSIC MODELING (2017) (0)
Multiple Beamforming with CSP-Based Source Localization (2000) (0)
Hands-Free Word Recognition in Real Noise Environment (1999) (0)
Speech to Lip Synthesis by HMM (1997) (0)
Simultaneous Recognition of Distant-Talking Speech of Multiple Talkers Based on the 3-D N-Best Search Method (2004) (0)
A Study on Natural Expressive Speech: Automatic Memorable Spoken Quote Detection (2015) (0)
ADAPTATION OF MODEL PARAMETERS TO DISTORTED SPEECH IN ADVERSE ENVIRONMENTS BY HMM DE/COMPOSITION (1998) (0)
IMPROVING ACCURACY IN PARAMETER ESTIMATION IN AN EXTENDED KALMAN PARTICLE FILTERS FOR NOISY SPEECH RECOGNITION (2001) (0)
Japanese Spontaneous Spoken Document Retrieval Using NMF-Based Topic Models (2009) (0)
Model adaptation by HMM decomposition and composition in noisy reverberant environments (2000) (0)
Neural Machine Translation Models using Binarized Prediction and Error Correction (2018) (0)
Semi-blind algorithm for joint noise suppression and dereverberation based on higher-order statistics and acoustic model likelihood (2013) (0)
An evaluation of talker localization based on direction of arrival estimation and statistical sound source identification (2002) (0)
ReMOTS: Refining Multi-Object Tracking and Segmentation (1st place solution for MOTSChallenge 2020 Track 1) (2020) (0)
The Network-based Multilingual ASR System Towards Multilingual Conversations in Medical Domain (2014) (0)
Ambient Browser: Web Browser for Daily Use (2005) (0)
Bottleneck Features for Emotional Speech Recognition (2015) (0)
Determination of Optimum Number of Groups on the Crowdsourcing Survey in Japanese People Interpreted by Physical Constitution Defined by CCMQ-J (2019) (0)
Personal photo browser that can classify photos by participants and situations (2012) (0)
Acoustic Features for Estimation of Perceptional Similarity (2007) (0)
Prosodical Analysis of Esophageal Speech (1997) (0)
Speech and Natural Language Processing in the Web Information Era (2011) (0)
Robust Sound Field Reproduction against Listener's Movement Utilizing Image Sensor (2014) (0)
Statistical approach to perceived age control of singing voice (2014) (0)
A close look into the probabilistic concatenation model for corpus-based speech synthesis (2009) (0)
An investigation of how to design control parameters for statistical voice timbre control (2017) (0)
Self-Adaptive Incremental Machine Speech Chain for Lombard TTS with High-Granularity ASR Feedback in Dynamic Noise Condition (2023) (0)
Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization (2014) (0)
Lip Motion Generation from Audio Signals based on Hidden Markov Models (1999) (0)
Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019 (2019) (0)
Unifying Speech Recognition and Generation with Machine Speech Chain (2019) (0)
Optimal divergence diversity for superresolution-based nonnegative matrix factorization (2014) (0)
Neural Incremental Speech Recognition Toward Real-Time Machine Speech Translation (2021) (0)
Language Model Adaptation and Analysis for Individuality Transforming (2014) (0)
Passive subtractive beamformer applied to line sound sources (2002) (0)
Probabilistic Enhancement of EEG Component Using Prior Distribution of Correlations Between Channels (2014) (0)
Unknown Word Detection Based on Event-Related Brain Desynchronization Responses (2015) (0)
Passive hybrid subtractive beamformer for near-field sound sources (2005) (0)
Syntactic Matching Methods in Pivot Translation (2018) (0)
Toward Entrained Response Generation for Neural Conversation Model (2020) (0)
Proceedings of the Second international conference on Spoken dialogue systems for ambient environments (2010) (0)
A Dialog System to Detect Deception (2015) (0)
Semi-Blind Optimization Scheme of Joint Suppression of Background Noise and Late Reverberation (2013) (0)
Code-Switching ASR and TTS Using Semisupervised Learning with Machine Speech Chain (2021) (0)
Graphical Framework to Incorporate Knowledge Sources (2009) (0)
Fundamental Study of Color Combinations by Using Deuteranope-Simulation Filter for Controlling the Handicap of Color Vision Diversity in Video Games (2021) (0)
Multimodal translation (2001) (0)
Multimodal Database of Negative Emotion Recovery in Dyadic Interactions : Construction and Analysis (2018) (0)
Utilizing deception information for dialog management of doctor-patient conversations (2018) (0)
Feature Transfer Learning for Wav2Text Sequence-to-Sequence ASR (2018) (0)
Towards Speech Entrainment: Considering ASR Information in Speaking Rate Variation of TTS Waveform Generation (2020) (0)
Sightseeing Guidance Systems Based on WFST-Based Dialogue Manager (2010) (0)
Neural Incremental Speech Recognition Through Attention Transfer (2020) (0)
GENERALIZED POSTERIOR PROBABILITY FOR VERIFYING RECOGNIZED WORDS OPTIMALLY IN MICROPHONE ARRAY APPLICATIONS (2005) (0)
Robust Verification of Recognized Words in Noise (2004) (0)
TRANS-AM: Discovery Method of Optimal Input Vectors Corresponding to Objective Variables (2018) (0)
SUITABLE DESIGN OF ADAPTIVE BEAMFORMER BASED ON AVERAGE SPEECH SPECTRUM FOR NOISY SPEECH RECOGNITION (2012) (0)
The 2ch hybrid subtractive beamformer applied to line sound sources (2002) (0)
Policy Reuse for Dialog Management Using Action-Relation Probability (2020) (0)
Multimodal Chain: Cross-Modal Collaboration Through Listening, Speaking, and Visualizing (2021) (0)
From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning (2019) (0)
Frame level likelihood transformations for ASR and utterance verification (2000) (0)
Corrections to "Machine Speech Chain" (2020) (0)
IJCNLP 2008 Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST): Proceedings of the Workshop (2007) (0)
Speech Translation Systems: A Corpus-Based Approach (2012) (0)
CENSREC 2 : Corpus and Evalu In Car Continuous Digit Sp (2006) (0)
Word-level Emphasis Transfer in Speech-to-speech Translation (2016) (0)
Articulatory Controllable Speech Modification using Sequential Inversion and Production Mapping with Gaussian Mixture Models (Speech) -- (16th Spoken Language Symposium) (2014) (0)
Contents Conversion for Gradual Web Browsing (2005) (0)
An acoustic modeling robust to the speaking style variation by adaptive frame shift based on the difference of the spectrum in time domain. (2001) (0)
Evaluation for WFST-based dialog management (2009) (0)
Experimental Evaluation of Superresolution-Based Nonnegative Matrix Factorization for Binaural Recording (2014) (0)
Handling Non-native Speech (2011) (0)
Automatic head-movement control for emotional speech (2005) (0)
Real Environment Acoustic Database (2004) (0)
Out-of-Domain Detection Incorporating Dialogue Context and Topic Clustering (2004) (0)
Robustness in Speech Recognition - What is needed? - (2009) (0)
Cover Final (2010) (0)
Lip-sync animation from HMM using dynamic features (2006) (0)
Recursive neural network paraphrase identification for example-based dialog retrieval (2014) (0)
Recognition and Analysis of Emotion in Indonesian Conversational Speech (Speech) -- (16th Spoken Language Symposium) (2014) (0)
Localization-Preserved Binaural Source Separation Using Channel-Wise Target Prior Estimation and Equi-Binaural Spectral Gain (2014) (0)
1. Trends of Speech-to-Speech Translation Technologies (2010) (0)
Unnecessary utterance detection for avoiding digressions in discussion (2014) (0)
Enhancing Neural Machine Translation with Image-based Paraphrase Augmentation (2019) (0)
An Unsupervised Model of Redundancy for Answer Validation (2010) (0)
Introduction and Book Overview (2009) (0)
Evaluation of model adaptation by HMM decomposition on telephone speech recognition (1998) (0)
Design and Implementation of HMM/BN Acoustic Models (2004) (0)
Reversible display: content browsing with reverse operations in mobile computing environments (2005) (0)
An Investigation of Acoustic-to-Articulatory Inversion Mapping with Latent Trajectory Gaussian Mixture Model (2016) (0)
Web text classification for response generation in spoken decision support dialogue systems (2010) (0)
An Improvement to 3-D N-best Search Using Path-Distance Based Clustering for Recognizing Multiple Sound Sources (1999) (0)
Post-recording tool for instant casting movie system (2008) (0)
Joint Noise Suppression and Dereverberation Combining Frequency-Domain Blind Signal Extraction and Multichannel Wiener Filter for Hands-Free Spoken Dialogue System (2013) (0)
Content Adaptation for Gradual Web Rendering (2005) (0)
WFST-based structural classification integrating dnn acoustic features and RNN language features for speech recognition (2015) (0)
Machine Speech Chain with Deep Learning (2018) (0)
Blend of Browse and Search: Web browse with Autonomous Retrieve of Related Information by Using Web Browsing History (2008) (0)
An Acoustic Modeling Method Robust against Changes of Speaking Style in Error Recovery (2004) (0)
Spoken Dialogue System for Persuasion Based on Multimodal Emotion Expressions (2020) (0)
Discovering intermediate entities from two examples by using web search engine indices (2010) (0)
Neural Machine Translation Improvement by Acoustic Embedding (2020) (0)
Lecture speech recognition considering the speaking rate variation (2001) (0)
Sequence-Based Pronunciation Modeling Using a Noisy-Channel Approach (2010) (0)
Incorporating Discriminative DPGMM Posteriorgrams for Low-Resource ASR (2021) (0)
RerankEverything: a reranking interface for exploring search results (2011) (0)
Selective Attention Measurement of Experienced Simultaneous Interpreters Using EEG Phase-Locked Response (2021) (0)
EnergyBrowser: Web browser for exercise (2005) (0)
Progress Report of SLP Working Group for Noisy Speech Recognition (2002) (0)
Semi-supervised Learning by Machine Speech Chain for Multilingual Speech Processing, and Recent Progress on Automatic Speech Interpretation (2019) (0)
Developments of Anthropomorphic Dialog Agent: A Plan and Development and its Significance (2001) (0)
Maximum likelihood successive state splitting algorithm for tied-mixture HMNET (1997) (0)
Image Manipulation with Unconstrained Natural Language Instruction using Source Image Masking (2018) (0)
Recognition and Translation of Japanese-English Code-switching Speech for Monolingual Speakers (2020) (0)
Experimental Evaluation of Postfilter-Based Nonnegative Matrix Factorization with Statistical Model Parameter Estimation (2014) (0)
Message of the O-COCOSDA Convener (2014) (0)
F0 Contour Generation Using Rich Context Models in HMM-Based Speech Synthesis (2013) (0)
PLUM: A Photograph Browser with a Layout-Upon-Maps Algorithm (2016) (0)
PREDICTION FOR ELECTROLARYNGEAL SPEECH ENHANCEMENT CONSIDERING GENERATIVE PROCESS OF F0 CONTOURS WITHIN PRODUCT OF EXPERTS FRAMEWORK (2016) (0)
Weakly-Supervised Speech-to-Text Mapping with Visually Connected Non-Parallel Speech-Text Data Using Cyclic Partially-Aligned Transformer (2021) (0)
Incorporating Event Causality to Re-ranking for Conversational Dialogue Responses (2018) (0)
Eurospeech 2001 - Scandinavia: Noise Reduction Using Paired-microphones for Both Far-field and Near-field Sound Sources (2001) (0)
Linguistic Features during Speech Utterances in the Context of Social Skills Training (2020) (0)
Eye Gaze-based Unknown Word Detection in Non-native Language Reading using SVMs and Random Forests (2016) (0)
MIXTURE OF FACTOR ANALYZED HMM (2002) (0)
A Dialog System with Human-to-Human Conversation Example (2014) (0)
Properties of Non-native Speech (2011) (0)
A Speech Translation System Using Mobile Devices and a Field Experiment (2005) (0)
reproducing kernel Hilbert space (2010) (0)
Affect-sensitive Dialogue Response Generation for Positive Emotion Elicitation (2019) (0)
Special Section on Statistical Modeling for Speech Processing (2006) (0)
Word and Dialogue Act Entrainment Analysis based on User Profile (2016) (0)
Complementary Combination of Microphone Array and HMM Composition for Noisy Speech Recognition (2001) (0)
The NAIST English speech recognition system for IWSLT 2013 (2013) (0)
On Oriental COCOSDA: East-Asian Chapter of International Coordinating Committee on Speech Databases and Speech Input/Output Systems Assessment (2007) (0)
A sampling-based environment population projection approach for rapid acoustic model adaptation (2011) (0)
Feature Inference Based on Label Propagation on Wikidata Graph for DST (2017) (0)
Distilling Knowledge from a Multi-scale Deep CNN Ensemble for Robust and Light-weight Acoustic Modeling (2018) (0)
A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources (2002) (0)
Constituency Parsing by Cross-Lingual Delexicalization (2021) (0)

Other Resources About Satoshi Nakamura

What Schools Are Affiliated With Satoshi Nakamura?

Satoshi Nakamura is affiliated with the following schools: