Hung-yi Lee

Hung-yi Lee's AcademicInfluence.com Rankings

Computer Science: World Rank #9352, Historical Rank #9825
Algorithms: World Rank #378, Historical Rank #383
Computational Linguistics: World Rank #2243, Historical Rank #2266
Machine Learning: World Rank #4069, Historical Rank #4118

Hung-yi Lee's Degrees

PhD, Electrical Engineering, National Taiwan University
Masters, Electrical Engineering, National Taiwan University
Bachelors, Electrical Engineering, National Taiwan University

Hung-yi Lee's Published Works

Number of citations in a given year to any of this author's works

Total number of citations to an author for the works they published in a given year. This highlights publication of the most important work(s) by the author

Published Works

Dependency Parsing (2011) (347)
Temporal pattern attention for multivariate time series forecasting (2018) (336)
SUPERB: Speech processing Universal PERformance Benchmark (2021) (294)
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders (2019) (242)
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech (2020) (200)
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder (2016) (166)
One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization (2019) (142)
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations (2018) (115)
Spoken Content Retrieval—Beyond Cascading Speech Recognition with Text Retrieval (2015) (102)
Audio Albert: A Lite Bert for Self-Supervised Learning of Audio Representation (2020) (98)
Tree Transformer: Integrating Tree Structures into Self-Attention (2019) (97)
LAMOL: LAnguage MOdeling for Lifelong Language Learning (2019) (93)
Learning Chinese Word Representations From Glyphs Of Characters (2017) (75)
Supervised and Unsupervised Transfer Learning for Question Answering (2017) (73)
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection (2016) (61)
Meta Learning for End-To-End Low-Resource Speech Recognition (2019) (60)
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning (2019) (59)
SpeechBERT: An Audio-and-Text Jointly Learned Language Model for End-to-End Spoken Question Answering (2019) (58)
Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension (2018) (58)
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture (2020) (57)
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks (2018) (56)
Again-VC: A One-Shot Voice Conversion Using Activation Guidance and Adaptive Instance Normalization (2020) (51)
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs (2019) (50)
Distilhubert: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit Bert (2021) (47)
Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation (2018) (46)
Self-Supervised Speech Representation Learning: A Review (2022) (46)
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection (2018) (45)
Fragmentvc: Any-To-Any Voice Conversion by End-To-End Extracting and Fusing Fine-Grained Voice Fragments with Attention (2020) (43)
Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation (2018) (43)
One-Shot Voice Conversion by Vector Quantization (2020) (43)
Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine (2016) (43)
ODSQA: Open-Domain Spoken Question Answering Dataset (2018) (42)
Adversarial Attacks on Spoofing Countermeasures of Automatic Speaker Verification (2019) (41)
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model (2019) (41)
Towards Audio to Scene Image Synthesis Using Generative Adversarial Network (2018) (41)
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model (2018) (40)
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and its Correlation with Phoneme Boundaries (2017) (39)
Noise Adaptive Speech Enhancement using Domain Adversarial Training (2018) (37)
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning (2019) (37)
Recurrent neural network based language model personalization by social network crowdsourcing (2013) (36)
Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Recurrent Neural Networks (2016) (35)
SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering (2019) (34)
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining (2020) (34)
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech (2021) (34)
Defense Against Adversarial Attacks on Spoofing Countermeasures of ASV (2020) (34)
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings (2018) (31)
Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models (2021) (29)
Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval (2018) (29)
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities (2022) (29)
Utilizing Self-supervised Representations for MOS Prediction (2021) (28)
Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process (2012) (27)
Enhanced Spoken Term Detection Using Support Vector Machines and Weighted Pseudo Examples (2013) (27)
Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition (2013) (27)
Scalable Sentiment for Sequence-to-Sequence Chatbot Response with Performance Analysis (2018) (26)
Improved spoken term detection with graph-based re-ranking in feature space (2011) (26)
DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation (2020) (26)
Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation (2019) (25)
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations (2021) (25)
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition (2021) (25)
Improved spoken term detection by feature space pseudo-relevance feedback (2010) (25)
Defending Your Voice: Adversarial Attack on Voice Conversion (2020) (24)
Deep Long Audio Inpainting (2019) (23)
Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech (2018) (23)
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis (2020) (23)
Personalizing Recurrent-Neural-Network-Based Language Model by Social Network (2017) (22)
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification (2022) (21)
Towards Robust Neural Vocoding for Speech Generation: A Survey (2019) (21)
Towards End-to-end Speech-to-text Translation with Two-pass Decoding (2019) (21)
Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning (2020) (21)
Pretrained Language Model Embryology: The Birth of ALBERT (2020) (20)
Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering (2019) (20)
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion (2019) (20)
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages (2014) (20)
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning (2021) (19)
LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning (2019) (19)
Understanding Self-Attention of Self-Supervised Audio Transformers (2020) (19)
Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning (2014) (19)
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only (2018) (18)
Improving Unsupervised Style Transfer in end-to-end Speech Synthesis with end-to-end Speech Recognition (2018) (18)
Improved spoken term detection using support vector machines with acoustic and context features from pseudo-relevance feedback (2011) (18)
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation (2019) (18)
Rhythm-Flexible Voice Conversion Without Parallel Data Using Cycle-GAN Over Phoneme Posteriorgram Sequences (2018) (18)
Towards Lifelong Learning of End-to-end ASR (2021) (18)
Query-by-Example Spoken Term Detection Using Attention-Based Multi-Hop Networks (2017) (17)
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data (2018) (17)
Voting for the right answer: Adversarial defense for speaker verification (2021) (16)
What Does a Network Layer Hear? Analyzing Hidden Representations of End-to-End ASR Through Speech Synthesis (2019) (15)
Order-free Learning Alleviating Exposure Bias in Multi-label Classification (2019) (15)
Cross-Lingual Transfer Learning for Question Answering (2019) (15)
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation (2020) (14)
Mitigating the impact of speech recognition errors on chatbot using sequence-to-sequence model (2017) (14)
Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs (2014) (14)
Hierarchical attention model for improved machine comprehension of spoken content (2016) (13)
Meta Learning for Natural Language Processing: A Survey (2022) (13)
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech (2021) (13)
Self-Supervised Deep Learning for Fisheye Image Rectification (2020) (13)
Bound States of Dispersion-Managed Solitons From Single-Mode Yb-Doped Fiber Laser at Net-Normal Dispersion (2015) (13)
Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity (2012) (13)
Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph (2012) (13)
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations (2021) (13)
Improved lattice-based spoken document retrieval by directly learning from the evaluation measures (2009) (13)
Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models (2019) (13)
Improved spoken term detection using support vector machines based on lattice context consistency (2011) (13)
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU (2020) (12)
Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training (2020) (12)
Personalized Dialogue Response Generation Learned from Monologues (2019) (12)
Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models (2019) (12)
Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns (2013) (12)
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets (2021) (12)
Interrupted and Cascaded Permutation Invariant Training for Speech Separation (2019) (11)
An iterative deep learning framework for unsupervised discovery of speech features and linguistic units with applications on spoken term detection (2015) (11)
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation (2021) (11)
Recognition of highly imbalanced code-mixed bilingual speech with frame-level language detection based on blurred posteriorgram (2012) (11)
Structuring lectures in massive open online courses (MOOCs) for efficient learning by linking similar sections and predicting prerequisites (2015) (11)
Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications (2012) (11)
Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity (2014) (11)
Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection (2012) (11)
Spoken question answering using tree-structured conditional random fields and two-layer random walk (2014) (11)
Interactive spoken content retrieval by extended query model and continuous state space Markov Decision Process (2013) (11)
Unsupervised two-stage keyword extraction from spoken documents by topic coherence and support vector machine (2012) (11)
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks (2022) (10)
Integrating recognition and retrieval with user feedback: A new framework for spoken term detection (2010) (10)
Interactive Spoken Content Retrieval by Deep Reinforcement Learning (2016) (10)
Towards unsupervised semantic retrieval of spoken content with query expansion based on automatically discovered acoustic patterns (2013) (10)
A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT (2020) (9)
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation (2022) (9)
How Far Are We from Robust Voice Conversion: A Survey (2020) (9)
Analyzing The Robustness of Unsupervised Speech Recognition (2021) (9)
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery (2022) (9)
Detection of Oral Dysplastic and Early Cancerous Lesions by Polarization-Sensitive Optical Coherence Tomography (2020) (9)
Sequence-to-Sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding (2019) (8)
Spotting adversarial samples for speaker verification by neural vocoders (2021) (8)
Training Code-Switching Language Model with Monolingual Data (2019) (8)
Recurrent Neural Network Based Personalized Language Modeling by Social Network Crowdsourcing (2013) (8)
Towards structured deep neural network for automatic speech recognition (2015) (8)
Pre-Training a Language Model Without Human Language (2020) (8)
Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets (2019) (8)
Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for words and subword units (2009) (8)
Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD (2019) (7)
Don't Speak Too Fast: The Impact of Data Bias on Self-Supervised Speech Models (2021) (7)
Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback (2010) (7)
Personalizing universal recurrent neural network language model with user characteristic features by social network crowdsourcing (2015) (7)
Query-based Attention CNN for Text Similarity Map (2017) (7)
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning (2021) (7)
AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks (2022) (7)
Non-Autoregressive Mandarin-English Code-Switching Speech Recognition (2021) (6)
Improved Speech Summarization and Spoken Term Detection with Graphical Analysis of Utterance Similarities (2011) (6)
Alignment of spoken utterances with slide content for easier learning with recorded lectures using structured support vector machine (SVM) (2014) (6)
Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process (2012) (6)
Further Boosting BERT-based Models by Duplicating Existing Layers: Some Intriguing Phenomena inside BERT (2020) (6)
Meta Learning and Its Applications to Natural Language Processing (2021) (6)
Abstractive headline generation for spoken content by attentive recurrent neural networks with ASR error modeling (2016) (6)
Semantic retrieval of personal photos using a deep autoencoder fusing visual features with speech annotations represented as word/paragraph vectors (2015) (6)
Seeing and hearing too: Audio representation for video captioning (2017) (6)
Mitigating Biases in Toxic Language Detection through Invariant Rationalization (2021) (6)
Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification (2018) (6)
A framework integrating different relevance feedback scenarios and approaches for spoken term detection (2010) (5)
Generative Adversarial Network and its Applications to Speech Signal and Natural Language Processing (2018) (5)
SpeechNet: A Universal Modularized Model for Speech Processing Tasks (2021) (5)
Proximal Policy Optimization and its Dynamic Version for Sequence Generation (2018) (5)
Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning (2021) (5)
One Shot Learning for Speech Separation (2020) (5)
Semantic query expansion and context-based discriminative term modeling for spoken document retrieval (2012) (5)
Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine (2012) (5)
Utterance-level latent topic transition modeling for spoken documents and its application in automatic summarization (2012) (5)
Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variables (2013) (5)
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations Without Target Language Data (2017) (5)
Temporal pattern attention for multivariate time series forecasting (2019) (5)
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability (2021) (5)
End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training (2020) (5)
Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation (2020) (5)
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion (2022) (5)
Unsupervised domain adaptation for spoken document summarization with structured support vector machine (2013) (5)
The Ability of Self-Supervised Speech Models for Audio Representations (2022) (5)
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection (2018) (5)
Attention-based Memory Selection Recurrent Network for Language Modeling (2016) (4)
CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning (2021) (4)
On Compressing Sequences for Self-Supervised Speech Models (2022) (4)
Lifting motion planning for humanoid robots (2014) (4)
Superb @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning (2022) (4)
Personalized word representations carrying personalized semantics learned from social network posts (2017) (3)
Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification (2017) (3)
DUAL: Textless Spoken Question Answering with Speech Discrete Unit Adaptive Learning (2022) (3)
Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices (2013) (3)
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model (2022) (3)
Exploring Efficient-Tuning Methods in Self-Supervised Speech Models (2022) (3)
Re-Examining Human Annotations for Interpretable NLP (2022) (3)
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation (2022) (3)
Adversarial Sample Detection for Speaker Verification by Neural Vocoders (2021) (3)
What makes multilingual BERT multilingual? (2020) (3)
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation (2020) (3)
TopicGAN: Unsupervised Text Generation from Explainable Latent Topics (2018) (3)
Investigation of Sentiment Controllable Chatbot (2020) (3)
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network (2022) (3)
A Adversarial Attack (2021) (3)
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models (2021) (3)
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines (2021) (2)
Improving Automatic Speech Recognition and Speech Translation via Word Embedding Prediction (2021) (2)
Classical and Quantum Mechanics with Poincare-Snyder Relativity (2010) (2)
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information (2022) (2)
Language Representation in Multilingual BERT and its applications to improve Cross-lingual Generalization (2020) (2)
Anticipation-Free Training for Simultaneous Machine Translation (2022) (2)
Self-supervised Representation Learning for Speech Processing (2022) (2)
Spoofing-Aware Speaker Verification by Multi-Level Fusion (2022) (2)
Semantic retrieval of personal photos using matrix factorization and two-layer random walk fusing sparse speech annotations with visual features (2014) (2)
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering (2022) (2)
SADDEL: Joint Speech Separation and Denoising Model based on Multitask Learning (2020) (2)
MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing (2019) (2)
An initial attempt to improve spoken term detection by learning optimal weights for different indexing features (2010) (2)
Put Chatbot into Its Interlocutor’s Shoes: New Framework to Learn Chatbot Responding with Intention (2021) (2)
Membership Inference Attacks Against Self-supervised Speech Models (2021) (2)
Recurrent Neural Network based language modeling with controllable external Memory (2017) (2)
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings (2022) (2)
Interactive Spoken Content Retrieval by Deep Reinforcement Learning (2018) (2)
Multi-accent Speech Separation with One Shot Learning (2021) (2)
TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation (2020) (2)
Attention-based CNN Matching Net (2017) (2)
Toward Degradation-Robust Voice Conversion (2021) (2)
Poincare-Snyder Relativity with Quantization (2009) (2)
Personalized acoustic modeling by weakly supervised multi-task deep learning using acoustic tokens discovered from unlabeled data (2017) (2)
On the Efficiency of Integrating Self-Supervised Learning and Meta-Learning for User-Defined Few-Shot Keyword Spotting (2022) (1)
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5 (2022) (1)
Using Deep-Q Network to Select Candidates from N-best Speech Recognition Hypotheses for Enhancing Dialogue State Tracking (2019) (1)
Structured Prompt Tuning (2022) (1)
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores (2022) (1)
Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer (2020) (1)
Parallel Synthesis for Autoregressive Speech Generation (2022) (1)
Adversarial Defense for Automatic Speaker Verification by Self-Supervised Learning (2021) (1)
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning (2020) (1)
Looking for Clues of Language in Multilingual BERT to Improve Cross-lingual Generalization (2020) (1)
Compressing Transformer-based self-supervised models for speech processing (2022) (1)
Model Extraction Attack against Self-supervised Speech Models (2022) (1)
Improving Cross-Lingual Reading Comprehension with Self-Training (2021) (1)
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks (2022) (1)
MelHuBERT: A simplified HuBERT on Mel spectrogram (2022) (1)
Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator (2018) (1)
Once-for-All Sequence Compression for Self-Supervised Speech Models (2022) (1)
Domain Independent Key Term Extraction from Spoken Content Based on Context and Term Location Information in the Utterances (2018) (1)
J-Net: Randomly weighted U-Net for audio source separation (2019) (1)
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding (2022) (1)
On the Utility of Self-Supervised Models for Prosody-Related Tasks (2022) (1)
Multiple pulses and harmonic mode locking from passive mode-locked Ytterbium doped fiber in anomalous dispersion region (2015) (1)
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition (2022) (1)
Few-shot Prompting Towards Controllable Response Generation (2022) (1)
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs (2023) (1)
Interrupted and Cascaded Permutation Invariant Training for Speech Separation (2019) (0)
General Framework for Self-Supervised Model Priming for Parameter-Efficient Fine-tuning (2022) (0)
Improving the transferability of speech separation by meta-learning (2022) (0)
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis (2022) (0)
Personalizing a Universal Recurrent Neural Network Language Model with User Characteristic Features by Crowdsourcing over Social Networks (2015) (0)
Multi-modal User Intent Classification Under the Scenario of Smart Factory (Student Abstract) (2021) (0)
Anticipation-free Training for Simultaneous Translation (2022) (0)
Editorial of Special Issue on Self-Supervised Learning for Speech and Audio Processing (2022) (0)
Unsupervised Multiple Choices Question Answering: Start Learning from Basic Knowledge (2020) (0)
Unsupervised Deep Learning based Multiple Choices Question Answering: Start Learning from Basic Knowledge (2020) (0)
SUPERB: Speech processing Universal PERformance Benchmark (2021) (0)
The Efficacy of Self-Supervised Speech Models for Audio Representations (2022) (0)
Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding (2022) (0)
Can Large Language Models Be an Alternative to Human Evaluations? (2023) (0)
Personalized speech recognizer with keyword-based personalized lexicon and language model using word vector representations (2015) (0)
Learning to Generate Prompts for Dialogue Generation through Reinforcement Learning (2022) (0)
Through the Lens of Neural Network: Analyzing Neural QA Models via Quantized Latent Representation (2019) (0)
Parallelized Reverse Curriculum Generation (2021) (0)
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences (2023) (0)
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection (2022) (0)
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings (2019) (0)
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers (2020) (0)
Characterizing the Fusion Strategies for Spoofing-Aware Speaker Verification (2022) (0)
EURO: ESPnet Unsupervised ASR Open-source Toolkit (2022) (0)
Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work (2022) (0)
Retrieval with Different Types of Actions Optimized by a Markov Decision Process (2012) (0)
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval (2022) (0)
Teaching Machine How to Think by Natural Language: A study on Machine Reading Comprehension (2018) (0)
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning (2023) (0)
Guest Editorial Special Issue on Adversarial Learning in Computational Intelligence (2020) (0)
Learning Interpretable and Discrete Representations with Adversarial Training for Unsupervised Text Classification (2020) (0)
Searching for the Essence of Adversarial Perturbations (2022) (0)
Introducing Semantics into Speech Encoders (2022) (0)
Adversarial Rap Lyric Generation (2022) (0)
Multimodal Transformer Distillation for Audio-Visual Synchronization (2022) (0)
Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification (2022) (0)
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models (2022) (0)
A Fully Integrated 1.7mW Attention-Based Automatic Speech Recognition Processor (2022) (0)
Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue (2023) (0)
Spoken Content Retrieval by Deep Reinforcement Learning (2016) (0)
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks (2023) (0)
Ensemble knowledge distillation of self-supervised speech models (2023) (0)
Jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine (2012) (0)
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR (2022) (0)
A Multi-layered Acoustic Tokenizing Deep Neural Network (MAT-DNN) for Unsupervised Discovery of Linguistic Units and Generation of High Quality Features (2015) (0)
SUPERB: Speech Understanding and PERformance Benchmark (2021) (0)
Exploring Continuous Integrate-and-Fire for Efficient and Adaptive Simultaneous Speech Translation (2022) (0)

What Schools Are Affiliated With Hung-yi Lee?

Hung-yi Lee is affiliated with the following schools:

National Taiwan University
