This is a repository for abstracts of international conference papers related to text-to-speech and voice conversion.
@supikiti, @YotaUedaa
- TRANSFORMER-BASED TEXT-TO-SPEECH WITH WEIGHTED FORCED ATTENTION
- SCALABLE MULTILINGUAL FRONTEND FOR TTS
- GENERATING DIVERSE AND NATURAL TEXT-TO-SPEECH SAMPLES USING A QUANTIZED FINE-GRAINED VAE AND AUTOREGRESSIVE PROSODY PRIOR
- An Effective Style Token Weight Control Technique for End-to-End Emotional Speech Synthesis
- IMPROVING END-TO-END SPEECH SYNTHESIS WITH LOCAL RECURRENT NEURAL NETWORK ENHANCED TRANSFORMER
- TEACHER-STUDENT TRAINING FOR ROBUST TACOTRON-BASED TTS
- FULLY-HIERARCHICAL FINE-GRAINED PROSODY MODELING FOR INTERPRETABLE SPEECH SYNTHESIS
- TRANSFERRING NEURAL SPEECH WAVEFORM SYNTHESIZERS TO MUSICAL INSTRUMENT SOUNDS GENERATION
- F0-CONSISTENT MANY-TO-MANY VOICE CONVERSION VIA CONDITIONAL AUTOENCODER
- Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks
- END-TO-END ACCENT CONVERSION WITHOUT USING NATIVE UTTERANCES
- One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization
- Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data
- Scalable Factorized Hierarchical Variational Autoencoder Training
- Investigation of using disentangled and interpretable representations for one-shot cross-lingual voice conversion
- EXPLORING SPEECH ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS FOR ROBUST SPEECH RECOGNITION
- MTGAN: Speaker Verification through Multitasking Triplet Generative Adversarial Network