Is guiding the pronunciation possible? #488

S-T-K · 2025-01-10T11:23:50Z

Is there a way to influence the pronunciation, pacing, and emotion in the TTS output?
For instance, in ElevenLabs, placing quotation marks around a word can create stronger emphasis. The only methods I’ve found to actively control pacing are punctuation marks (e.g., . , ; : ? !) and ellipses or dashes for pauses; see https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-tricks-to-get-the-model-to-say-things-correctly
Any other adjustments appear to be ignored.
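
To make the punctuation approach concrete, here is a minimal sketch of a text pre-processing step that inserts a pause mark before the text is sent to the TTS engine. The helper name `add_pause` and the `PAUSE_MARKS` mapping are hypothetical, and whether a given model actually turns these marks into audible pauses is an assumption that has to be checked per model.

```python
import re

# Hypothetical mapping from pause length to punctuation.
# Assumption: the TTS model reacts to these marks; not every model does.
PAUSE_MARKS = {"short": ",", "medium": ";", "long": "..."}


def add_pause(text: str, after_word: str, length: str = "long") -> str:
    """Insert a pause mark after the first whole-word match of after_word.

    Words that are already followed by punctuation are left untouched.
    """
    mark = PAUSE_MARKS[length]
    # \b keeps "it" from matching inside "Wait"; the lookahead skips
    # occurrences that already end in a punctuation mark.
    pattern = re.compile(rf"\b({re.escape(after_word)})\b(?![,;:.!?])")
    return pattern.sub(lambda m: m.group(1) + mark, text, count=1)


if __name__ == "__main__":
    print(add_pause("Wait for it then go", "it"))
    print(add_pause("Hello world", "Hello", "short"))
```

This only rewrites the input string; it does not call any TTS API, so it can sit in front of whichever engine is in use.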

finefin · 2025-01-17T13:05:36Z

That depends on the model. A new model was released a few days ago that seems to be able to do what you ask for:
https://huggingface.co/OuteAI/OuteTTS-0.3-1B
