Why not use markers when the duration of the audio exceed 30s? #12

ILG2021 · 2023-05-19T06:26:01Z

No description provided.

EtienneAb3d · 2023-05-19T07:06:36Z

@ILG2021
Two reasons:

1. Good marker recognition depends heavily on the "prompt" text. But Whisper processes audio files in 30 s segments, and the "prompt" is used only on the first 30 s. After that, without the "prompt" information, it is much less effective at producing good marker outputs.
2. In my tests, noise and silence removal alone is sufficient to produce quite good results on files longer than 30 s, without hallucination (but it is not sufficient on shorter files).
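
To make the first reason concrete, here is a minimal sketch of the windowing described above. It is only an illustration, not Whisper's or WhisperHallu's code: `plan_windows` and its parameters are hypothetical names, and it assumes fixed, non-overlapping 30 s windows, whereas real Whisper moves each window forward based on the timestamps it predicts.

```python
import math

WINDOW_SECONDS = 30.0  # Whisper works on 30 s windows


def plan_windows(duration_s, prompt):
    """Return (start, end, prompt_or_None) for each window.

    Hypothetical helper: it models the behaviour described in the reply,
    where only the first window receives the prompt text.
    """
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / WINDOW_SECONDS)
    windows = []
    for i in range(count):
        start = i * WINDOW_SECONDS
        end = min(start + WINDOW_SECONDS, duration_s)
        # Assumption: the prompt is available to window 0 only.
        windows.append((start, end, prompt if i == 0 else None))
    return windows


if __name__ == "__main__":
    for start, end, p in plan_windows(75.0, "marker prompt"):
        print(f"{start:5.1f}-{end:5.1f} s  prompt={p!r}")
```

Under these assumptions, a 75 s file yields three windows, and only the first one sees the prompt, which is why markers would be recognised less reliably after 30 s.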

ILG2021 · 2023-05-19T14:47:08Z

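Yes, WhisperHallu is a very good solution for silent passages, which cause YouTube-ad-like output. But it does not seem to solve the repeated-sentence problem. I have a Chinese fine-tuned faster-whisper model, and when I speak two similar sentences, the model produces repeated sentences. I don't know why. The result is:

你可以保存，保存在你的手机上或者是笔记本电脑上，这样你可以直接找一些朋友，如果说在你非常忙的时候你没有时间跟他们一句一句聊天，你要很长时间的话，你可以直接发送这些短语，这样的话会让他们看见你真贴心，让他们夸赞你真贴心，让他们看你真贴心，让他们夸赞你真贴心，让他们看见你真贴心，让他们夸赞你真贴心，让他们看见你真贴心，让他们夸赞你真贴心，让他们看见你真贴心，让他们夸赞你真贴心，让他们看见你

(English translation: You can save them, save them on your phone or laptop, so you can go straight to some friends, and if you are very busy and have no time to chat with them sentence by sentence, if that would take you a long time, you can just send these phrases, and that will let them see how thoughtful you are, let them praise how thoughtful you are, let them see how thoughtful you are, let them praise how thoughtful you are, let them see how thoughtful you are, let them praise how thoughtful you are, let them see how thoughtful you are, let them praise how thoughtful you are, let them see how thoughtful you are, let them praise how thoughtful you are, let them see you)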
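
A loop like the one in this output can at least be detected after transcription. The sketch below is a hypothetical post-processing check, not part of WhisperHallu or faster-whisper: `longest_loop` and `find_loop` are illustrative names, and the thresholds are arbitrary assumptions. It splits the text into clauses and looks for a clause sequence that repeats with a short period.

```python
import re


def longest_loop(clauses, period):
    """Longest run of clauses that each equal the clause `period` steps back."""
    best = run = 0
    for i in range(period, len(clauses)):
        if clauses[i] == clauses[i - period]:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def find_loop(text, max_period=3, min_run=4):
    """Return the smallest repeat period found, or None if no loop is seen.

    Clauses are split on ASCII and full-width commas and periods.
    `max_period` and `min_run` are arbitrary illustrative thresholds.
    """
    clauses = [c.strip() for c in re.split(r"[，,。.]", text) if c.strip()]
    for period in range(1, max_period + 1):
        if longest_loop(clauses, period) >= min_run:
            return period
    return None


if __name__ == "__main__":
    sample = "a, b, see, praise, see, praise, see, praise, see, praise"
    print(find_loop(sample))
```

This only flags the symptom using exact clause matches. It does not explain why the model loops, and small variations between repeats shorten the run it can find.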

ILG2021 changed the title ~~why not use marker when the duration exceed 30s?~~ Why not use markers when the duration of the audio exceed 30s? May 19, 2023
