Releases: Huanshere/VideoLingo

v2.1.2

13 Dec 02:12

Added gemini-2.0-flash-exp. End of story.

v2.1.1

10 Dec 09:00

Release Notes

🐛 Bug Fixes:

  • Use empty audio as a fallback for failed TTS tasks and handle empty lines in TTS task generation.
  • Handle multiple spaces when merging words and allow multiple splits in second segmentation.
  • Add support for Grok beta and resolve compatibility issues in the askgpt function.
  • Pin ctranslate2 to version 4.5.0 to avoid errors.
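
The empty-audio fallback described above can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: `write_silence`, `synthesize_or_silence`, and the `tts` callable are hypothetical names, and the 100 ms duration and 16 kHz rate are assumed values.

```python
import wave
from pathlib import Path
from typing import Callable


def write_silence(path: Path, duration_ms: int = 100, rate: int = 16000) -> None:
    """Write a mono 16-bit WAV file that contains only silence."""
    n_frames = rate * duration_ms // 1000
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_frames)


def synthesize_or_silence(tts: Callable[[str, Path], None], text: str, path: Path) -> bool:
    """Run a TTS call; on an empty line or a failure, write silence instead.

    Returns True if the TTS call succeeded, False if the fallback was used.
    """
    if not text.strip():  # empty line: nothing to synthesize
        write_silence(path)
        return False
    try:
        tts(text, path)  # hypothetical backend call
        return True
    except Exception:  # any backend failure falls back to silence
        write_silence(path)
        return False
```

Writing a short silent clip keeps every task paired with an audio file, so later steps that expect one file per line do not fail on a missing path.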

v2.1.0

05 Dec 06:45

Release Notes

🚀 New Features:

  • Added support for custom terms.
  • Added a custom TTS setting.
  • Added support for Deepseekcoder.
  • Added support for pure local operation using Ollama and Edge-TTS.
  • Unified the TTS methods through 302ai integration; a single 302ai key now gives access to the full functionality.

🐛 Bug Fixes:

  • Fixed the NVIDIA GPU check.
  • Added a check for [br] in the response splitting step.
  • Avoided errors by not checking the source audio bit rate.

🔧 Improvements:

  • Removed automatic FFmpeg installation; FFmpeg must now be installed at the system level.
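
Because FFmpeg now has to be present on the system, a quick check before running can save a confusing failure later. The sketch below is an assumption about how such a check could look, not code from the project; `find_executable` and `require_executable` are hypothetical helper names built on the standard `shutil.which`.

```python
import shutil
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def require_executable(name: str = "ffmpeg") -> str:
    """Return the executable's path, or raise a clear error if it is missing."""
    path = find_executable(name)
    if path is None:
        raise RuntimeError(
            f"{name} was not found on PATH; install it at the system level first."
        )
    return path
```

Calling `require_executable()` at startup would surface a missing FFmpeg immediately rather than midway through processing.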

v2.0.4

03 Dec 10:14

Release Notes

🚀 New Features:

  • Added Demucs configuration to the Streamlit sidebar and increased the vocal volume after Demucs processing. Recommended for videos with loud background music.
  • Added memory cleanup after Demucs and improved Demucs audio quality.

🐛 Bug Fixes:

  • Reduced subtitle font size to prevent double-line subtitles.
  • Fixed the "All same length" error. Note: This error may still occur with smaller models or reverse-engineered APIs.
  • Added handling for exceeding dubbing length limits, now truncating instead of throwing an error.
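
The truncate-instead-of-fail behavior for overlong dubbing can be illustrated on raw PCM data. This is a hedged sketch only: `truncate_pcm` is a hypothetical helper, and the 16 kHz mono 16-bit defaults are assumptions rather than the project's actual settings.

```python
def truncate_pcm(
    frames: bytes,
    max_ms: int,
    rate: int = 16000,
    sample_width: int = 2,
    channels: int = 1,
) -> bytes:
    """Cut raw PCM audio to at most max_ms milliseconds.

    Audio that already fits is returned unchanged; nothing is raised.
    """
    bytes_per_frame = sample_width * channels
    max_frames = rate * max_ms // 1000
    # Slice on a frame boundary so no sample is split in half.
    return frames[: max_frames * bytes_per_frame]
```

Slicing on a whole-frame boundary keeps the remaining audio valid, at the cost of dropping the tail of the clip.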

🔧 Improvements:

  • Optimized prompt splitting to handle repeated text.
  • Improved language style and removed the concise requirement from translation prompts.
  • Updated install.py to roll back to the v2.0 installation method, removing local third-party installation to avoid errors.
  • Installation now directly installs whisperX and Demucs from Git, avoiding dependency conflicts.

v2.0.3

02 Dec 04:19
dd7ac06

Release Notes

December 2, 2024

🐛 Bug Fixes:

  • Fixed an alignment error in the voice generation task.
  • Resolved an audio reading issue during the whisper process.
  • Loosened the audio speed change error tolerance to reduce errors.
  • Applied stricter JSON constraints.

🔧 Improvements:

  • Added memory cleanup in batch mode to improve performance.

v2.0.2-deprecated

01 Dec 08:18
dd7ac06

Release Notes

🐛 Bug Fixes:

  • Removed the one-click installation script due to instability on some computers.
  • Removed conda installation of FFmpeg due to issues on Windows.
  • Fixed issues with GPT-SoVITS English support.
  • Reverted the local installation of FFmpeg.

🔧 Improvements:

  • Set the audio bitrate to 32000 to improve recognition accuracy.

v2.0.1-deprecated

27 Nov 09:20
dd7ac06

🔧 Improvements:

  • Simplified installation: Windows users no longer need to install CUDA manually, and the installation of FFmpeg and whisperX has been streamlined.
  • Minor fixes: resolved various small issues to improve performance and user experience.

v2.0

17 Nov 09:02
495e407

V 2.0

🚀 New Features:

  • The default LLM has been switched to SiliconFlow's Qwen-72B, which is exceptionally useful!
  • Integrated SiliconFlow's Fish TTS cloning voiceover feature, making VideoLingo accessible with just one key!
  • Solved the inconsistency issue between dubbing and subtitles by using on-screen subtitles for dubbing tasks!
  • Introduced OneKeyInstall to simplify the installation process, so there's no need to understand coding anymore!

🐛 Bug Fixes:

  • Fixed a critical error causing subtitles to become misaligned after multiple splits.
  • Resolved the issue with Rich box import.
  • Fixed an FFmpeg error.
  • Replaced MoviePy due to an error it caused.

🔧 Improvements:

  • Enhanced code readability.
  • Added checks for Hugging Face's mirror sites.
  • Modified the audio extraction method to retain the original audio, compressing only during Whisper processing to improve audio quality.
  • Added batch execution of TTS.

v1.8.0

13 Nov 16:25

Release Notes

🔧 Improvements:

  • Refined and optimized the overall code structure for better performance and maintainability.
  • Enhanced the prompt to be more concise and applicable to a wider range of models.
  • Improved fuzzy and precise matching in translation processes.

🐛 Bug Fixes:

  • Fixed several errors occurring during the FFmpeg compression process.
  • Resolved an issue with phrase errors caused by lack of initialization and null returns from model translations.

📝 Updates:

  • Demucs vocal separation is no longer performed by default before transcription, addressing the issue of missing sentences and improving processing speed.
  • Removed support for the whisperX replicate API to simplify the open-source project.
  • Adjusted the translation process to handle smaller segments, reducing the likelihood of errors.

v1.7.1

11 Nov 07:09

Release Notes

🚀 New Features:

  • Added MPS support for Demucs to improve performance.
  • Implemented an error retry mechanism for batch processing.
  • Set the default to use the Gemini model and updated documentation accordingly.
  • Added an auto-update feature for yt-dlp.

🐛 Bug Fixes:

  • Corrected a long video segmentation error.
  • Fixed loading issues with the local Chinese Whisper model.
  • Improved audio splitting robustness and encoding handling.
  • Resolved issues with handling reference audio prerequisites for GPT-SoVITS batch processing.
  • Correctly implemented retry on translation failures.
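
Retrying a failed translation call, as the last fix mentions, generally follows the pattern below. This is a generic sketch and not the project's implementation: the `retry` helper, its `attempts` and `delay` parameters, and the wrapped callable are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry(fn: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Call fn until it succeeds or the attempt budget is used up."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # wait before the next try
    # Every attempt failed: surface the last underlying error as the cause.
    raise RuntimeError(f"failed after {attempts} attempts") from last_error
```

Chaining the final exception to the last underlying error keeps the original failure visible in the traceback.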

🔧 Improvements:

  • Updated the CPU-specific torch version in the installation process.
  • Simplified the prompt reasoning chain, since the longer chain brought only minimal improvement.

📝 Updates:

  • Removed the install option in the OneKey batch script.
  • Added a section for the SaaS website in the documentation.
