This repository contains a Python script (tacotron2_preprocessor.py) that preprocesses audio files for training a Tacotron 2 text-to-speech model. The script trims silence, normalizes the audio, and saves the processed files to a specified output folder. It is designed to work with .wav files and helps create a clean, consistent dataset for Tacotron 2 model training.
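The three steps described above (trim, normalize, save) can be sketched with Librosa and SoundFile as follows. This is a minimal illustration and not necessarily how tacotron2_preprocessor.py is written: the function name `preprocess_wav`, the `top_db=30` silence threshold, and the choice of peak normalization are all assumptions.

```python
from pathlib import Path


def preprocess_wav(in_file, out_file, top_db=30):
    """Trim edge silence, peak-normalize, and save one .wav file.

    Hypothetical helper; top_db=30 is an assumed threshold, not a value
    taken from the script. Returns the processed duration in seconds.
    """
    # Imported inside the function so this sketch can be loaded even on a
    # machine where the audio libraries are not installed yet.
    import librosa
    import soundfile as sf

    # sr=None keeps the file's native sampling rate instead of resampling.
    samples, sample_rate = librosa.load(in_file, sr=None)

    # Drop leading and trailing audio quieter than top_db below the peak.
    trimmed, _ = librosa.effects.trim(samples, top_db=top_db)

    # Scale so the largest absolute sample value is 1.0.
    normalized = librosa.util.normalize(trimmed)

    # Create the output folder if needed, then write the result.
    Path(out_file).parent.mkdir(parents=True, exist_ok=True)
    sf.write(out_file, normalized, sample_rate)
    return len(normalized) / sample_rate
```

A lower `top_db` trims more aggressively, so it is worth listening to a few processed files before running a whole dataset through it.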
- Python 3.6 or higher
- Librosa
- SoundFile
- Clone this repository to your local machine.
git clone https://github.com/yourusername/tacotron2-audio-preprocessor.git
- Install the required libraries:
pip install librosa soundfile
- Update the input_path and output_path variables in the tacotron2_preprocessor.py script to point to your input folder containing the .wav files and the desired output folder for the processed files.
input_path = "path\\to\\your\\input_folder"
output_path = "path\\to\\your\\output_folder"
- Run the tacotron2_preprocessor.py script:
python tacotron2_preprocessor.py
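For the input_path and output_path step, the escaped backslashes can be avoided with `pathlib`, since Python accepts forward slashes in paths on Windows as well. The sketch below is illustrative only: the folder names are placeholders, and `plan_outputs` is a hypothetical helper (not part of the script) that pairs each .wav file in the input folder with a same-named file in the output folder.

```python
from pathlib import Path

# Placeholder folders for illustration; substitute your own locations.
input_path = Path("path/to/your/input_folder")
output_path = Path("path/to/your/output_folder")


def plan_outputs(input_dir, output_dir):
    """Pair every .wav file in input_dir with a same-named output file.

    Hypothetical helper: the extension match is case-insensitive and
    subfolders are not searched.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    return [
        (wav, output_dir / wav.name)
        for wav in sorted(input_dir.iterdir())
        if wav.is_file() and wav.suffix.lower() == ".wav"
    ]
```

Calling `plan_outputs(input_path, output_path)` before a run is a quick way to check that the input folder is found and contains the files you expect.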