Wav2vec resources
Seongjin Park edited this page Jul 7, 2021 · 13 revisions
Wav2Vec tools:
- Hugging Face
  - Advantage: compatible with other Hugging Face layers.
  - Disadvantage: does not currently support a language model for decoding; support is expected in a future update.
  - Good resource to look at: Fine-Tune Wav2Vec for English ASR with Huggingface Transformers
  - To follow that resource with your own custom dataset, take a look at this notebook.
- Fairseq
  - Advantage: supports a language model for decoding, and all hyperparameters live in a YAML file.
  - Disadvantage: requires substantial compute (the original study used 24 GPUs, and it runs out of memory (OOM) on Google Colab).
  - TODO: Upload the custom scripts written to preprocess the data structure.
  - Install flashlight:
    - How to fix the CANNOT FIND FFTW3LibraryDependency error: GitHub issue
    - Export the MKLROOT path: GitHub issue
- SpeechBrain
  - Advantage: compatible with Hugging Face, supports a language model, and has good documentation.
  - Disadvantage: requires Python >= 3.8, and you need to understand the structure of its YAML files and custom functions.
  - TODO:
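
To make the "no language model for decoding" point in the Huggingface entry concrete: without a language model, a CTC-trained Wav2Vec model is typically decoded greedily, by taking the most likely token per frame, collapsing repeats, and dropping the blank token. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name `ctc_greedy_decode`, the toy vocabulary, and the choice of id 0 as blank and `|` as word delimiter are assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Greedy CTC decoding: collapse repeated ids, drop blanks, join tokens.

    frame_ids holds the arg-max token id for each audio frame.
    blank_id and word_delimiter are assumed values for this toy example.
    """
    # Step 1: merge consecutive duplicates (one id per run of equal frames).
    collapsed = [token_id for token_id, _run in groupby(frame_ids)]
    # Step 2: remove the blank token that separates genuine repeats.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: turn the word delimiter into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary (hypothetical): id 0 is the CTC blank, id 1 separates words.
vocab = {0: "<pad>", 1: "|", 2: "H", 3: "E", 4: "L", 5: "O"}

# The blank between the two runs of "L" is what keeps the double letter.
frames = [2, 2, 3, 0, 4, 4, 0, 4, 5, 5, 1]
print(ctc_greedy_decode(frames, vocab))  # prints HELLO
```

A language-model decoder, as available in Fairseq and SpeechBrain, replaces the per-frame arg-max with a beam search that also scores candidate word sequences, which is why it can correct spellings that greedy decoding gets wrong.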
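
Until the preprocessing scripts mentioned in the Fairseq TODO are uploaded, the sketch below shows roughly what such preprocessing involves. It is not the author's script. It assumes the manifest layout written by fairseq's `wav2vec_manifest.py` (first line is the audio root directory, each following line is a relative path and a frame count separated by a tab) and the letter-target layout used for CTC fine-tuning (characters separated by spaces, `|` after each word). The helper names `write_manifest`, `to_letter_target`, and `make_silent_wav` are hypothetical, and only `.wav` files are handled so that the standard library suffices.

```python
import os
import tempfile
import wave


def write_manifest(audio_root, manifest_path):
    """Write a TSV manifest: root dir first, then '<relative path>\t<frames>'."""
    audio_root = os.path.abspath(audio_root)
    rows = []
    for dirpath, _dirs, filenames in os.walk(audio_root):
        for name in filenames:
            if not name.lower().endswith(".wav"):
                continue  # this sketch only reads WAV via the stdlib
            full_path = os.path.join(dirpath, name)
            with wave.open(full_path, "rb") as wav_file:
                num_frames = wav_file.getnframes()
            rows.append((os.path.relpath(full_path, audio_root), num_frames))
    rows.sort()  # deterministic order across runs
    with open(manifest_path, "w", encoding="utf-8") as out:
        out.write(audio_root + "\n")
        for rel_path, num_frames in rows:
            out.write(f"{rel_path}\t{num_frames}\n")
    return rows


def to_letter_target(transcript):
    """'HELLO WORLD' -> 'H E L L O | W O R L D |' (assumed letter format)."""
    return " ".join(" ".join(word) + " |" for word in transcript.split())


def make_silent_wav(path, num_frames, sample_rate=16000):
    """Create a mono 16-bit silent WAV file, used only for the demo below."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)


# Demo on a throwaway directory so the sketch runs without real data.
with tempfile.TemporaryDirectory() as demo_dir:
    make_silent_wav(os.path.join(demo_dir, "utt1.wav"), 16000)
    manifest = os.path.join(demo_dir, "train.tsv")
    print(write_manifest(demo_dir, manifest))
    print(to_letter_target("HELLO WORLD"))
```

For a real dataset, the transcripts would be written one per line, in the same order as the manifest rows, to a label file next to the manifest. The exact file names and casing have to match the dictionary used for fine-tuning.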