This is an automatic parser for Amazon Transcribe jobs (podcast episodes, in this case) that outputs the transcripts as HTML. It is compatible with Python 2.7 and Python 3+.
To use it, populate an input file called `input.txt`, where each line is semicolon-separated and contains the name of an Amazon Transcribe output JSON file followed by a comma-separated, ordered list of speakers.

For example, the following `input.txt` file will result in the iterative processing of the files `episode_1.json`, `episode_2.json` and `episode_3.json`. `speaker_1`, `speaker_2` and `speaker_3` will replace the automatically generated placeholders `spk_0`, `spk_1` and `spk_2`. The output HTML files will be named after the `jobName` field of each input JSON file.
```
episode_1.json;speaker_1,speaker_2
episode_2.json;speaker_2,speaker_3
episode_3.json;speaker_1,speaker_2,speaker_3
```
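
For illustration, here is a minimal sketch of how an `input.txt` in this format might be parsed into (JSON file, speaker list) pairs. It is not the actual logic of `process_aws_output.py`, and the function name `read_input_file` is purely an example:

```python
# Illustrative sketch only -- not the actual process_aws_output.py code.
# Parses input.txt into (json_filename, ordered_speaker_list) pairs.
def read_input_file(path="input.txt"):
    jobs = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json_file, speakers = line.split(";", 1)
            jobs.append((json_file.strip(),
                         [name.strip() for name in speakers.split(",")]))
    return jobs

# For the example above, read_input_file() would return:
# [('episode_1.json', ['speaker_1', 'speaker_2']),
#  ('episode_2.json', ['speaker_2', 'speaker_3']),
#  ('episode_3.json', ['speaker_1', 'speaker_2', 'speaker_3'])]
```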
Once you've created your `input.txt` file and placed it in the same directory as `process_aws_output.py`, simply run the script with Python:

```
$ python process_aws_output.py
SUCCESS!
```

A `SUCCESS!` message is expected, signifying that all the HTML outputs have been stored in the same directory.
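
For reference, the speaker substitution and output naming described above could look something like the sketch below, assuming the standard Amazon Transcribe output layout (a top-level `jobName` field and `spk_0`, `spk_1`, ... speaker labels). The function `speaker_mapping` is illustrative only and is not part of the script:

```python
import json

# Illustrative sketch only: pair the ordered speaker names from input.txt with
# Amazon Transcribe's automatically generated spk_0, spk_1, ... labels, and
# derive the output HTML file name from the job's "jobName" field.
def speaker_mapping(json_file, speaker_names):
    with open(json_file) as handle:
        data = json.load(handle)
    mapping = dict(("spk_%d" % index, name)
                   for index, name in enumerate(speaker_names))
    output_html = data["jobName"] + ".html"
    return mapping, output_html
```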
Please don't hesitate to ask questions or request changes or improvements via the Issues section.
If you're feeling generous, donations are welcome:
BTC: 1QFNgTV3GQby8uv3mXwLKBHAgKUEenSREd
ETH: 0xa7350d9fb3c6193759b587bb984f0dfe3568c8ed
LTC: LW3SNJ61CXUfRQTpehpDfV7vv1iVdLh9En
ADA: DdzFFzCqrhtBbS7o5LQ3u1ZxFVz3Q6b2bQ86FEYanf6UsRgK6D3So4grpZEHPXcitQWEuRfnAA7jzi3xmj9Md6kng2UiVn4QLxEsAefK
BCH: 1QFNgTV3GQby8uv3mXwLKBHAgKUEenSREd