podkastomat/README.md
2025-01-23 16:37:16 +10:30

Podkastomat

🧰 A tool for automatically downloading and transcribing podcasts, with the option to translate them into English.

📚 Built for language learners.

Usage

To add a podcast:

./update-config add

This will ask you for the name, language code, and the RSS URL of the podcast, as well as whether you want the main process to automatically download the latest episode of this podcast when it runs.

By default, podcasts are transcribed but not translated. To enable translation into English for podcasts in a given language (German, in this example):

./update-config translate de

To process all configured podcasts:

./process

To fetch and process the earliest 3 episodes of a particular podcast:

./process 'some podcast' old 3

Built-in help for additional options is available via ./process --help.

Downloaded episodes, along with generated transcripts and translations, are stored in podcasts/{language}/{podcast_name}, e.g. podcasts/de/mission_klima_-_lösungen_für_die_krise
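
To help locate output files from your own scripts, here is a minimal Python sketch of how such a path could be built. It assumes the folder name is the podcast title lowercased with spaces replaced by underscores, which matches the example above but is not confirmed by the tool's source. The function name episode_dir and the original title spelling are hypothetical.

```python
from pathlib import Path


def episode_dir(language: str, podcast_name: str, root: str = "podcasts") -> Path:
    """Hypothetical helper: build podcasts/{language}/{podcast_name}."""
    # Assumption: lowercase the title and replace spaces with underscores.
    # Podkastomat's real naming rule may differ (e.g. for punctuation).
    folder = podcast_name.lower().replace(" ", "_")
    return Path(root) / language / folder


if __name__ == "__main__":
    # Assumed original title of the example podcast.
    print(episode_dir("de", "Mission Klima - Lösungen für die Krise").as_posix())
```

Non-ASCII characters such as ö are kept as-is, as in the example directory name.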

Manual configuration

Instead of using ./update-config, feel free to edit the config.json file directly.

System Requirements

  • Linux (but it probably works on other platforms 🤷)
  • Python 3 (tested on 3.8.10)
  • One or both of the following:
    • Whisper (supports transcripts and translations)
    • Vosk (supports transcripts)
  • Mutagen
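
To check these requirements before a first run, something like the following sketch may help. It assumes the dependencies are installed as Python packages importable as whisper, vosk and mutagen. Those import names are an assumption about your setup, and the script is not part of Podkastomat.

```python
import importlib.util
from typing import List

# Assumed import names; adjust if your installation differs.
BACKENDS = {"Whisper": "whisper", "Vosk": "vosk"}
REQUIRED = {"Mutagen": "mutagen"}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def check_requirements(installed=is_installed) -> List[str]:
    """Return a list of problems; an empty list means everything was found."""
    problems = []
    # At least one transcription backend is needed.
    if not any(installed(module) for module in BACKENDS.values()):
        problems.append("install Whisper or Vosk (at least one is needed)")
    for label, module in REQUIRED.items():
        if not installed(module):
            problems.append(f"install {label}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Missing:", problem)
```

This only confirms that the modules can be found, not that they work. Note that translation needs Whisper, since Vosk supports transcripts only.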

Notes

Whisper (used by default for transcriptions and translations) is quite slow and resource-intensive.
It may be worth running the process script overnight, e.g. as a cron job.