doc2convo • walterra.dev

I spent parts of last weekend on a side hustle using Claude Code to find out how far I could get to come up with a poor man’s cli-only version of the turn-a-document-into-a-podcast pattern.

doc2convo is what I came up with. It’s basically 2 python scripts you can pipe together to turn URLs, PDFs or markdown files into a mp3 podcast with 2 hosts.

doc2md-convo.py handles reading an URL or doc and uses Anthropic’s Claude SDK to transpose the source into a conversational markdown file. It supports an optional source prompt to influence the discussion.
md-convo2mp3.py then parses the generated file and maps speakers to distinct voices using Microsoft Edge’s TTS service.

Here’s a self-referential example:

python3 doc2md-convo.py README.md \
-s "cut the basics for a technical audience, throw in some meta sarcasm" \
| python3 md-convo2mp3.py - -o README-podcast.mp3

You get the idea.

More fun with system prompts:

python3 doc2md-convo.py README.md \
-s "the hosts are 2 x-files agents that got access to this new case. ALEX is undercover MULDER. JORDAN is undercover SCULLY. MULDER insists on paranormal findings. SCULLY sticks to the facts. They occasionally break character or the 4th wall." \
| python3 md-convo2mp3.py - -o README-x-files.mp3

Here’s a podcast that roasts my website:

python3 doc2md-convo.py https://walterra.dev \
-s "Make it a roasting comedy show" \
| python3 md-convo2mp3.py - -o walterra-dev-roast.mp3

For reference, here’s the system prompt to create the markdown conversation:

prompt_template = """You are tasked with creating a conversational podcast transcript from web content.

Create a dialogue between two podcast hosts named ALEX and JORDAN. They should discuss the content in an engaging, informative way with natural conversation flow.

Key requirements:
- Format as markdown with **SPEAKER:** prefix for each line
- ALEX tends to be {alex_role}
- JORDAN tends to be {jordan_role}
- Include natural conversation elements like reactions, follow-up questions, and transitions
- You must not use any code blocks or markdown formatting other than the **SPEAKER:** prefix
- You must not use any markdown elements like *laughs* or *pauses*; instead, use natural dialogue
- Make it feel like a real podcast discussion, not just reading facts
- Keep it engaging and accessible
- Length should be substantial but not excessive (aim for 15-25 exchanges)

{system_prompt_section}

Source: {source}
Title: {title}

Content to discuss:
{content}...

Generate the podcast transcript:"""

Here’s the repo with more documentation: https://github.com/walterra/doc2convo

Depending on your preferences, you could use different APIs for the generation of course, but since I have an Anthropic subscription I went with that to create the markdown files. And edge-tts doesn’t require yet another paid API so it seems like a good compromise for this quick PoC.

There are quite some other NotebookLM clones out there including UIs, but I stuck to this CLI only variant to be able to maybe easily include it into some other automated workflows, like a chatbot that would turn bookmarks into audio transcripts inspired by geoffreylitt’s stevensDemo.