Paste a script · get a mixed podcast episode

Annotate the script with speaker names. The Agent maps each speaker to a voice through `vb tts`, renders the lines, drives ffmpeg to lay them on a timeline with BGM that ducks under speech, and exports a master MP3 plus per-speaker stems. Loudness defaults to the Apple Podcasts target of -16 LUFS integrated.

A peek at what you get

Voice cast
Timeline & mix
Loudness report
Delivery files

Voice cast · each speaker locked to a voice

Picked from `vb tts providers` and saved with this episode. If a character recurs in episode 13, they keep the same voice. No drift, no re-explaining.

Host
ElevenLabs
eden · warm male · 30s

Anchors the conversation · neutral US English · light pacing

Guest
OpenAI tts-1-hd
shimmer · founder energy · 40s

Brings the heat · expressive · code-switches into Mandarin cleanly

BGM
Vecbase Free Library
calm-loops/glass-piano-04 · CC-0

Sidechain-ducked under speech · loops cleanly to fill 28 min

Timeline + sidechain-ducked BGM

Speech segments push the BGM down 8 dB underneath them — drawn here for the first 35 seconds so you can see exactly how the mix is built.

Tracks: Host · Guest · BGM (time axis 00:00 to 00:35)

vb tts synthesize · host:eden / guest:shimmer · 412 lines · 48 kHz
ffmpeg amix + sidechaincompress · threshold=-32 ratio=8 release=200ms
loudnorm · I=-16 LUFS · TP=-1 · LRA=8 · pass=2
bgm.loop · calm-loops/glass-piano-04 · 28-min seamless

Delivery package · Apple Podcasts loudness target

master.mp3 · 38 MB · 320 kbps
host.stem.mp3 · 14 MB
guest.stem.mp3 · 12 MB
bgm.stem.mp3 · 8 MB

How it works

Step 01

Mark up the script

Tag who says what: "[Host]:", "[Guest]:", or your own naming. Plain text, Markdown, .docx, and even Fountain screenplay format all work. The Agent infers tone, age, and energy from the surrounding text and picks voices accordingly.
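For a sense of what the tagged format gives the pipeline to work with, here is a minimal sketch of splitting a script into speaker turns. It only handles the "[Name]:" convention mentioned above; the `Line` type and `parse_script` function are hypothetical names for illustration, not part of `vb tts`, and the continuation rule for untagged lines is an assumption.

```python
import re
from typing import NamedTuple


class Line(NamedTuple):
    speaker: str
    text: str


# Matches a line such as "[Host]: Welcome back." (hypothetical parsing rule).
TAG = re.compile(r"^\[(?P<speaker>[^\]]+)\]:\s*(?P<text>.*)$")


def parse_script(script: str) -> list[Line]:
    """Split a tagged script into (speaker, text) turns."""
    lines: list[Line] = []
    for raw in script.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        m = TAG.match(raw)
        if m:
            lines.append(Line(m["speaker"].strip(), m["text"].strip()))
        elif lines:
            # Assumption: an untagged line continues the previous speaker's turn.
            prev = lines[-1]
            lines[-1] = Line(prev.speaker, f"{prev.text} {raw}".strip())
        # Untagged text before the first tag is dropped in this sketch.
    return lines


script = "[Host]: Welcome back.\n\n[Guest]: Thanks for having me.\nIt's been a while.\n"
turns = parse_script(script)
```

Each resulting turn carries one speaker name, which is the key a voice mapping would be looked up by.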

Step 02

Pick the delivery format

Two-host podcast with BGM? 30s commercial in three lengths? Audiobook chapter with separate character voices? Tell the Agent what the deliverable is and the right pipeline (BGM ducking / loudness target / chapter markers) snaps into place.

Step 03

Pick up the master + stems

A zip lands in your Drive: master.mp3 ready to ship, per-speaker stems for the editor, a separate BGM stem, and a loudness compliance report (integrated, true peak, range). Hand the stems to your editor for tweaks — you do not need to re-run from scratch.

Why Vecbase for this

It mixes — not just stitches

Most AI voiceover tools concatenate clips with ugly transitions. The Agent runs ffmpeg with sidechain compression so the BGM ducks under speech, fades every line in and out, and matches loudness across speakers so the guest doesn't suddenly sound twice as loud as the host.
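To make the ducking idea concrete, here is a rough sketch of an ffmpeg filter graph built around `sidechaincompress` and `amix`, the two filters named on this page. The parameter values (threshold -32, ratio 8, release 200 ms) come from the page; reading the threshold as dB and converting it to the linear value ffmpeg expects is an assumption, as are the file names and the helper functions. The Agent's actual graph may differ.

```python
def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def duck_filter(threshold_db: float = -32.0, ratio: float = 8.0,
                release_ms: float = 200.0) -> str:
    """Build a filter graph where speech (input 0) ducks the BGM (input 1)."""
    threshold = db_to_linear(threshold_db)  # sidechaincompress takes a linear threshold
    return (
        # Split speech: one copy is heard, one copy keys the compressor.
        "[0:a]asplit=2[speech][key];"
        # First input is the signal to compress (BGM), second is the sidechain.
        f"[1:a][key]sidechaincompress=threshold={threshold:.6f}"
        f":ratio={ratio:g}:release={release_ms:g}[ducked];"
        # Mix speech with the ducked BGM; stop when the speech track ends.
        # Note: amix rescales its inputs by default.
        "[speech][ducked]amix=inputs=2:duration=first[mix]"
    )


def ffmpeg_args(speech: str, bgm: str, out: str) -> list[str]:
    """Assemble an argument list; hypothetical file names are passed in."""
    return [
        "ffmpeg", "-y",
        "-i", speech,
        "-stream_loop", "-1", "-i", bgm,  # loop the BGM under the whole episode
        "-filter_complex", duck_filter(),
        "-map", "[mix]",
        out,
    ]


args = ffmpeg_args("speech.wav", "bgm.mp3", "mix.wav")
```

The list can be handed to `subprocess.run(args, check=True)` on a machine with ffmpeg installed. Building the graph as a string first keeps the compressor settings easy to inspect and adjust.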

Voice picks are anchored to the script, not random

The Agent reads the script context ("the founder is in her 40s, calm but direct") and picks a voice that fits — and once picked, the same character keeps that voice across episodes. The mapping is saved as a small `.json` so the next chapter / next episode is automatically consistent.
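As an illustration of how a saved mapping keeps voices stable, here is a small sketch. The JSON layout (character name mapped to `provider` and `voice`) and the helper names are assumptions made for this example; the real file's schema is not documented on this page.

```python
import json


def lock_voice(cast: dict, character: str, provider: str, voice: str) -> dict:
    """Record a voice for a character; an existing pick is never overwritten."""
    return cast.setdefault(character, {"provider": provider, "voice": voice})


def dump_cast(cast: dict) -> str:
    """Serialize the mapping; ensure_ascii=False keeps non-Latin names readable."""
    return json.dumps(cast, indent=2, ensure_ascii=False, sort_keys=True)


def parse_cast(text: str) -> dict:
    """Load a saved mapping, treating an empty file as an empty cast."""
    return json.loads(text) if text.strip() else {}


# Episode 1: pick voices and save the mapping.
cast = parse_cast("")
lock_voice(cast, "Host", "ElevenLabs", "eden")
lock_voice(cast, "Guest", "OpenAI tts-1-hd", "shimmer")
saved = dump_cast(cast)

# A later episode: reload, and a conflicting pick for "Host" is ignored.
next_episode = parse_cast(saved)
lock_voice(next_episode, "Host", "OpenAI tts-1-hd", "shimmer")
```

Because the first pick wins, a recurring character resolves to the same provider and voice no matter what a later run would have chosen.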

Loudness compliance is baked in

Each delivery target has a different loudness spec: Apple Podcasts wants -16 LUFS integrated, broadcast (EBU R 128) wants -23 LUFS, and social video is commonly mixed to around -14 LUFS. Pick the target and the Agent measures the final mix and re-renders until it passes. The compliance report is included in the zip, so a loudness check should not bounce your upload.
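The measure-then-correct loop can be sketched with ffmpeg's `loudnorm` filter, which supports a two-pass workflow: pass one prints measurements as JSON, pass two feeds them back in. The target values below are the ones quoted on this page; the `TARGETS` keys, function names, and sample measurements are hypothetical, and this is not necessarily how the Agent drives the filter.

```python
# Integrated loudness targets in LUFS, as quoted on this page.
TARGETS = {"apple_podcasts": -16.0, "broadcast": -23.0, "social": -14.0}


def loudnorm_pass1(target_i: float, tp: float = -1.0, lra: float = 8.0) -> str:
    """Pass 1: analyze only, and print the measurements as JSON."""
    return f"loudnorm=I={target_i:g}:TP={tp:g}:LRA={lra:g}:print_format=json"


def loudnorm_pass2(target_i: float, measured: dict,
                   tp: float = -1.0, lra: float = 8.0) -> str:
    """Pass 2: normalize using the values measured in pass 1."""
    return (
        f"loudnorm=I={target_i:g}:TP={tp:g}:LRA={lra:g}"
        f":measured_I={measured['input_i']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_thresh={measured['input_thresh']}"
        f":offset={measured['target_offset']}"
        ":linear=true:print_format=summary"
    )


# Made-up measurements in the shape loudnorm prints after pass 1.
measured = {
    "input_i": "-21.30",
    "input_tp": "-3.10",
    "input_lra": "6.20",
    "input_thresh": "-31.60",
    "target_offset": "0.40",
}

first = loudnorm_pass1(TARGETS["apple_podcasts"])
second = loudnorm_pass2(TARGETS["apple_podcasts"], measured)
```

Each string would be passed to ffmpeg as an audio filter (`-af`). Switching delivery targets then only means looking up a different entry in `TARGETS`.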

Per-speaker stems mean the editor can finish

You can take the host stem, the guest stem, and the BGM stem straight into Pro Tools / Logic / Audition. No "AI black box": every layer is exposed, every level is documented. The Agent did the boring 80%; your editor adds the last 20% of polish.

Frequently asked

Which voice provider does the Agent use?

Whichever is best for the language and tone. `vb tts` exposes ElevenLabs (warm, English-dominant), OpenAI tts-1-hd (clean, multilingual), ByteDance Doubao (字节豆包; best for Mandarin, Cantonese, and dialects), and others, and the Agent picks based on the script. You can pin a provider in chat ("use ElevenLabs only"), and `vb tts providers` lists everything available.

Get yours in under 90 seconds

Sign in, hand it over to the Agent — the finished file lands in your Drive.