The pipeline · top to bottom

How a paste becomes a podcast.

flexVox is structured around three tabs — Script, Production, Export. Each step builds on the last, but every tab is always accessible. You can jump back to the script while audio is generating; the app remembers where you were.

01 · Script

Paste your script. Confirm the lines we weren't sure about.

The editor uses a serif typeface and generous spacing — a calm, literary writing feel. Type or paste any dialogue script. The parser detects speakers across multiple formats:

HOST: Welcome to the show.
[Host] Welcome to the show.
(Host) Welcome to the show.
A standalone name on its own line, dialogue beneath.

Every detected attribution gets a confidence score. Low-confidence lines are highlighted; the rest is already done. Batch-assign unreviewed turns or merge duplicate speakers with one tap.
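As a rough illustration of the formats above, here is a minimal Python sketch of speaker detection with a confidence score per turn. The regular expressions, the `Turn` type, and the confidence numbers are assumptions made for the example; flexVox's actual parser and scoring are not documented here.

```python
import re
from dataclasses import dataclass
from typing import List

# Inline attribution formats, each with a placeholder confidence.
# These values are illustrative only, not flexVox's real scores.
INLINE_PATTERNS = [
    (re.compile(r"^([A-Z][A-Za-z0-9 .'-]*):\s*(.+)$"), 0.95),  # HOST: text
    (re.compile(r"^\[([^\]]+)\]\s+(.+)$"), 0.90),              # [Host] text
    (re.compile(r"^\(([^)]+)\)\s+(.+)$"), 0.85),               # (Host) text
]

# A short capitalized line with no sentence after it may be a standalone name.
NAME_ONLY = re.compile(r"^[A-Z][\w'-]*(?: [A-Z][\w'.-]*){0,2}$")


@dataclass
class Turn:
    speaker: str
    text: str
    confidence: float


def parse_script(script: str) -> List[Turn]:
    """Return the attributed turns found in a pasted script."""
    lines = [ln.strip() for ln in script.splitlines()]
    turns: List[Turn] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line:
            continue
        for pattern, confidence in INLINE_PATTERNS:
            match = pattern.match(line)
            if match:
                turns.append(Turn(match.group(1).strip(), match.group(2).strip(), confidence))
                break
        else:
            # Standalone name with dialogue on the next line: the weakest signal,
            # so it gets the lowest score and would be flagged for review.
            if NAME_ONLY.match(line) and i < len(lines) and lines[i]:
                turns.append(Turn(line, lines[i], 0.60))
                i += 1
            # Anything else (stage directions, stray text) is skipped in this sketch.
    return turns
```

Turns below some threshold (for example 0.7) would be the ones highlighted for confirmation; the threshold is likewise an assumption.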

[SCENE: cold open]

ALEX: You hear that?

ALEX: The whole city is on a delay.

JORDAN: [whispers] Three seconds behind the lightning.

[SFX: distant thunder (3s) @underlay volume=0.4]

What the parser sees
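The SFX tag in the sample packs a prompt, a duration, a placement flag, and a volume into one line. The sketch below shows one plausible way to split such a tag into fields. The `SfxCue` type and the token rules are assumptions inferred from the single example above, not a specification of the tag grammar.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Matches a duration token such as "(3s)" or "(2.5s)".
DURATION = re.compile(r"^\((\d+(?:\.\d+)?)s\)$")


@dataclass
class SfxCue:
    prompt: str
    duration_s: Optional[float] = None
    underlay: bool = False
    options: Dict[str, str] = field(default_factory=dict)


def parse_sfx_tag(line: str) -> Optional[SfxCue]:
    """Parse a line like "[SFX: distant thunder (3s) @underlay volume=0.4]"."""
    line = line.strip()
    if not (line.startswith("[SFX:") and line.endswith("]")):
        return None  # not an SFX tag (for example, a [SCENE: ...] marker)
    cue = SfxCue(prompt="")
    words = []
    for token in line[len("[SFX:"):-1].split():
        duration = DURATION.match(token)
        if duration:
            cue.duration_s = float(duration.group(1))
        elif token == "@underlay":
            cue.underlay = True
        elif "=" in token:
            key, value = token.split("=", 1)
            cue.options[key] = value
        else:
            words.append(token)  # everything else is part of the sound prompt
    cue.prompt = " ".join(words)
    return cue
```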

02 · Voice Mapping

Give every character a distinct voice.

Open the Voice Library to search the full ElevenLabs catalog, preview voices instantly, and assign them. While a casting session is open, the library auto-advances to the next unvoiced speaker as you go, and a counter ("3 of 5 cast") sits at the top.

Or tap Auto-Cast All and let flexVox pick distinct voices for your whole cast based on each speaker's role in the script — manually-assigned voices are preserved.
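The key property of auto-casting described above is that manual choices survive and every speaker ends up with a different voice. A minimal sketch of that rule follows; the function name and the plain list of voice names are hypothetical, and the role-based matching that flexVox performs is deliberately left out (this version simply takes the next unused voice).

```python
from typing import Dict, List


def auto_cast(speakers: List[str], manual: Dict[str, str], catalog: List[str]) -> Dict[str, str]:
    """Fill in voices for unvoiced speakers without touching manual picks."""
    cast = dict(manual)                      # manual assignments are kept as-is
    used = set(cast.values())
    available = [voice for voice in catalog if voice not in used]
    for speaker in speakers:
        if speaker in cast:
            continue                         # already cast by hand
        if not available:
            raise ValueError("not enough distinct voices for the cast")
        cast[speaker] = available.pop(0)     # placeholder for role-based selection
    return cast
```

For example, `auto_cast(["ALEX", "JORDAN"], {"JORDAN": "Roan"}, ["Roan", "Briar"])` leaves JORDAN on Roan and gives ALEX the remaining voice.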

Open Fine-tune Voice per speaker to dial stability, similarity, style, and speed. The fine-tune section is collapsed by default so your default view stays clean.

Cast all speakers · 2 of 2 cast

ALEX · warm, considered, narrator-leaning

Briar · multilingual v2

JORDAN · hushed, close-mic, slight rasp

Roan · turbo v2.5

03 · Generation

Generate dialogue, SFX, and music — in one pass.

The Ready screen is one summary: turn count, speaker count, a background music toggle, and a generation mode picker. Nothing hidden behind disclosure groups. Tap Generate.

flexVox calls ElevenLabs for every line — speech with bidirectional voice context for natural continuity, SFX from your tags, music from your prompts. Background music (when enabled) generates in a second pass sized to your finished episode duration.

When something fails, generation doesn't stop. Failed turns get a badge and an error detail; network errors and rate limits are retried automatically. Cancel anytime — your already-generated audio is preserved.
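The behavior above (retry transient errors, record permanent failures, keep going) can be sketched as a small loop. Everything named here is hypothetical: `TransientError` stands in for network and rate-limit errors, `synthesize` stands in for the speech call, and the attempt count and backoff delays are placeholder values.

```python
import time
from typing import Callable, Dict, List, Tuple


class TransientError(Exception):
    """A network error or rate limit: worth retrying."""


def generate_all(
    turns: List[str],
    synthesize: Callable[[str], bytes],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Dict[int, bytes], Dict[int, str]]:
    """Generate every turn; return (audio by index, error detail by index)."""
    audio: Dict[int, bytes] = {}
    failed: Dict[int, str] = {}
    for index, text in enumerate(turns):
        for attempt in range(1, max_attempts + 1):
            try:
                audio[index] = synthesize(text)
                break
            except TransientError as exc:
                if attempt == max_attempts:
                    failed[index] = str(exc)                 # retries exhausted
                else:
                    sleep(base_delay * 2 ** (attempt - 1))   # exponential backoff
            except Exception as exc:
                failed[index] = str(exc)                     # permanent: no retry
                break
        # Either way, the loop moves on to the next turn.
    return audio, failed
```

Because results are stored per turn as they arrive, cancelling partway through would still leave the finished audio in `audio`.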

Now: Generating turn 38 of 50
ETA: 00:42
Elapsed: 02:14
Status: 2 retries · 0 failed

04 · Post-Production

Regenerate one line. Compare takes. Move on.

Each turn has a play button, status badges, and a flag toggle. Tap a turn to start playback from there. Play from Here advances through the rest of the script automatically.

Swipe right on any turn to regenerate. The new audio saves as a variant — your prior takes are never overwritten. Open the variant picker to compare side-by-side and pick the best.
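One simple way to model "takes are never overwritten" is an append-only list per turn plus a pointer to the chosen take. The `TurnAudio` type below is a hypothetical sketch of that idea; whether a fresh take becomes the selected one automatically is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TurnAudio:
    takes: List[bytes] = field(default_factory=list)  # append-only history
    selected: int = -1                                 # index of the chosen take

    def add_take(self, audio: bytes) -> int:
        """Store a regenerated take alongside the earlier ones."""
        self.takes.append(audio)
        self.selected = len(self.takes) - 1  # assumption: newest take is selected
        return self.selected

    def select(self, index: int) -> None:
        """Pick a take after comparing variants."""
        if not 0 <= index < len(self.takes):
            raise IndexError("no such take")
        self.selected = index

    @property
    def current(self) -> bytes:
        return self.takes[self.selected]
```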

Mark music or SFX as underlay to play it under dialogue. LUFS-aware auto-ducking measures both tracks and computes the right level on its own. Or set ducking depth, attack, and release by hand.
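To give a feel for what a loudness-aware ducking level might look like, the sketch below computes a gain that places the underlay a fixed margin below the dialogue. The 12 LU margin and the function names are assumptions chosen for illustration; the actual flexVox ducking math is not described in this page.

```python
def duck_gain_db(dialogue_lufs: float, underlay_lufs: float, margin_lu: float = 12.0) -> float:
    """Gain (in dB, never positive) that puts the underlay margin_lu below dialogue."""
    target_lufs = dialogue_lufs - margin_lu
    # Only attenuate: an underlay that is already quiet enough is left alone.
    return min(0.0, target_lufs - underlay_lufs)


def db_to_linear(gain_db: float) -> float:
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)
```

With dialogue measured at -18 LUFS and music at -14 LUFS, the sketch asks for -16 dB on the music so it sits at -30 LUFS under the voices.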

ALEX

You hear that? The whole city is on a delay.

00:00.0 → 00:03.4

JORDAN

[whispers] Three seconds behind the lightning.

00:03.6 → 00:07.1 · 2 takes

SFX

distant thunder · @underlay · volume 0.40

00:00.0 → 00:09.0

ALEX

[laughs] Promise me we'll never count down together.

00:07.3 → 00:11.0 · RECAST

05 · Export

Mix, normalize, share.

The Export tab is a live teleprompter. Each turn lists with a timestamp and a speaker badge; the active turn highlights and auto-scrolls into view. When alignment data is available, words highlight in real time — bold and brand-colored as they're spoken.
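Word-level highlighting comes down to finding which aligned word contains the current playback time. A minimal sketch using binary search follows; the `(start, end, text)` tuple layout is an assumed shape for the alignment data, not a documented format.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Word = Tuple[float, float, str]  # (start seconds, end seconds, text), sorted by start


def active_word(words: List[Word], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None in a gap."""
    starts = [word[0] for word in words]
    i = bisect_right(starts, t) - 1      # last word that starts at or before t
    if i >= 0 and t < words[i][1]:
        return i
    return None                          # before the first word or between words
```

The same lookup works one level up for turns, using each turn's start and end timestamps.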

Press Cmd+E or tap Export. Pick a platform preset: Apple Podcasts (−16 LUFS), Spotify (−14 LUFS), YouTube (−14 LUFS), Broadcast (−23 LUFS), or Custom with a slider from −30 to −6 LUFS. Transcript exports (SRT, VTT, JSON, plain text) live in the same sheet. Share via the iOS share sheet.
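In the simplest view, normalizing to a preset means applying the difference between the target and the measured integrated loudness as gain. The sketch below uses the preset targets listed above; the function names are hypothetical, and a real exporter would also need to handle peak limiting, which is omitted here.

```python
# Targets in LUFS, taken from the presets listed above.
PRESETS = {
    "Apple Podcasts": -16.0,
    "Spotify": -14.0,
    "YouTube": -14.0,
    "Broadcast": -23.0,
}

CUSTOM_MIN_LUFS = -30.0
CUSTOM_MAX_LUFS = -6.0


def custom_target(value: float) -> float:
    """Clamp a custom target to the slider range."""
    return max(CUSTOM_MIN_LUFS, min(CUSTOM_MAX_LUFS, value))


def normalization_gain_db(measured_lufs: float, target_lufs: float) -> float:
    """Gain in dB that moves the mix from its measured loudness to the target."""
    return target_lufs - measured_lufs
```

A mix measured at -19.5 LUFS needs +3.5 dB to reach the Apple Podcasts target and -3.5 dB to reach the Broadcast target.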

M4A · AAC · normalized · ready for any host

Apple Podcasts −16 LUFS

Optimal loudness for Apple Podcasts

Spotify −14 LUFS

Matches Spotify normalization

YouTube −14 LUFS

Loud, clear, social-friendly

Broadcast −23 LUFS

EBU R128 broadcast spec

Custom −30 to −6 LUFS

Slider to any delivery target

On air

That's the whole pipeline.

Five steps, three tabs, one M4A. Download and walk through it yourself — demo mode works without any account.
