01 · Script

Paste your script. Confirm the lines we weren't sure about.

The editor uses a serif typeface and generous spacing for a calm, literary writing feel. Type or paste any dialogue script. The parser detects speakers in several formats:

  • HOST: Welcome to the show.
  • [Host] Welcome to the show.
  • (Host) Welcome to the show.
  • A standalone name on its own line, dialogue beneath.

Every detected attribution gets a confidence score. Low-confidence lines are highlighted; the rest is already done. Batch-assign unreviewed turns or merge duplicate speakers with one tap.
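To make the four formats and the confidence idea concrete, here is a minimal sketch of how such a detector could work. It is not flexVox's parser: the regular expressions, the `Turn` type, and every confidence value are assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns and confidence values, for illustration only.
INLINE_PATTERNS = [
    (re.compile(r"^([A-Za-z][\w .'-]*):\s+(.+)$"), 0.95),  # HOST: text
    (re.compile(r"^\[([^\]:]+)\]\s+(.+)$"), 0.85),         # [Host] text
    (re.compile(r"^\(([^)]+)\)\s+(.+)$"), 0.75),           # (Host) text
]
# A short capitalized line with no sentence punctuation may be a standalone name.
STANDALONE = re.compile(r"^[A-Z][\w '-]{0,30}$")


@dataclass
class Turn:
    speaker: str
    text: str
    confidence: float


def parse(script: str) -> list:
    """Return one Turn per detected attribution; unmatched lines are skipped."""
    lines = [ln.strip() for ln in script.splitlines()]
    turns = []
    i = 0
    while i < len(lines):
        line = lines[i]
        matched = False
        for pattern, conf in INLINE_PATTERNS:
            m = pattern.match(line)
            if m:
                turns.append(Turn(m.group(1).strip(), m.group(2).strip(), conf))
                matched = True
                break
        has_next = i + 1 < len(lines) and bool(lines[i + 1])
        if not matched and has_next and STANDALONE.match(line):
            # Name on its own line, dialogue beneath: the weakest signal.
            turns.append(Turn(line, lines[i + 1], 0.5))
            i += 1  # the next line was consumed as dialogue
        i += 1
    return turns


def needs_review(turns, threshold=0.7):
    """Turns below the (assumed) threshold are the ones to highlight."""
    return [t for t in turns if t.confidence < threshold]
```

In this sketch only the standalone-name format falls below the review threshold, so a script written entirely as `HOST: ...` lines would need no confirmation at all.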

[SCENE: cold open]

ALEX: You hear that?

ALEX: The whole city is on a delay.

JORDAN: [whispers] Three seconds behind the lightning.

[SFX: distant thunder (3s) @underlay volume=0.4]

What the parser sees
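As a rough illustration of what a parser could extract from the SFX line in the sample, the sketch below splits it into a prompt, a duration, an underlay flag, and a volume. The tag grammar is inferred from that single example, and the `SfxCue` type and the default volume of 1.0 are assumptions rather than documented flexVox behavior.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SfxCue:
    prompt: str
    duration_s: Optional[float]
    underlay: bool
    volume: float


TAG = re.compile(r"^\[SFX:\s*(?P<body>.+)\]$")
DURATION = re.compile(r"\((\d+(?:\.\d+)?)s\)")
VOLUME = re.compile(r"\bvolume=(\d+(?:\.\d+)?)")


def parse_sfx(line: str) -> Optional[SfxCue]:
    """Parse '[SFX: prompt (Ns) @underlay volume=X]'; return None for other lines."""
    m = TAG.match(line.strip())
    if not m:
        return None
    body = m.group("body")

    duration = None
    d = DURATION.search(body)
    if d:
        duration = float(d.group(1))
        body = body.replace(d.group(0), " ")

    v = VOLUME.search(body)
    volume = float(v.group(1)) if v else 1.0  # assumed default

    words = body.split()
    underlay = "@underlay" in words
    # Whatever is not a flag or a key=value option is treated as the prompt.
    prompt = " ".join(
        w for w in words if w != "@underlay" and not w.startswith("volume=")
    )
    return SfxCue(prompt, duration, underlay, volume)
```

Applied to the thunder line above, this yields the prompt "distant thunder", a 3-second duration, underlay enabled, and a volume of 0.4.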
02 · Voice Mapping

Give every character a distinct voice.

Open the Voice Library to search the full ElevenLabs catalog, preview voices instantly, and assign them. While a casting session is open, the library auto-advances to the next unvoiced speaker as you go, and a counter ("3 of 5 cast") sits at the top.

Or tap Auto-Cast All and let flexVox pick distinct voices for your whole cast based on each speaker's role in the script. Manually assigned voices are preserved.
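The two guarantees described here, distinct voices and untouched manual picks, can be sketched in a few lines. This is an illustration only: it hands out voices in catalog order and leaves out the role-based matching entirely, and the function names and the third voice name are hypothetical.

```python
def auto_cast(speakers, catalog, manual):
    """Give each uncast speaker a voice nobody else has; never touch manual picks."""
    cast = dict(manual)
    used = set(cast.values())
    available = [voice for voice in catalog if voice not in used]
    for speaker in speakers:
        if speaker in cast:
            continue  # manually assigned, preserved as-is
        if not available:
            break  # catalog exhausted: remaining speakers stay uncast
        cast[speaker] = available.pop(0)
    return cast


def next_unvoiced(speakers, cast):
    """The speaker a casting session would advance to, or None when done."""
    return next((s for s in speakers if s not in cast), None)


def progress(speakers, cast):
    """Counter text in the style of '3 of 5 cast'."""
    done = sum(1 for s in speakers if s in cast)
    return f"{done} of {len(speakers)} cast"


speakers = ["ALEX", "JORDAN", "SAM"]
manual = {"JORDAN": "Briar"}
cast = auto_cast(speakers, ["Briar", "Roan", "Wren"], manual)
```

A real implementation would rank candidate voices against each speaker's role before assigning; the sketch only shows the bookkeeping that keeps voices distinct.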

Open Fine-tune Voice per speaker to dial in stability, similarity, style, and speed. The fine-tune section is collapsed by default so the main view stays clean.

Cast all speakers · 2 of 2 cast
ALEX · warm, considered, narrator-leaning
Briar · multilingual v2
JORDAN · hushed, close-mic, slight rasp
Roan · turbo v2.5
stability 0.45 · style 0.30 · speed 0.97
03 · Generation

Generate dialogue, SFX, and music — in one pass.

The Ready screen is one summary: turn count, speaker count, a background music toggle, and a generation mode picker. Nothing hidden behind disclosure groups. Tap Generate.

flexVox calls ElevenLabs for every line — speech with bidirectional voice context for natural continuity, SFX from your tags, music from your prompts. Background music (when enabled) generates in a second pass sized to your finished episode duration.

When something fails, generation doesn't stop. Failed turns get a badge and an error detail; network errors and rate limits are retried automatically. Cancel anytime — your already-generated audio is preserved.
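The behavior described above (retry transient errors, record permanent ones, keep going) is a familiar pattern. Below is a minimal sketch under stated assumptions: `synthesize` stands in for whatever call produces one turn's audio, `TransientError` is a hypothetical marker for network errors and rate limits, and the retry count and backoff delays are arbitrary.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (network error, rate limit)."""


def generate_all(turns, synthesize, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Generate every turn; a failure is recorded and the loop moves on."""
    audio, failed, retries = {}, {}, 0
    for index, turn in enumerate(turns):
        for attempt in range(max_retries + 1):
            try:
                audio[index] = synthesize(turn)
                break
            except TransientError as err:
                if attempt == max_retries:
                    failed[index] = str(err)  # retries exhausted: badge it
                else:
                    retries += 1
                    sleep(base_delay * 2 ** attempt)  # exponential backoff
            except Exception as err:
                failed[index] = str(err)  # not retryable: badge it, move on
                break
    return audio, failed, retries
```

Because results are keyed by turn index and written as each turn succeeds, whatever is in `audio` stays valid if the loop is interrupted, which is the property that lets a cancel preserve finished work.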

  • Now: Generating turn 38 of 50
  • ETA: 00:42
  • Elapsed: 02:14
  • Status: 2 retries · 0 failed
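One plausible way to derive an ETA like the one in this readout is a linear estimate: average time per finished turn, multiplied by the turns remaining. The sketch below is an assumption about the method, not flexVox's actual estimator, and it treats the counter as meaning 38 turns are done.

```python
def eta_seconds(elapsed_s, done, total):
    """Linear estimate: average seconds per finished turn times turns left."""
    if done <= 0:
        return None  # nothing finished yet, so there is no rate to extrapolate
    return elapsed_s / done * (total - done)


def mmss(seconds):
    """Format a duration as MM:SS, rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


elapsed = 2 * 60 + 14                   # 02:14 elapsed
remaining = eta_seconds(elapsed, 38, 50)
print(mmss(remaining))                  # prints 00:42
```

With 134 seconds elapsed over 38 turns, the average is about 3.5 seconds per turn, and 12 remaining turns come to roughly 42 seconds.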
04 · Post-Production

Regenerate one line. Compare takes. Move on.

Each turn has a play button, status badges, and a flag toggle. Tap a turn to start playback from there. Play from Here advances through the rest of the script automatically.

Swipe right on any turn to regenerate. The new audio saves as a variant, so your prior takes are never overwritten. Open the variant picker to compare takes side by side and pick the best one.
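The "never overwritten" guarantee amounts to an append-only list of takes with a pointer to the chosen one. A minimal sketch follows; the `TurnTakes` type and file names are hypothetical, and auto-selecting the newest take is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TurnTakes:
    takes: list = field(default_factory=list)  # append-only history
    selected: int = -1                         # index of the chosen take

    def add_take(self, audio_path):
        """Regenerating appends; earlier takes are never replaced."""
        self.takes.append(audio_path)
        self.selected = len(self.takes) - 1    # assumed: newest take is selected
        return self.selected

    def pick(self, index):
        """Choose a take after comparing them."""
        if not 0 <= index < len(self.takes):
            raise IndexError(f"no take {index}")
        self.selected = index

    @property
    def current(self):
        return self.takes[self.selected] if self.takes else None


jordan = TurnTakes()
jordan.add_take("jordan_take1.m4a")
jordan.add_take("jordan_take2.m4a")  # regenerate: the first take is still there
jordan.pick(0)                       # prefer the original after comparing
```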

Mark music or SFX as underlay to play it under dialogue. LUFS-aware auto-ducking measures both tracks and computes the right level on its own. Or set ducking depth, attack, and release by hand.
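The idea behind loudness-aware ducking can be shown with simple arithmetic: measure both tracks, then attenuate the underlay until it sits a fixed distance below the dialogue. The sketch below assumes the loudness values have already been measured in LUFS; the 12 LU gap is a hypothetical default, not a flexVox setting, and attack and release smoothing are left out.

```python
def duck_gain_db(dialogue_lufs, underlay_lufs, gap_lu=12.0):
    """Gain (in dB, never positive) that puts the underlay gap_lu below dialogue."""
    target = dialogue_lufs - gap_lu
    return min(0.0, target - underlay_lufs)  # only ever turn the underlay down


def db_to_linear(gain_db):
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# Dialogue measured at -16 LUFS, thunder underlay at -20 LUFS:
gain = duck_gain_db(-16.0, -20.0)  # -8.0 dB
scale = db_to_linear(gain)         # multiply underlay samples by this while ducked
```

An underlay that is already quiet enough gets a gain of 0 dB and is left alone. Setting ducking depth by hand would replace the computed value with a fixed one.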

ALEX

You hear that? The whole city is on a delay.

00:00.0 → 00:03.4
JORDAN

[whispers] Three seconds behind the lightning.

00:03.6 → 00:07.1 · 2 takes
SFX

distant thunder · @underlay · volume 0.40

00:00.0 → 00:09.0
ALEX

[laughs] Promise me we'll never count down together.

00:07.3 → 00:11.0 · RECAST
05 · Export

Mix, normalize, share.

The Export tab is a live teleprompter. Each turn is listed with a timestamp and a speaker badge; the active turn highlights and auto-scrolls into view. When alignment data is available, words highlight in real time, turning bold and brand-colored as they're spoken.

Press Cmd+E or tap Export. Pick a platform preset: Apple Podcasts (−16 LUFS), Spotify (−14 LUFS), YouTube (−14 LUFS), Broadcast (−23 LUFS), or Custom, with a slider from −30 to −6 LUFS. Transcript exports (SRT, VTT, JSON, plain text) live in the same sheet. Share via the iOS share sheet.
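For a sense of what a transcript export involves, here is a minimal sketch that turns timed turns into SRT cues. The `HH:MM:SS,mmm` timestamp layout is standard SRT; the tuple layout and the "SPEAKER: text" cue text are assumptions, not flexVox's actual output.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(turns):
    """turns: iterable of (speaker, text, start_s, end_s) tuples."""
    cues = []
    for number, (speaker, text, start, end) in enumerate(turns, start=1):
        cues.append(
            f"{number}\n{srt_time(start)} --> {srt_time(end)}\n{speaker}: {text}\n"
        )
    return "\n".join(cues)  # a blank line separates cues


timeline = [
    ("ALEX", "You hear that? The whole city is on a delay.", 0.0, 3.4),
    ("JORDAN", "[whispers] Three seconds behind the lightning.", 3.6, 7.1),
]
srt = to_srt(timeline)
```

VTT differs mainly in its `WEBVTT` header and the use of a period instead of a comma before the milliseconds, so the same turn data can feed both formats.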

M4A · AAC · normalized · ready for any host

Apple Podcasts · −16 LUFS · Optimal loudness for Apple Podcasts
Spotify · −14 LUFS · Matches Spotify normalization
YouTube · −14 LUFS · Loud, clear, social-friendly
Broadcast · −23 LUFS · EBU R128 broadcast spec
Custom · −30 to −6 LUFS · Slider to any delivery target
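Each preset above boils down to a target integrated loudness, and normalizing to it is, to a first approximation, a single gain offset. The sketch below uses the preset values from this page; the function names are hypothetical, the mix's loudness is assumed to be measured already, and a real exporter would also need peak limiting, which is omitted here.

```python
PRESETS_LUFS = {
    "Apple Podcasts": -16.0,
    "Spotify": -14.0,
    "YouTube": -14.0,
    "Broadcast": -23.0,
}
CUSTOM_RANGE = (-30.0, -6.0)  # slider limits


def target_lufs(preset, custom=None):
    """Resolve a preset name (or a Custom slider value) to a target in LUFS."""
    if preset == "Custom":
        if custom is None:
            raise ValueError("Custom needs a slider value")
        low, high = CUSTOM_RANGE
        return max(low, min(high, custom))  # clamp to the slider range
    return PRESETS_LUFS[preset]


def normalization_gain_db(measured_lufs, target):
    """Gain in dB that moves the mix from its measured loudness to the target."""
    return target - measured_lufs


# A mix measured at -19.5 LUFS, exported for Apple Podcasts:
gain = normalization_gain_db(-19.5, target_lufs("Apple Podcasts"))  # 3.5 dB
```

A positive gain raises the level, which is where limiting matters: boosting a quiet mix by several dB can push peaks past full scale.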
On air

That's the whole pipeline.

Five steps, three tabs, one M4A. Download and walk through it yourself — demo mode works without any account.