FAQ

Real questions.
Honest answers.

The short version: flexVox is an iOS app, it uses ElevenLabs for voice generation, demo mode works without an account, your API keys live in the iOS Keychain, and nothing leaves your device unless you ask it to.

General

What is flexVox?

flexVox is an iOS app that turns multi-speaker scripts into produced podcast audio. You paste a dialogue script, assign AI voices to each character, generate speech and sound effects, then mix and export the result — all on iPhone or iPad.

Do I need an ElevenLabs account?

For real audio generation, yes. You'll need an ElevenLabs API key (free tier available at elevenlabs.io). flexVox stores the key in the iOS Keychain. A built-in demo mode works without any account — it generates silent placeholder audio so you can explore every screen and feature first.

What devices does flexVox support?

iPhone and iPad running iOS 26.4 or later.

Does flexVox work offline?

Script import, parsing, and review work offline. Audio generation requires an internet connection to reach the ElevenLabs API. Demo mode works fully offline.

Scripts & parsing

What script formats are supported?

Colon format (HOST: Welcome), bracket format ([Host] Welcome), parenthesis format ((Host) Welcome), and a standalone name on its own line followed by dialogue beneath. SFX and music tags use [SFX: prompt (duration)] and [Music: prompt (duration)], with optional modifiers @underlay, volume=, loop, and influence=.
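
As a rough illustration of how the three inline speaker formats differ, here is a minimal Python sketch. It is not flexVox's parser: the regular expressions, the function name parse_speaker_line, and the rule that [SFX: ...] and [Music: ...] lines are never treated as speakers are all assumptions made for this example. The standalone-name format needs look-ahead across lines and is left out.

```python
import re
from typing import Optional, Tuple

# Assumption: cue tags are recognised first so "[SFX: ...]" is never read as a speaker.
CUE_TAG = re.compile(r"^\[(?:SFX|Music):", re.IGNORECASE)

# Hypothetical patterns, one per inline format described above.
SPEAKER_PATTERNS = [
    re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.+)$"),          # [Host] Welcome
    re.compile(r"^\((?P<speaker>[^)]+)\)\s*(?P<text>.+)$"),           # (Host) Welcome
    re.compile(r"^(?P<speaker>[A-Za-z][\w .'-]*):\s*(?P<text>.+)$"),  # HOST: Welcome
]


def parse_speaker_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (speaker, dialogue) for a single line, or None if no format matches."""
    line = line.strip()
    if CUE_TAG.match(line):
        return None
    for pattern in SPEAKER_PATTERNS:
        match = pattern.match(line)
        if match:
            return match.group("speaker").strip(), match.group("text").strip()
    return None
```

With these assumed patterns, "HOST: Welcome", "[Host] Welcome", and "(Host) Welcome" each yield a speaker plus the text "Welcome", while "[SFX: door slam (2s)]" yields None.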

What happens if the parser gets a speaker wrong?

Every attribution has a confidence score. Low-confidence turns are highlighted for review. Tap any turn to reassign it; you can also batch-assign unreviewed turns or merge duplicate speakers.

Can I edit the script after parsing?

Yes. The Script Edit sheet lets you modify the raw text and re-parse — but re-parsing replaces all existing speaker assignments and deletes any generated audio for the project. flexVox warns before doing this.

Voices & generation

How many voices can I use in one project?

No hard limit on the number of speakers. Each speaker gets its own voice profile. Dialogue batch mode sends up to 10 unique voices per API call; larger casts are split across batches automatically.
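
To make the 10-voice limit concrete, here is one way such a split could work, sketched in Python. This is an assumption about the strategy, not a description of flexVox's code: it walks the turns in script order and starts a new batch whenever the next turn would introduce an eleventh unique voice. The name split_into_batches and the (voice_id, text) tuple shape are hypothetical.

```python
from typing import List, Tuple

MAX_VOICES_PER_CALL = 10  # the per-call limit stated in the answer above

Turn = Tuple[str, str]  # (voice_id, text); a hypothetical shape for this sketch


def split_into_batches(turns: List[Turn], max_voices: int = MAX_VOICES_PER_CALL) -> List[List[Turn]]:
    """Group turns in order so that no batch uses more than max_voices unique voices."""
    batches: List[List[Turn]] = []
    current: List[Turn] = []
    voices = set()
    for voice_id, text in turns:
        # Close the batch if this turn would add one voice too many.
        if voice_id not in voices and len(voices) == max_voices:
            batches.append(current)
            current, voices = [], set()
        current.append((voice_id, text))
        voices.add(voice_id)
    if current:
        batches.append(current)
    return batches
```

Under this strategy a 12-speaker script where each speaker talks once becomes two batches (10 turns, then 2), and a 3-speaker script stays in one batch no matter how many turns it has.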

Can I fine-tune how a voice sounds?

Yes. Each voice profile has independent sliders for stability, similarity boost, style exaggeration, and speed, plus a speaker boost toggle. v3 models use a simplified control set.

What are expression tags?

ElevenLabs v3 audio directives that control how a voice performs a line — like [happy], [whispers], [laughs], [sighs]. Place them inline in dialogue text. Stack tags for complex emotions ([angry][laughing]). Write custom descriptive tags like [trying to sound brave] — v3 interprets natural language in brackets.
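
If you prepare scripts outside the app, it can help to see which tags a line carries. The Python sketch below is only an illustration: the function name split_expression_tags and the rule "anything in square brackets is a tag" are assumptions, and it should be applied to dialogue text only, after any [Host]-style speaker label has been removed.

```python
import re
from typing import List, Tuple

# Assumption: every bracketed span inside dialogue text is an expression tag.
EXPRESSION_TAG = re.compile(r"\[([^\[\]]+)\]")


def split_expression_tags(dialogue: str) -> Tuple[List[str], str]:
    """Return the tags found in a line and the line with the tags removed."""
    tags = EXPRESSION_TAG.findall(dialogue)
    spoken = EXPRESSION_TAG.sub("", dialogue)
    spoken = re.sub(r"\s{2,}", " ", spoken).strip()  # tidy the gaps left behind
    return tags, spoken
```

For "[angry][laughing] You did what?" this returns the tags angry and laughing plus the spoken text "You did what?".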

Can I add expression tags to an existing script?

Yes. Use Enhance Expression Tags from the toolbar (or long-press a single turn). The app uses Apple Intelligence on-device when available, or falls back to your configured AI provider (OpenAI or Claude). Suggestions are context-aware: surrounding dialogue informs each tag. Review each suggestion before applying it.

What modifiers can I add to SFX and music tags?

@underlay plays the audio under dialogue instead of sequentially. volume=0.3 sets underlay volume (0–1). loop generates seamless looping audio. influence=0.8 controls how literally the AI interprets the prompt (0–1). Example: [SFX: rain on windows (5s) @underlay volume=0.3 loop].
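
The example tag above breaks down as sketched below in Python. This is a hedged illustration, not the app's parser: the CueTag class, the parse_cue_tag function, the assumption that durations are written as a number followed by "s", and the clamping of volume and influence to 0–1 are all choices made for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed grammar: [SFX|Music: prompt (Ns) modifiers...]
CUE_PATTERN = re.compile(
    r"^\[(?P<kind>SFX|Music):\s*(?P<prompt>.+?)\s*"
    r"\((?P<seconds>\d+(?:\.\d+)?)s\)(?P<mods>[^\]]*)\]$",
    re.IGNORECASE,
)


@dataclass
class CueTag:
    kind: str                          # "sfx" or "music"
    prompt: str
    seconds: float
    underlay: bool = False
    volume: Optional[float] = None     # 0-1, only meaningful with @underlay
    loop: bool = False
    influence: Optional[float] = None  # 0-1


def _clamp01(value: float) -> float:
    return min(1.0, max(0.0, value))


def parse_cue_tag(tag: str) -> Optional[CueTag]:
    """Parse one SFX/music tag; returns None if it does not fit the assumed grammar."""
    match = CUE_PATTERN.match(tag.strip())
    if not match:
        return None
    cue = CueTag(match.group("kind").lower(), match.group("prompt"), float(match.group("seconds")))
    for token in match.group("mods").split():
        if token == "@underlay":
            cue.underlay = True
        elif token == "loop":
            cue.loop = True
        elif token.startswith("volume="):
            cue.volume = _clamp01(float(token.split("=", 1)[1]))  # raises ValueError on bad numbers
        elif token.startswith("influence="):
            cue.influence = _clamp01(float(token.split("=", 1)[1]))
    return cue
```

Parsing [SFX: rain on windows (5s) @underlay volume=0.3 loop] with this sketch gives kind sfx, prompt "rain on windows", 5 seconds, underlay on, volume 0.3, loop on, and no influence value.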

What if generation fails or I cancel it?

Audio that was already generated is preserved. Resume picks up where it stopped, skipping completed turns. Network errors and rate limits are retried automatically with exponential backoff.
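
For readers unfamiliar with the term, exponential backoff means waiting roughly twice as long after each failed attempt. The Python sketch below shows the general pattern only; the attempt count, delays, jitter, and the RetryableError class are assumptions for illustration and do not reflect flexVox's actual settings.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a network error or rate-limit response."""


def with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call operation(), retrying on RetryableError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter is there so the waiting can be swapped out, for example with a function that just records the delays during testing.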

Post-production

Can I regenerate just one line?

Yes. Swipe the turn or use the context menu to regenerate. The new audio saves as a variant — your previous take is not overwritten.

What are variants?

Multiple takes per turn. When you regenerate, the new take is added alongside the existing ones. Play each, pick the best, delete the rest.

How do I control pauses between lines?

Two levels: a project-wide default pause set in post-production, and per-turn custom pauses set with the Adjust Pauses toggle and individual sliders (0–10 seconds). Custom pauses override the default for that turn.
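
The override rule can be written as a tiny function. This Python sketch assumes a turn with no custom pause is represented as None and that custom values are kept inside the 0–10 second slider range; the name effective_pause is hypothetical.

```python
from typing import Optional

MAX_CUSTOM_PAUSE = 10.0  # upper end of the per-turn slider, in seconds


def effective_pause(default_pause: float, custom_pause: Optional[float] = None) -> float:
    """Return the pause to use for one turn: the custom value if set, else the default."""
    if custom_pause is None:
        return default_pause
    return min(MAX_CUSTOM_PAUSE, max(0.0, custom_pause))
```

Note that a custom pause of 0 still counts as set in this sketch, so it overrides the default rather than falling back to it.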

What format is the exported audio?

M4A (AAC). Normalized to your chosen export preset's LUFS target.

Shows & series

What is a show / production?

A container for a series of related episodes. It stores a persistent cast bible with character profiles and voice assignments, a default format and tone, audio identity (intro / outro music), and an episode template. New episodes inherit all of it.

Do I have to use shows?

No. Shows are entirely optional. Standalone projects work exactly as they did before shows were introduced. You can promote any project to a series later by long-pressing it and choosing "Create Show from This."

Can I reuse cast members across different shows?

Cast members are stored per production. To use the same character in a different show, add them again in that show's cast bible. Voice assignments (ElevenLabs voice IDs) are portable since they reference the same voice catalog.

AI script writing

Which AI services does flexVox support for script writing?

OpenAI (GPT-4o) and Anthropic (Claude). Set up your API key in Settings under AI Script Writing. Switch between providers at any time.

Can I use my own AI tool instead?

Yes. The AI Prompt Sheet has a Copy to Clipboard option. Paste the copied prompt into ChatGPT, Claude, or any other tool, then paste the generated script back into flexVox.

How much does AI script generation cost?

The app shows an estimated cost before generation based on your prompt length and selected provider. Costs depend on your provider's pricing. flexVox itself charges nothing for this feature — you pay your AI provider directly.

Privacy & security

How is my API key stored?

In the iOS Keychain — the same secure storage the operating system uses for passwords. It is never written to a plain file or sent anywhere other than the ElevenLabs API.

Does flexVox collect any data?

No analytics frameworks, tracking SDKs, or server-side data collection. Your scripts, audio files, and projects stay on your device. The only network calls are to ElevenLabs (audio) and your AI provider of choice (script writing).