API keys for audiobook generation

Folio  ·  setup guide

Folio's audiobook feature lets you turn any chapter into spoken audio using Google Cloud Text-to-Speech or ElevenLabs. Both require a free API key from the provider. This guide walks through getting one for each, in plain steps, with no jargon.


Why does Folio need an API key?

Generating speech from text isn't free for the provider: somebody has to run the AI models that turn your manuscript into a human voice. Both Google and ElevenLabs offer free tiers (Google's runs to millions of characters per month; ElevenLabs' covers roughly one short chapter), but they need a key to track which account is using their service.

Folio doesn't have its own TTS service. Instead, you bring your own key, and Folio sends your manuscript text to the provider on your behalf. The key is stored in your browser's local storage (and synced to your private Folio settings if you're signed in); Folio doesn't log it when forwarding a request.

Google Cloud Text-to-Speech

Google's TTS produces clean, professional voices. The free tier is huge (millions of characters per month). Setup takes about 5–10 minutes the first time.

Step 1 — Sign in to Google Cloud Console

Go to console.cloud.google.com and sign in with your Google account. If you've never used Google Cloud before, it'll ask you to accept the terms. Accept them.

Step 2 — Create a project

At the top of the page there's a project picker (it says “Select a project” or shows your current project name). Click it, then click New Project. Name it whatever you like — “Folio Audio” works. You can leave the organization field as “No organization.” Click Create.

Wait about 30 seconds for the project to spin up, then make sure it's selected in the project picker before continuing.

Step 3 — Enable the Text-to-Speech API

In the search bar at the top, type Text-to-Speech and hit enter. Click the result called Cloud Text-to-Speech API. On the next page, click the blue Enable button.

If it asks you to set up a billing account, don't panic: you have to add a card for verification, but you'll stay in the free tier as long as you're under the monthly quota. (If you want an email as soon as any charges start to accrue, set a budget alert in the console's Billing section.)

Step 4 — Create the API key

In the left sidebar, click APIs & Services → Credentials. Or use the search bar to jump to “Credentials.”

At the top, click + CREATE CREDENTIALS → API key. A dialog pops up showing your new key, which looks like AIzaSyA... followed by random characters. Copy this. Don't close the dialog yet.

Step 5 — Restrict the key (important)

Right below the key, click Edit API key (or hit close and click your new key in the credentials list). On the edit page:

  1. Under API restrictions, select Restrict key.
  2. From the dropdown, check only Cloud Text-to-Speech API.
  3. Click Save at the bottom.

This means even if someone got hold of your key, they couldn't use it for anything other than text-to-speech — no surprise charges for services you don't use.
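
If you'd like to see what a text-to-speech call with this key looks like outside Folio, here is a minimal Python sketch. It only builds the request and decodes a reply; the helper names are made up for illustration, the voice settings are placeholder choices, and this is not a description of how Folio itself makes the call.

```python
import base64
import json
import urllib.request

GOOGLE_TTS_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_google_tts_request(api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Cloud Text-to-Speech synthesize request."""
    body = {
        "input": {"text": text},
        # Illustrative settings: US English, default voice, MP3 output.
        "voice": {"languageCode": "en-US"},
        "audioConfig": {"audioEncoding": "MP3"},
    }
    return urllib.request.Request(
        GOOGLE_TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The key travels in a header rather than in the URL.
            "X-Goog-Api-Key": api_key,
        },
        method="POST",
    )


def decode_audio(response_json: dict) -> bytes:
    """The API returns audio as base64 text in the 'audioContent' field."""
    return base64.b64decode(response_json["audioContent"])
```

Sending the request with urllib.request.urlopen(...) should return JSON containing audioContent. A key that has been revoked, or restricted to a different API, should produce an HTTP error instead.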

Step 6 — Paste it into Folio

Back in Folio, open the audiobook panel (the 🎙 button in the export drawer). Pick Google TTS as the provider. Paste your key into the Google Cloud TTS API Key field. Folio will remember it in your browser; you won't have to enter it again on this device.

Where the key lives. Your API key is stored in your browser's local storage and synced to your Folio account if you're signed in. When you generate audio, it's sent to Google along with your text; Folio's worker forwards the request without logging or storing the key.

Free tier — what fits?

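Google's free tier resets monthly. The allowance is counted in characters and runs into the millions; the exact figure depends on the voice tier, so check Google's pricing page for the current numbers.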

For most authors generating audiobooks, this is more than enough. Past the limit, you pay around $4–$16 per million characters depending on the voice tier.
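
To get a feel for what going over the limit would cost, the arithmetic can be sketched in a few lines of Python. The $4 and $16 rates come from this guide. The free allowance is passed in as a parameter, and the 1,000,000 figure in the example is only a placeholder, so look up the real number for your voice tier.

```python
def overage_cost_usd(chars_used: int, free_chars: int, usd_per_million: float) -> float:
    """Estimate the monthly charge for characters beyond the free allowance."""
    # Only characters above the free allowance are billable.
    billable = max(0, chars_used - free_chars)
    return billable / 1_000_000 * usd_per_million


# Placeholder scenario: 1.5 million characters used, a hypothetical
# 1 million free, billed at the guide's upper rate of $16 per million.
example_cost = overage_cost_usd(1_500_000, 1_000_000, 16.0)
```

In that scenario the half-million billable characters come to $8. Staying under the allowance costs nothing.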

ElevenLabs

ElevenLabs produces some of the most natural AI voices currently available — if you've heard an AI narration that sounded almost human, it was probably ElevenLabs. The free tier is smaller than Google's, but the voice quality is the reason most authors pick it for fiction.

Step 1 — Sign up

Go to elevenlabs.io and click Sign Up at the top right. You can use your Google account, or an email and password.

Step 2 — Find your API key

Once signed in, click your profile icon at the top right → API Keys. Or go directly to elevenlabs.io/app/settings/api-keys.

Click Create API Key. Give it a name like “Folio.” You can leave permissions at the defaults. Click Create.

The key appears once, starting with sk_... Copy it now — ElevenLabs won't show it again. If you lose it, you'll have to delete the key and create a new one.
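
Since the two providers' keys look different, a quick check can catch a key pasted into the wrong field. The sketch below relies only on the prefixes mentioned in this guide (AIza... for Google, sk_ for ElevenLabs). The function name is hypothetical, and a matching prefix doesn't prove the key is valid.

```python
from typing import Optional


def guess_provider(key: str) -> Optional[str]:
    """Guess which provider a pasted key belongs to, judging by its prefix."""
    key = key.strip()  # pasted keys often pick up stray whitespace
    if key.startswith("AIza"):
        return "google"
    if key.startswith("sk_"):
        return "elevenlabs"
    return None  # unrecognized: probably not a key from either provider
```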

Step 3 — Paste it into Folio

Open Folio's audiobook panel (the 🎙 button). Pick ElevenLabs as the provider. Paste your key into the ElevenLabs API Key field, then pick a model from the dropdown. (For fiction, this guide's recommendation is Multilingual v2.)

Free tier — what fits?

ElevenLabs' free tier gives you 10,000 characters per month — that's roughly 1,500 words, or one short chapter. Enough to test the voice quality and hear how your manuscript sounds, but not enough to generate a full book.

Paid tiers as of this writing:

Plan       Per month   Characters/month
Free       $0          10,000
Starter    $5          30,000 (~ short novella)
Creator    $22         100,000 (~ short book)
Pro        $99         500,000 (~ full novel)

For a single audiobook, one month of the Creator tier (short book) or the Pro tier (full novel) covers it; cancel after generation if you don't need ongoing capacity.
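
To work out which plan fits your manuscript, you can turn a word count into characters and compare it with the quotas above. This Python sketch uses the plan figures from the table and the guide's own rule of thumb (10,000 characters is roughly 1,500 words). Real manuscripts vary, so treat the result as an estimate.

```python
from typing import Optional

# (plan name, price per month in USD, characters per month), from the table above
PLANS = [
    ("Free", 0, 10_000),
    ("Starter", 5, 30_000),
    ("Creator", 22, 100_000),
    ("Pro", 99, 500_000),
]

# Rule of thumb from this guide: 10,000 characters is roughly 1,500 words.
CHARS_PER_WORD = 10_000 / 1_500


def estimate_characters(word_count: int) -> int:
    """Rough character count (spaces included) for a given word count."""
    return round(word_count * CHARS_PER_WORD)


def smallest_plan(word_count: int) -> Optional[str]:
    """Return the cheapest plan whose monthly quota covers the manuscript."""
    needed = estimate_characters(word_count)
    for name, _price, quota in PLANS:
        if needed <= quota:
            return name
    return None  # too long for one month on any listed plan
```

For example, smallest_plan(12_000) picks Creator, and smallest_plan(70_000) picks Pro. By this estimate, a manuscript much longer than about 75,000 words would need more than one month of Pro.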

Custom voices in ElevenLabs

One of ElevenLabs' best features: you can create a voice from a sample, then narrate your book in that voice. Useful for authors who want a consistent narrator across multiple books, who want their own voice as the narrator, or who want a voice that matches a specific character.

Two kinds of cloning

Instant Voice Clone: available on the Starter plan and above. You upload 1–5 minutes of clean audio (just speech, no music), and ElevenLabs creates a voice from it in seconds. Quality is decent but not professional-grade. The free tier doesn't include cloning.

Professional Voice Clone: available on the Creator plan and above. You upload 30+ minutes of high-quality audio (good microphone, no background noise, even pacing) and ElevenLabs trains a higher-quality clone over a few hours. With a good sample, the result can be very hard to tell apart from the real voice.

How to create one

  1. In ElevenLabs, go to Voices in the left sidebar.
  2. Click Add a new voice.
  3. Pick Instant Voice Clone or Professional Voice Clone based on your plan.
  4. Upload your audio sample(s). For Instant: 1–5 minutes is plenty. For Professional: aim for at least 30 minutes of clean studio-quality speech.
  5. Give the voice a name and a description. Save.

Using your custom voice in Folio

It just works. Once you've created a voice in ElevenLabs, go back to Folio's audiobook panel and reload the voice list (close and reopen the panel, or refresh the page). Your custom voice will appear in the dropdown alongside ElevenLabs' default voices, labelled with its category — e.g., MyVoice · cloned or MyVoice · professional.

Pick it like any other voice and generate. Folio talks to the same ElevenLabs API regardless of whether the voice is a default or a clone, so no extra setup is needed on the Folio side.
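
As a rough picture of why no extra setup is needed: ElevenLabs returns default and cloned voices in the same list, each with a category field. The Python sketch below builds the list request and formats labels in the style shown above. The helper names and sample data are invented for illustration, and how Folio actually builds its dropdown is an assumption.

```python
import urllib.request

VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request for the account's voice list."""
    return urllib.request.Request(
        VOICES_URL,
        headers={"xi-api-key": api_key},  # ElevenLabs reads the key from this header
        method="GET",
    )


def voice_labels(payload: dict) -> list:
    """Turn a voice-list response into 'Name · category' labels."""
    return [f"{v['name']} · {v['category']}" for v in payload.get("voices", [])]


# Invented sample of the response shape: one default voice, one clone.
sample = {
    "voices": [
        {"voice_id": "abc", "name": "Narrator", "category": "premade"},
        {"voice_id": "def", "name": "MyVoice", "category": "cloned"},
    ]
}
labels = voice_labels(sample)
```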

Tip on voice samples. If you're recording your own voice for a clone, pick a quiet room, sit close to the microphone, and read varied sentences (mix of dialogue, narration, questions, exclamations) so the clone learns your full range. A 30-minute audiobook recording at consistent volume is far more useful than 60 minutes of whispered or uneven audio.

How Folio handles your keys

Both keys are stored in your browser's local storage. When you're signed in, they also sync to your Folio user settings document so you can access them from another device; that document is private to your Google account.

When you generate audio, your manuscript text and the API key go to the provider's servers (Google or ElevenLabs) by way of Folio's worker, a thin proxy that adds CORS headers and forwards the request. It doesn't log keys, doesn't log text, and doesn't persist anything.
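
For the curious, "adds CORS headers" means something like the following. This is a generic Python illustration of the idea, not Folio's actual worker code. The function name and the example origin are hypothetical.

```python
def with_cors_headers(upstream_headers: dict, allowed_origin: str) -> dict:
    """Copy the provider's response headers and add the CORS ones a browser needs."""
    headers = dict(upstream_headers)  # copy, so the original is left untouched
    # Tell the browser which web page is allowed to read this response.
    headers["Access-Control-Allow-Origin"] = allowed_origin
    # Tell the browser which request headers the page may send.
    headers["Access-Control-Allow-Headers"] = "Content-Type, xi-api-key, X-Goog-Api-Key"
    return headers


upstream = {"Content-Type": "application/json"}
# "https://folio.example" is a made-up origin used only for this example.
proxied = with_cors_headers(upstream, "https://folio.example")
```

Nothing here is written to disk or to a log: the response is passed along with a couple of extra headers, which is what makes the proxy "thin."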

You can revoke either key at any time from the provider's dashboard without touching Folio — the key just stops working, and Folio will tell you the next time you try to generate audio.

Which one should I pick?

Quick decision matrix:

If you want…                                     Pick…
Free generation of a full book                   Google TTS (the free tier is huge).
The most natural-sounding voice for fiction      ElevenLabs Multilingual v2.
Your own voice as narrator                       ElevenLabs Creator plan + Professional Voice Clone.
Multiple voices per book (different characters)  ElevenLabs: create a voice per character. Folio narrates the whole chapter in one voice for now, but per-segment voice picking is on the roadmap.
To test the feature without spending             Either; both have free tiers.
Volume work (10+ books)                          Google for cost; ElevenLabs Pro for quality.

Most authors using Folio for fiction land on ElevenLabs for the voice quality, even at the cost of $5–$22 for a month here and there. For non-fiction or technical content where voice character matters less, Google's free tier is plenty.

Stuck somewhere in this guide? Email folio@jacobsiler.com and mention which step. Screenshots help a lot.