Text to Speech

Convert written text into natural-sounding speech using AI-powered voices in over 150 languages and dialects.

Text-to-Speech (TTS) is the core feature of Verbatik AI.

How It Works

  1. Navigate to Text to Speech from the sidebar.
  2. Type or paste your text into the editor.
  3. Select a voice from the voice selector.
  4. Adjust settings if needed (speed, pitch, style).
  5. Click Generate to create your audio.
  6. Preview, download, or share the result.

Voice Selection

Verbatik offers a rich library of voices:

Platform Voices

  • 1,700+ premium neural voices available on paid plans.
  • Filter by language, gender, or search by name.
  • Preview any voice before generating.
  • Voices cover English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, and 140+ more languages and dialects.

Custom Voices

  • Use your own cloned voices (see Voice Cloning).
  • Custom voices appear in a dedicated tab in the voice selector.
  • The custom voices tab includes Regular, HD, and Designed voice types.

Favorite Voices

  • Mark any voice as a favorite by clicking the heart icon.
  • Favorites appear in a dedicated tab for quick access.

Voice Settings

Depending on the voice type, you can adjust:

  • Speed — Control how fast or slow the voice speaks.
  • Pitch — Adjust the tone higher or lower.
  • Speaking Style — Some voices support styles like cheerful, sad, angry, whispering, newscast, and more.
  • SSML Controls — Advanced users can use SSML (Speech Synthesis Markup Language) for fine-grained control over pauses, emphasis, pronunciation, and more.
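
As an illustration of the SSML controls mentioned above, here is a minimal sketch that builds an SSML snippet and checks that it is well-formed before you paste it into the editor. The element names (speak, break, emphasis, prosody) come from the W3C SSML specification; which of them a particular Verbatik voice honors is an assumption to verify in the app. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

# Element names follow the W3C SSML specification. Whether a given
# Verbatik voice honors each one is an assumption, not documented here.
ssml = (
    "<speak>"
    "Welcome back."
    '<break time="500ms"/>'
    'This part is <emphasis level="strong">important</emphasis>.'
    '<prosody rate="slow" pitch="+2st">'
    "This sentence is slower and slightly higher."
    "</prosody>"
    "</speak>"
)


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML with a <speak> root element."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False
    return root.tag == "speak"


if __name__ == "__main__":
    print(is_well_formed(ssml))
```

A check like this only catches unclosed or mismatched tags; it does not confirm that a voice supports a given element or attribute value.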

HD Voice Settings

When using HD (high-definition) cloned voices, additional settings are available:

  • Language Selection — Choose from 16 supported languages: English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, and Hindi.
  • Speed Control — Fine-tune the speaking rate.

Character Limits

  • Standard limit: Up to 25,000 characters per generation.
  • Throttled limit: If you exceed your fair use allocation, the limit is reduced to 2,500 characters per generation (see Fair Use Policy).
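
For longer scripts, one option is to split the text into several generations before pasting it in. The sketch below is one possible approach, not a Verbatik feature: it breaks text at sentence boundaries so that each chunk stays within the limit. The function name split_text is hypothetical, and it assumes characters are counted the way Python's len() counts them, which may differ from how Verbatik counts.

```python
import re

STANDARD_LIMIT = 25_000   # characters per generation (standard limit)
THROTTLED_LIMIT = 2_500   # characters per generation when throttled


def split_text(text: str, limit: int = STANDARD_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    breaking after sentence-ending punctuation where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Calling split_text(script, THROTTLED_LIMIT) gives chunks sized for the reduced limit; each chunk is then generated separately.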

Credit Cost

  • Standard TTS: 1 credit per character.
  • Pro and Enterprise plans: Unlimited TTS at no credit cost.

Example: Generating 500 characters of speech costs 500 credits on Starter, Creator, or Essential plans, and 0 credits on Pro or Enterprise.
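
The same arithmetic can be written as a small estimator. This is a sketch under stated assumptions: the function name tts_credit_cost is hypothetical, the plan names are taken from the example above, and the character count uses Python's len(), which may not match how Verbatik counts whitespace or markup.

```python
# Plan names taken from the example above; matching is case-insensitive.
UNLIMITED_PLANS = {"pro", "enterprise"}
METERED_PLANS = {"starter", "creator", "essential"}


def tts_credit_cost(text: str, plan: str) -> int:
    """Estimate credits for one generation: 1 credit per character on
    metered plans, 0 on plans with unlimited TTS."""
    name = plan.lower()
    if name in UNLIMITED_PLANS:
        return 0
    if name in METERED_PLANS:
        return len(text)
    raise ValueError(f"unknown plan: {plan}")


if __name__ == "__main__":
    sample = "x" * 500
    print(tts_credit_cost(sample, "Starter"))  # 500
    print(tts_credit_cost(sample, "Pro"))      # 0
```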


Downloading Audio

After generation:

  • Click the Download button to save the audio file.
  • Files are generated in MP3 format.
  • No watermarks on any paid plan.

Sharing Generations

You can share any generated audio clip:

  1. Click the Share button on a generation.
  2. A unique share link is created.
  3. Anyone with the link can listen to the audio.
  4. Share links are active for 6 months.
  5. You can deactivate a share link at any time.
