Text to Speech
Convert written text into natural-sounding speech using AI-powered voices in over 150 languages and dialects.
Text-to-Speech (TTS) is the core feature of Verbatik AI.
How It Works
1. Navigate to Text to Speech from the sidebar.
2. Type or paste your text into the editor.
3. Select a voice from the voice selector.
4. Adjust settings if needed (speed, pitch, style).
5. Click Generate to create your audio.
6. Preview, download, or share the result.
Voice Selection
Verbatik offers a rich library of voices:
Platform Voices
- 1,700+ premium neural voices available on paid plans.
- Filter by language, gender, or search by name.
- Preview any voice before generating.
- Voices support multiple languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, and 140+ more.
Custom Voices
- Use your own cloned voices (see Voice Cloning).
- Custom voices appear in a dedicated tab in the voice selector.
- Includes Regular, HD, and Designed voice types.
Favorite Voices
- Mark any voice as a favorite by clicking the heart icon.
- Favorites appear in a dedicated tab for quick access.
Voice Settings
Depending on the voice type, you can adjust:
- Speed — Control how fast or slow the voice speaks.
- Pitch — Adjust the tone higher or lower.
- Speaking Style — Some voices support styles like cheerful, sad, angry, whispering, newscast, and more.
- SSML Controls — Advanced users can use SSML (Speech Synthesis Markup Language) for fine-grained control over pauses, emphasis, pronunciation, and more.
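As a rough illustration of the SSML controls mentioned above, the sketch below builds a small SSML document and checks that it is well-formed XML before it is pasted into the editor. The elements used (`speak`, `break`, `emphasis`, `prosody`, `say-as`) come from the W3C SSML specification; which of them a given Verbatik voice honors is an assumption here, not something this page confirms.

```python
import xml.etree.ElementTree as ET

# Standard W3C SSML elements. Support for each element by a particular
# Verbatik voice is an assumption, not a documented guarantee.
ssml = """<speak>
  Welcome back.
  <break time="500ms"/>
  This point is <emphasis level="strong">important</emphasis>.
  <prosody rate="slow" pitch="+2st">This sentence is slower and slightly higher.</prosody>
  The abbreviation is <say-as interpret-as="characters">TTS</say-as>.
</speak>"""

# Parsing raises xml.etree.ElementTree.ParseError on unclosed or mismatched tags,
# so markup errors surface before a generation is attempted.
root = ET.fromstring(ssml)

# List the elements in document order as a quick sanity check.
tags = [element.tag for element in root.iter()]
print(tags)
```

A local well-formedness check like this only catches markup errors. It does not tell you whether a voice will act on a given element.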
HD Voice Settings
When using HD (high-definition) cloned voices, additional settings are available:
- Language Selection — Choose from 16 supported languages: English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, and Hindi.
- Speed Control — Fine-tune the speaking rate.
Character Limits
- Standard limit: Up to 25,000 characters per generation.
- Throttled limit: If you exceed your fair use allocation, the limit is reduced to 2,500 characters per generation (see Fair Use Policy).
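For scripts longer than the per-generation limit, one option is to split the text into several generations. The sketch below is a minimal example of that idea, preferring sentence boundaries. The function name `split_for_generation` is hypothetical, and it assumes characters are counted the way Python's `len()` counts them (spaces and punctuation included), which this page does not specify.

```python
STANDARD_LIMIT = 25_000   # characters per generation (standard limit)
THROTTLED_LIMIT = 2_500   # characters per generation once throttled

def split_for_generation(text: str, limit: int = STANDARD_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers to cut after a sentence end, then at a
    space, and only hard-cuts when no break point exists in the window.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        # Last sentence-ending punctuation followed by a space, if any.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation mark with the chunk
        else:
            cut = window.rfind(" ")
            if cut <= 0:
                cut = limit  # no break point found: hard cut at the limit
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Passing `THROTTLED_LIMIT` as the second argument produces chunks sized for the reduced limit.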
Credit Cost
- Standard TTS (Starter, Creator, and Essential plans): 1 credit per character.
- Pro and Enterprise plans: Unlimited TTS at no credit cost.
Example: Generating 500 characters of speech costs 500 credits on Starter, Creator, or Essential plans, and 0 credits on Pro or Enterprise.
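The same arithmetic can be written as a small estimator, which may be handy for budgeting a script before generating it. This is an illustrative sketch: the function name `tts_credit_cost` is hypothetical, and it assumes every character counted by Python's `len()` (including spaces and punctuation) is billed, which this page does not state explicitly.

```python
UNLIMITED_PLANS = {"Pro", "Enterprise"}             # TTS at no credit cost
METERED_PLANS = {"Starter", "Creator", "Essential"}  # 1 credit per character

def tts_credit_cost(text: str, plan: str) -> int:
    """Estimate the credits a generation would consume on the given plan.

    Assumption: each character counted by len() costs one credit.
    """
    if plan in UNLIMITED_PLANS:
        return 0
    if plan in METERED_PLANS:
        return len(text)
    raise ValueError(f"Unknown plan: {plan!r}")

script = "x" * 500
print(tts_credit_cost(script, "Starter"))  # 500
print(tts_credit_cost(script, "Pro"))      # 0
```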
Downloading Audio
After generation:
- Click the Download button to save the audio file.
- Files are generated in MP3 format.
- No watermarks on any paid plan.
Sharing Generations
You can share any generated audio clip:
- Click the Share button on a generation.
- A unique share link is created.
- Anyone with the link can listen to the audio.
- Share links are active for 6 months.
- You can deactivate a share link at any time.