API Reference
Complete reference for all Verbatik REST API endpoints. Every endpoint requires authentication with an API key.
Base URL
Authentication
Endpoints Overview
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/voices | List pre-trained voices |
| POST | /api/v1/tts | Text-to-speech synthesis |
| POST | /api/v1/voice-training | Clone a voice from audio |
| POST | /api/v1/voice-design | Design a voice from description |
| POST | /api/v1/voice-cloning | Generate speech with a cloned voice |
| GET | /api/v1/my-voices | List your cloned/designed voices |
| POST | /api/v1/text-to-music | Generate music from text |
| POST | /api/audio-upload | Upload an audio file |
GET /api/v1/voices
List available pre-trained TTS voices.
Query Parameters:
| Parameter | Type | Description |
|---|---|---|
| language | string | Filter by language code (e.g., en-US). |
| gender | string | Filter by gender: Male, Female, or Neutral. |
| search | string | Search by name or language. |
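The filters above combine as ordinary query-string parameters. The following is a minimal sketch of building the request URL; `BASE_URL` is a hypothetical placeholder, not the real Verbatik host, and `voices_url` is an illustrative helper rather than part of any official SDK.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the base URL from the "Base URL" section.
BASE_URL = "https://api.example.com"


def voices_url(language=None, gender=None, search=None):
    """Build the GET /api/v1/voices URL, omitting filters that are not set."""
    params = {"language": language, "gender": gender, "search": search}
    # Drop unset filters so the query string only carries what was requested.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/api/v1/voices"
    return f"{url}?{query}" if query else url


print(voices_url(language="en-US", gender="Female"))
```

The request still needs the API key described under Authentication; this sketch only covers the query string.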
Response:
POST /api/v1/tts
Convert text to speech using pre-trained voices.
| Header | Required | Description |
|---|---|---|
| Content-Type | Yes | text/plain or application/ssml+xml |
| X-Voice-ID | No | Voice slug. Default: jenny-en-us. |
| X-Store-Audio | No | Set to true to return a URL instead of binary audio. |
Body: Plain text or SSML (max 25,000 characters). Cost: $0.025 per 1,000 characters.
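As a rough illustration of the headers, length limit, and pricing above, the sketch below prepares a TTS request and estimates its cost. The helper names are hypothetical, and the estimate assumes billing counts every character of the body (including any SSML markup), which this reference does not confirm.

```python
MAX_TTS_CHARS = 25_000    # body limit stated for /api/v1/tts
USD_PER_1K_CHARS = 0.025  # price stated for /api/v1/tts


def estimate_tts_cost(text):
    """Estimate the cost in USD, assuming every character of the body is billed."""
    return len(text) / 1000 * USD_PER_1K_CHARS


def build_tts_headers(voice_id=None, store_audio=False, ssml=False):
    """Build the documented TTS headers. Authentication headers are not included."""
    headers = {"Content-Type": "application/ssml+xml" if ssml else "text/plain"}
    if voice_id is not None:
        headers["X-Voice-ID"] = voice_id  # omitted: server default applies
    if store_audio:
        headers["X-Store-Audio"] = "true"  # ask for a URL instead of binary audio
    return headers


def prepare_tts(text, **header_options):
    """Validate the body length and return (headers, encoded body)."""
    if len(text) > MAX_TTS_CHARS:
        raise ValueError(f"text is {len(text)} characters; limit is {MAX_TTS_CHARS}")
    return build_tts_headers(**header_options), text.encode("utf-8")


headers, body = prepare_tts("Hello from Verbatik.", store_audio=True)
print(headers, len(body), estimate_tts_cost("x" * 4000))
```

Checking the length on the client avoids spending a round trip on a request that exceeds the documented limit.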
POST /api/v1/voice-training
Clone a voice from an audio sample.
Cost: $3.00 per voice.
Response:
POST /api/v1/voice-design
Create a voice from a text description.
Cost: $3.00 per voice.
POST /api/v1/voice-cloning
Generate speech using a cloned or designed voice.
| Header | Required | Description |
|---|---|---|
| Content-Type | Yes | text/plain |
| X-Voice-ID | Yes | Cloned voice UUID. |
| X-Store-Audio | No | true to store audio. |
| X-Speed | No | 0.5–2.0 (default: 1). |
| X-Volume | No | 0–10 (default: 1). |
| X-Pitch | No | -12 to 12 (default: 0). |
| X-Emotion | No | happy, sad, angry, fearful, disgusted, surprised, neutral. |
| X-English-Normalization | No | true/false. |
| X-Voice-Modify-Pitch | No | -100 to 100. |
| X-Voice-Modify-Intensity | No | -100 to 100. |
| X-Voice-Modify-Timbre | No | -100 to 100. |
| X-Sample-Rate | No | 8000, 16000, 22050, 24000, 32000, 44100. |
| X-Bitrate | No | 32000, 64000, 128000, 256000. |
| X-Format | No | mp3, pcm, flac. |
| X-Language-Boost | No | Language code for enhanced recognition. |
Body: Plain text (max 5,000 characters). Cost: $0.08 per 1,000 characters.
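With this many optional headers, it can help to validate values on the client before sending. The sketch below checks options against the ranges and value lists in the table above; it assumes the ranges are inclusive, and `build_cloning_headers` is a hypothetical helper, not part of an official client.

```python
# Numeric ranges from the header table: (minimum, maximum), assumed inclusive.
NUMERIC_RANGES = {
    "X-Speed": (0.5, 2.0),
    "X-Volume": (0, 10),
    "X-Pitch": (-12, 12),
    "X-Voice-Modify-Pitch": (-100, 100),
    "X-Voice-Modify-Intensity": (-100, 100),
    "X-Voice-Modify-Timbre": (-100, 100),
}

# Enumerated values from the header table, compared as strings.
ALLOWED_VALUES = {
    "X-Emotion": {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"},
    "X-Sample-Rate": {"8000", "16000", "22050", "24000", "32000", "44100"},
    "X-Bitrate": {"32000", "64000", "128000", "256000"},
    "X-Format": {"mp3", "pcm", "flac"},
}


def build_cloning_headers(voice_id, options=None):
    """Return request headers for /api/v1/voice-cloning, rejecting out-of-range values."""
    headers = {"Content-Type": "text/plain", "X-Voice-ID": voice_id}
    for name, value in (options or {}).items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # HTTP header values are strings
        elif name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not low <= float(value) <= high:
                raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        elif name in ALLOWED_VALUES and str(value) not in ALLOWED_VALUES[name]:
            raise ValueError(f"{name} must be one of {sorted(ALLOWED_VALUES[name])}")
        headers[name] = str(value)
    return headers


print(build_cloning_headers("your-voice-uuid", {"X-Speed": 1.25, "X-Format": "mp3"}))
```

Headers without a documented range or value list, such as X-Language-Boost, pass through unchecked.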
GET /api/v1/my-voices
List all cloned and designed voices in your workspace.
| Parameter | Type | Description |
|---|---|---|
| status | string | Filter: pending, ready, failed. |
POST /api/v1/text-to-music
Generate music from text prompts.
Cost: $0.20 per minute of audio.
POST /api/audio-upload
Upload an audio file for use with voice cloning.
Content-Type: multipart/form-data
Returns a URL for use with the voice-training endpoint.
Common Error Responses
| Status | Description |
|---|---|
| 400 | Bad request: invalid parameters or missing required fields. |
| 401 | Unauthorized: invalid or missing API key. |
| 402 | Payment required: insufficient balance. |
| 403 | Forbidden: no access to this resource. |
| 404 | Not found: resource does not exist. |
| 429 | Rate limit exceeded: too many requests. |
| 500 | Internal server error: unexpected failure. |
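One way a client might act on these status codes is sketched below. The descriptions come from the table above; treating 429 and 500 as retryable is an assumption of this sketch, not something the reference states, and the function names are illustrative.

```python
# Status descriptions copied from the table above.
ERROR_DESCRIPTIONS = {
    400: "Bad request",
    401: "Unauthorized",
    402: "Payment required",
    403: "Forbidden",
    404: "Not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption: only rate limiting and server failures are worth retrying.
RETRYABLE_STATUSES = {429, 500}


def describe_status(status):
    """Return a short label for a documented error status, or a generic fallback."""
    return ERROR_DESCRIPTIONS.get(status, f"Unexpected status {status}")


def should_retry(status):
    """Decide whether repeating the same request could plausibly succeed."""
    return status in RETRYABLE_STATUSES


for code in (401, 429):
    print(code, describe_status(code), "retry" if should_retry(code) else "do not retry")
```

Client errors such as 400 or 401 are left out of the retry set because resending an unchanged request would fail the same way.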
All errors follow this format:
CORS
All endpoints support CORS: