Voice Creation & Cloning Tutorial

Written by Verbatik Support
Updated this week

Welcome to Verbatik's Voice Creation & Cloning system, a set of features for creating custom voice models and using them for AI text-to-speech generation. This guide covers both voice creation (training your own voice models) and voice cloning (using those models for speech synthesis).

What is Voice Creation & Cloning?

Voice Creation & Cloning is an advanced AI system that enables you to:

  • Create Custom Voices: Upload audio samples or record your voice to train AI models

  • Voice Cloning: Use your custom voices for text-to-speech generation

  • Voice Library Management: Organize and manage your personal voice collection

  • Multi-language Support: Use your voice clones in multiple languages

  • Professional Quality: Generate speech that maintains your unique voice characteristics


Getting Started

Accessing Your Voice Library

  1. Navigate to My Voices: From your dashboard, click on "My Voices" in the sidebar navigation.

  2. Voice Management Hub: The My Voices page is your central hub for creating, managing, and using custom voices.

  3. Interface Overview: From this page you can view your voice library, create new voices, and manage existing ones.


Voice Creation Process

Creating Your First Voice

If you don't have any voices yet, you'll see a welcome screen with guidance.

Getting Started Screen:

  • Create First Voice Button: Prominent call-to-action to begin voice creation

  • Helpful Guidance: Clear explanation of the voice creation process

  • Professional Design: Clean, encouraging interface to get you started

Adding New Voices

Step 1: Access Voice Creation

  • Add Voice Button: Click the "Add Voice" button in your voice library

  • Plan Limits: See your current voice limit (varies by subscription plan)

  • Upgrade Options: Clear guidance if you've reached your plan limit

Step 2: Voice Information

Voice Name (Required):

  • Choose a descriptive name for your voice

  • Examples: Professional Narrator, Casual Speaking, Customer Service

  • Make it memorable and purpose-specific

Description (Optional):

  • Add details about the voice characteristics

  • Examples: Male, professional, calm; Female, energetic, friendly

  • Helps you remember the voice’s intended use

Step 3: Audio Source Selection

You have two options for providing voice training data.

Option A: Upload Audio File

Supported Formats:

  • MP3, WAV, M4A, OGG, FLV

File Requirements:

  • Maximum Size: 50MB per file

  • Quality: Higher quality audio produces better voice clones

  • Duration: 30–60 seconds recommended

Upload Methods:

  • Click to Upload or Drag & Drop into the upload area
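
The format and size requirements above can be checked locally before you upload. The following is a minimal Python sketch and not part of Verbatik itself. The function name check_upload is hypothetical, the extension set simply mirrors the Supported Formats list above, and the sketch assumes that "50MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Extensions mirror the "Supported Formats" list in this guide.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flv"}
# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(path):
    """Return a list of problems found; an empty list means the file looks OK."""
    file = Path(path)
    if not file.is_file():
        return [f"{file} does not exist or is not a file"]
    problems = []
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {file.suffix or '(no extension)'}")
    if file.stat().st_size > MAX_SIZE_BYTES:
        problems.append("file is larger than 50MB")
    return problems
```

Calling check_upload("my_voice.wav") on a small WAV file returns an empty list. Any entries in the returned list correspond to the Unsupported Format and File Too Large upload problems covered under Troubleshooting.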

Option B: Record Audio Directly

Professional Recording Setup:

  • Built-in Recorder: High-quality browser-based recording

  • Audio Enhancement: Automatic noise cancellation and gain control

  • Real-time Timer: See recording duration

Recording Process:

  • Microphone Permission: Grant the browser access to your microphone

  • Training Text: Read the provided training script aloud

  • Speak Clearly: Record in your natural speaking voice

  • Preview & Retry: Listen to the recording and re-record if needed

Training Text Guidelines

Provided Training Script:
"Hello, this is my voice sample for cloning my voice for Verbatik AI. I am speaking clearly and naturally to provide the best quality training data. This recording will help create an accurate representation of my unique voice characteristics and speaking patterns."

Recording Best Practices:

  • Speak Naturally and Clearly

  • Consistent Volume

  • Aim for 30–60 seconds

  • Quiet Environment
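
If you record outside the browser and save the sample as a WAV file, the recommended 30–60 second length can be verified with a few lines of Python. This is a sketch using only the standard-library wave module, so it handles uncompressed PCM WAV files only; MP3, M4A, and other formats would need a different tool. The function names are hypothetical.

```python
import wave

# Recommended range from this guide: 30 to 60 seconds.
MIN_SECONDS = 30.0
MAX_SECONDS = 60.0


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def in_recommended_range(path):
    """True if the recording falls within the recommended 30 to 60 seconds."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

A recording that falls outside the range is not necessarily rejected by the uploader; the range is a recommendation, and this check only tells you when a re-record may be worth it.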


Voice Management

Your Voice Library

Voice Library Overview:

  • Voice Count and Plan Limits

  • Creation Dates

  • Quick Actions: Play, edit, delete

Voice Cards:

  • Voice Name

  • Description

  • Creation Date

  • Play Button

  • Menu Options

Voice Management Actions

Playing Voice Samples:

  • Preview the original audio

  • Verify audio quality

Editing Voice Details:

  • Update Name or Description

  • Save Changes

Deleting Voices:

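  • Confirmation Required: You are asked to confirm before a voice is deleted

  • Permanent Action: Deletion cannot be undone

  • Cleanup: The deleted voice is removed from the system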


Voice Cloning Usage

Using Your Custom Voices

Accessing Voice Cloning:

  1. Navigate to Text-to-Speech

  2. Select “Voice Cloning” option

  3. Choose your custom voice

Voice Cloning Interface:

Speech Type Toggle:

  • Text to Speech: Uses pre-built voices

  • Voice Cloning: Uses your custom voices

Custom Voice Selection:

  • Dropdown menu with previews

  • Clear guidance if no voices exist

Language Selection:

Multi-language Support:

  • English, Spanish, French, German, Italian, Portuguese, and more

Language Benefits:

  • Consistent Voice Tone Across Languages

  • Global Content Creation

  • Professional Adaptation


Generation Process

Text-to-Speech with Voice Cloning

Input Process:

  1. Enter Text

  2. Select Voice

  3. Select Language

  4. Click Generate

Processing & Results:

  • Advanced Voice Cloning

  • Maintains Voice Identity

  • Multi-language Output

  • Professional Audio Quality

Generation Features:

Automatic Playback:

  • Instant Preview

  • Downloadable Audio

History Integration:

  • Complete History Log

  • Voice Attribution

  • Easy Retrieval


Plan Limits & Upgrades

Voice Creation Limits

Free Plan:

  • Limited Custom Voices

  • Full Access to Features

Paid Plans:

  • More Voices Allowed

  • Unlimited Usage

  • Commercial Rights Included

Plan Limit Indicators:

  • Visual Usage Tracker

  • Upgrade Prompts

  • Plan Comparison Table

Credit System

Voice Cloning Credits:

  • Cost Based on Text Length

  • Same Cost for All Voices

Credit Benefits:

  • Transparent Pricing

  • Real-time Tracking

  • Multiple Plans
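
Because cost depends on text length and not on which voice you pick, a rough budget check is easy to script before generating. The sketch below is illustrative only: this guide does not state the actual rate, so CREDITS_PER_CHARACTER is a placeholder assumption and not Verbatik's pricing. Check your plan page for the real figure. The function names are hypothetical.

```python
import math

# Placeholder assumption: this guide does not state the real rate.
CREDITS_PER_CHARACTER = 1.0


def estimate_credits(text, rate=CREDITS_PER_CHARACTER):
    """Estimate credits for a text; depends only on length, not on the voice."""
    return math.ceil(len(text) * rate)


def can_afford(text, balance, rate=CREDITS_PER_CHARACTER):
    """True if the estimated cost fits within the current credit balance."""
    return estimate_credits(text, rate) <= balance
```

With the placeholder rate, estimate_credits("hello") returns 5. Replace the rate with the value from your plan before relying on the result.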


Best Practices

Creating High-Quality Voices

Audio Recording Tips:

  • Use a Good Mic

  • Speak Naturally

  • Record in Quiet Spaces

  • 30–60 Seconds is Ideal

Voice Training Optimization:

  • Use Provided Script

  • Re-record for Better Quality

  • Minimize Background Noise

Voice Usage Strategies

Voice Organization:

  • Use Clear Names

  • Add Descriptive Notes

  • Review and Clean Up Regularly

Generation Optimization:

  • Use Clean, Simple Text

  • Match Language to Target Audience

  • Align Voice Style with Content Tone
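
One way to get "clean, simple text" is to normalize it before pasting it into the text box. The following Python sketch is an assumption about what cleaning might involve, not a description of anything Verbatik does internally. The function name clean_text is hypothetical. Note that NFKC normalization also rewrites some characters (for example, superscript digits become plain digits), which may or may not be what you want.

```python
import re
import unicodedata


def clean_text(text):
    """Normalize text before using it for speech generation."""
    # Fold compatibility characters (non-breaking spaces, ligatures, etc.).
    text = unicodedata.normalize("NFKC", text)
    # Use "\n" for every line ending.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop control characters except newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces and tabs, and trim each line.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    # Collapse three or more newlines into a single blank line.
    cleaned = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return cleaned.strip()
```

For example, clean_text("Hello\u00a0 world!\r\n\r\n\r\n\r\nBye\x00 ") returns "Hello world!\n\nBye".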


Troubleshooting

Common Voice Creation Issues

Upload Problems:

  • Unsupported Format

  • File Too Large

  • Poor Audio Quality

Recording Issues:

  • Microphone Access Denied

  • Low Mic Quality

  • Background Noise

Voice Cloning Problems

Generation Failures:

  • Low Credit Balance

  • Text Too Long for Your Credit Balance (longer text costs more)

  • No Voice Selected

Quality Issues:

  • Bad Training Data

  • Overly Complex Text

  • Language Not Matching Voice

Optimization Solutions:

  • Re-record High-Quality Samples

  • Test Short Texts First

  • Clean, Direct Language


Advanced Features

Professional Voice Creation

Voice Characterization:

  • Use-Case Specific Voices

  • Record Different Emotions

  • Targeted Applications

Quality Optimization:

  • Test with Multiple Scripts

  • Refine Based on Output

  • Iterate and Improve

Creative Applications

Content Creation:

  • Personal Brand Audio

  • Multilingual Content Creation

  • Professional Narration

Business Applications:

  • Customer Support Voices

  • Employee Training Narration

  • Voice for Marketing Videos


Integration with Other Features

Sound Studio Integration

Voice Clone + Music:

  • Add Background Music

  • Full Audio Mix Projects

Advanced Mixing:

  • Control Volumes

  • Multi-Track Edits

  • Broadcast-Quality Export

Workflow Integration

Content Pipeline:

  • Turn Text to Voice Quickly

  • Export for Any Platform

  • Batch Audio Creation
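
For batch audio creation, a long script usually needs to be cut into pieces first. The Python sketch below splits text at sentence boundaries so that each piece stays under a character budget and can be generated separately. The default max_chars=500 is an arbitrary assumption, since this guide does not state a per-generation text limit, and the function name split_into_chunks is hypothetical.

```python
import re


def split_into_chunks(text, max_chars=500):
    """Split text into chunks of up to max_chars, breaking between sentences.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-sentence, so such a chunk can exceed the budget.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

For example, split_into_chunks("One. Two. Three.", max_chars=9) returns ["One. Two.", "Three."]. Splitting on sentence boundaries keeps each generated clip ending at a natural pause, which makes the clips easier to join afterwards.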


Getting Help

If you need assistance:

  1. Review this guide

  2. Follow audio quality best practices

  3. Connect with other users

  4. Contact technical support


Conclusion

Voice Creation & Cloning gives you total control over your digital voice identity. From building professional-grade clones to speaking across multiple languages, this tool is built for creators, brands, educators, and innovators.

Start with one voice and expand as your projects grow. Combine voices with music, sound effects, and TTS capabilities to build complete productions. Invest in quality recordings and you'll unlock the full potential of AI voice generation.

Pro Tip: Create different voices for different use cases. A professional presenter voice, a friendly customer service tone, and a casual narrator voice can help you adapt across content types and platforms.
