Voice Creation & Cloning Tutorial
Complete guide to creating and using custom voice models with Verbatik AI
Written by Verbatik Support
Updated this week
Welcome to Verbatik's Voice Creation & Cloning system, a suite of features that lets you create custom voice models and use them for AI text-to-speech generation. This guide covers both voice creation (training your own voice models) and voice cloning (using those models for speech synthesis).
What is Voice Creation & Cloning?
Voice Creation & Cloning is an advanced AI system that enables you to:
- Create Custom Voices: Upload audio samples or record your voice to train AI models
- Voice Cloning: Use your custom voices for text-to-speech generation
- Voice Library Management: Organize and manage your personal voice collection
- Multi-language Support: Use your voice clones in multiple languages
- Professional Quality: Generate speech that maintains your unique voice characteristics
Getting Started
Accessing Your Voice Library
- Navigate to My Voices: From your dashboard, click on "My Voices" in the sidebar navigation.
- Voice Management Hub: This is your central hub for creating, managing, and using custom voices.
- Interface Overview: View your voice library, create new voices, and manage existing ones.
Voice Creation Process
Creating Your First Voice
If you don't have any voices yet, you'll see a welcome screen with guidance.
Getting Started Screen:
- Create First Voice Button: Prominent call-to-action to begin voice creation
- Helpful Guidance: Clear explanation of the voice creation process
- Professional Design: Clean, encouraging interface to get you started
Adding New Voices
Step 1: Access Voice Creation
- Add Voice Button: Click the "Add Voice" button in your voice library
- Plan Limits: See your current voice limit (varies by subscription plan)
- Upgrade Options: Clear guidance if you've reached your plan limit
Step 2: Voice Information
Voice Name (Required):
- Choose a descriptive name for your voice
- Examples: Professional Narrator, Casual Speaking, Customer Service
- Make it memorable and purpose-specific
Description (Optional):
- Add details about the voice characteristics
- Examples: Male, professional, calm; Female, energetic, friendly
- Helps you remember the voice's intended use
Step 3: Audio Source Selection
You have two options for providing voice training data.
Option A: Upload Audio File
Supported Formats:
- MP3, WAV, M4A, OGG, FLV
File Requirements:
- Maximum Size: 50MB per file
- Quality: Higher quality audio produces better voice clones
- Duration: 30–60 seconds recommended
Upload Methods:
- Click to Upload or Drag & Drop into the upload area
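If you prepare files in bulk, it can help to check them against the requirements above before uploading. The following is a minimal sketch, not part of Verbatik itself: the function and constant names are hypothetical, and it assumes "50MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits taken from this guide; names below are hypothetical, not a Verbatik API.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flv"}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file too large: limit is 50MB")
    return problems


# A small WAV file passes; a large AIFF file fails both checks.
print(check_upload("narrator_sample.wav", 4_200_000))
print(check_upload("narrator_sample.aiff", 60 * 1024 * 1024))
```

This only checks the file name and size. It does not inspect audio quality or duration.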
Option B: Record Audio Directly
Professional Recording Setup:
- Built-in Recorder: High-quality browser-based recording
- Audio Enhancement: Automatic noise cancellation and gain control
- Real-time Timer: See recording duration
Recording Process:
- Microphone Permission: Grant microphone access when prompted
- Training Text: Read the provided professional script
- Speak Clearly: Record your voice naturally
- Preview & Retry: Listen and re-record if needed
Training Text Guidelines
Provided Training Script:
"Hello, this is my voice sample for cloning my voice for Verbatik AI. I am speaking clearly and naturally to provide the best quality training data. This recording will help create an accurate representation of my unique voice characteristics and speaking patterns."
Recording Best Practices:
- Speak Naturally and Clearly
- Consistent Volume
- Aim for 30–60 seconds
- Quiet Environment
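If you record outside the built-in recorder and save a WAV file, you can confirm that it falls in the recommended 30–60 second range before uploading. This is a minimal sketch using Python's standard `wave` module, which reads WAV files only; the helper names are hypothetical, and other formats would need a different tool.

```python
import io
import wave

RECOMMENDED_SECONDS = (30, 60)  # range recommended in this guide


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def in_recommended_range(seconds: float) -> bool:
    """True when the duration falls inside the recommended range."""
    low, high = RECOMMENDED_SECONDS
    return low <= seconds <= high


# Usage: build 45 seconds of silence in memory, then measure it.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(8000)   # 8000 frames per second
    wav.writeframes(b"\x00\x00" * 8000 * 45)
buffer.seek(0)

duration = wav_duration_seconds(buffer)
print(duration, in_recommended_range(duration))
```

For a real recording, pass the file path instead of the in-memory buffer.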
Voice Management
Your Voice Library
Voice Library Overview:
- Voice Count and Plan Limits
- Creation Dates
- Quick Actions: Play, edit, delete
Voice Cards:
- Voice Name
- Description
- Creation Date
- Play Button
- Menu Options
Voice Management Actions
Playing Voice Samples:
- Preview the original audio
- Verify audio quality
Editing Voice Details:
- Update Name or Description
- Save Changes
Deleting Voices:
- Confirmation Required
- Permanent Action
- System Cleanup: The voice is removed from the system
Voice Cloning Usage
Using Your Custom Voices
Accessing Voice Cloning:
- Navigate to Text-to-Speech
- Select "Voice Cloning" option
- Choose your custom voice
Voice Cloning Interface:
Speech Type Toggle:
- Text to Speech = pre-built voices
- Voice Cloning = your custom voices
Custom Voice Selection:
- Dropdown menu with previews
- Clear guidance if no voices exist
Language Selection:
- Multi-language Support: English, Spanish, French, German, Italian, Portuguese, and more
Language Benefits:
- Consistent Voice Tone Across Languages
- Global Content Creation
- Professional Adaptation
Generation Process
Text-to-Speech with Voice Cloning
Input Process:
- Enter Text
- Select Voice
- Select Language
- Click Generate
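The inputs above can be thought of as a small checklist that must be complete before you click Generate. The sketch below is purely illustrative: `CloneRequest` and `missing_inputs` are hypothetical names, not Verbatik API types, and the real interface may validate differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CloneRequest:
    """Hypothetical container for the generation inputs; not a Verbatik API type."""
    text: str
    voice_name: Optional[str]
    language: Optional[str]


def missing_inputs(request: CloneRequest) -> list[str]:
    """List which required inputs are still empty."""
    missing = []
    if not request.text.strip():
        missing.append("text")
    if not request.voice_name:
        missing.append("voice")
    if not request.language:
        missing.append("language")
    return missing


# Usage: a complete request has nothing missing; an incomplete one names the gaps.
ready = CloneRequest("Welcome to our channel.", "Professional Narrator", "English")
incomplete = CloneRequest("   ", None, "English")
print(missing_inputs(ready))
print(missing_inputs(incomplete))
```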
Processing & Results:
- Advanced Voice Cloning
- Maintains Voice Identity
- Multi-language Output
- Professional Audio Quality
Generation Features:
- Automatic Playback: Instant Preview
- Downloadable Audio: High-quality export
- History Integration: Complete History Log with Voice Attribution
Plan Limits & Upgrades
Voice Creation Limits
Free Plan:
- Limited Custom Voices
- Full Access to Features
Paid Plans:
- More Voices Allowed
- Unlimited Usage
- Commercial Rights Included
Plan Limit Indicators:
- Visual Usage Tracker
- Upgrade Prompts
- Plan Comparison Table
Credit System
Voice Cloning Credits:
- Cost Based on Text Length
- Same Cost for All Voices
Credit Benefits:
- Transparent Pricing
- Real-time Tracking
- Multiple Plans
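Because cost depends on text length and is the same for every voice, you can estimate it from the text alone. The sketch below assumes a hypothetical rate of one credit per character; this guide does not state the real rate, so check your plan for the actual figure.

```python
CREDITS_PER_CHARACTER = 1  # hypothetical rate; the real rate is not stated in this guide


def estimate_credits(text: str) -> int:
    """Estimate cost from text length only; the chosen voice does not affect it."""
    return len(text) * CREDITS_PER_CHARACTER


def can_generate(text: str, balance: int) -> bool:
    """True when the estimated cost fits within the remaining credit balance."""
    return estimate_credits(text) <= balance


# Usage: compare the estimate with a remaining balance before generating.
script = "Hello, and welcome to today's episode."
print(estimate_credits(script), can_generate(script, balance=100))
```

Testing with short texts first, as suggested under Troubleshooting, keeps the estimated cost low while you evaluate a new voice.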
Best Practices
Creating High-Quality Voices
Audio Recording Tips:
- Use a Good Mic
- Speak Naturally
- Record in Quiet Spaces
- 30–60 Seconds is Ideal
Voice Training Optimization:
- Use Provided Script
- Re-record for Better Quality
- Minimize Background Noise
Voice Usage Strategies
Voice Organization:
- Use Clear Names
- Add Descriptive Notes
- Review and Clean Up Regularly
Generation Optimization:
- Use Clean, Simple Text
- Match Language to Target Audience
- Align Voice Style with Content Tone
Troubleshooting
Common Voice Creation Issues
Upload Problems:
- Unsupported Format
- File Too Large
- Poor Audio Quality
Recording Issues:
- Microphone Access Denied
- Low Mic Quality
- Background Noise
Voice Cloning Problems
Generation Failures:
- Low Credit Balance
- Text Too Long for Your Credit Balance (longer text costs more)
- No Voice Selected
Quality Issues:
- Bad Training Data
- Overly Complex Text
- Language Not Matching Voice
Optimization Solutions:
- Re-record High-Quality Samples
- Test Short Texts First
- Clean, Direct Language
Advanced Features
Professional Voice Creation
Voice Characterization:
- Use-Case Specific Voices
- Record Different Emotions
- Targeted Applications
Quality Optimization:
- Test with Multiple Scripts
- Refine Based on Output
- Iterate and Improve
Creative Applications
Content Creation:
- Personal Brand Audio
- Multilingual Content Creation
- Professional Narration
Business Applications:
- Customer Support Voices
- Employee Training Narration
- Voice for Marketing Videos
Integration with Other Features
Sound Studio Integration
Voice Clone + Music:
- Add Background Music
- Full Audio Mix Projects
Advanced Mixing:
- Control Volumes
- Multi-Track Edits
- Broadcast-Quality Export
Workflow Integration:
- Turn Text to Voice Quickly
- Export for Any Platform
- Batch Audio Creation
Getting Help
If you need assistance:
- Review this guide
- Follow audio quality best practices
- Connect with other users
- Contact technical support
Conclusion
Voice Creation & Cloning gives you total control over your digital voice identity. From building professional-grade clones to speaking across multiple languages, this tool is built for creators, brands, educators, and innovators.
Start with one voice and expand as your projects grow. Combine voices with music, sound effects, and TTS capabilities to build complete productions. Invest in quality recordings and you'll unlock the full potential of AI voice generation.
Pro Tip: Create different voices for different use cases. A professional presenter voice, a friendly customer service tone, and a casual narrator voice can help you adapt across content types and platforms.
Next Steps
- Learn about text-to-speech basics
- Explore API documentation
- Check out billing and pricing