Voice Cloning

Create custom AI voices and generate speech with ElevenLabs integration.

Beta

Overview

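Voice cloning on elizaOS Cloud enables you to clone a voice from audio samples, generate speech with cloned or pre-built voices, manage voices through the dashboard or API, attach a voice to an agent, and transcribe audio to text.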

Quick Start

Dashboard

Navigate to Dashboard → Voices for the visual interface.

API

# Clone a voice from audio samples
curl -X POST "https://cloud.milady.ai/api/elevenlabs/voices/clone" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "name=My Voice Clone" \
  -F "files=@sample1.mp3" \
  -F "files=@sample2.mp3"
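The same multipart request can be assembled in JavaScript. The sketch below mirrors the curl call field for field. `buildCloneRequest` is a hypothetical helper name, not part of any SDK, and the response format is not documented here, so the usage comment only logs it. It assumes a runtime with global `FormData` and `Blob` (browsers, Node 18+).

```javascript
// Sketch: build the multipart request for the clone endpoint shown above.
// buildCloneRequest is a hypothetical helper, not part of any SDK.
const CLONE_URL = "https://cloud.milady.ai/api/elevenlabs/voices/clone";

function buildCloneRequest(apiKey, name, samples) {
  const form = new FormData();
  form.append("name", name);
  // Repeat the "files" field once per sample, as the curl example does.
  for (const { blob, filename } of samples) {
    form.append("files", blob, filename);
  }
  return {
    url: CLONE_URL,
    options: {
      method: "POST",
      // Content-Type is left unset so fetch can add the multipart boundary.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Usage (inside an async function):
// const { url, options } = buildCloneRequest(apiKey, "My Voice Clone", samples);
// const res = await fetch(url, options);
// console.log(res.status, await res.text());
```

Separating request construction from the `fetch` call keeps the helper easy to inspect before any audio is uploaded.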

Voice Cloning

Prepare Audio Samples

Gather 1-3 minutes of clean audio from the target voice.

Requirements:

  • Clear speech without background noise
  • Single speaker only
  • High quality (WAV or MP3, 44.1kHz+)

Upload Samples

Upload audio files via the dashboard or the API.

Create Clone

Submit the cloning request and wait for processing.

Verify Quality

Test the cloned voice with sample text.

Sample Requirements

| Requirement | Recommendation |
| --- | --- |
| Duration | 1-3 minutes total |
| Format | WAV, MP3, M4A |
| Quality | 44.1kHz, 16-bit minimum |
| Content | Natural speech, varied intonation |
| Noise | Minimal background noise |
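The measurable rows of the table above can be checked before uploading. This is a minimal sketch, not a platform feature: `checkSamples` and the metadata fields (`durationSeconds`, `sampleRateHz`, `bitDepth`) are hypothetical, and the caller is assumed to obtain them from its own audio tooling. Because the table lists recommendations, the function returns warnings rather than rejecting anything.

```javascript
// Hypothetical pre-upload check derived from the recommendations table.
const ALLOWED_FORMATS = ["wav", "mp3", "m4a"];

function checkSamples(samples) {
  const warnings = [];
  let totalSeconds = 0;

  for (const s of samples) {
    const ext = s.filename.split(".").pop().toLowerCase();
    if (!ALLOWED_FORMATS.includes(ext)) {
      warnings.push(`${s.filename}: format should be WAV, MP3, or M4A`);
    }
    if (typeof s.sampleRateHz === "number" && s.sampleRateHz < 44100) {
      warnings.push(`${s.filename}: sample rate is below 44.1kHz`);
    }
    // Bit depth is only checked when the caller supplies it.
    if (typeof s.bitDepth === "number" && s.bitDepth < 16) {
      warnings.push(`${s.filename}: bit depth is below 16-bit`);
    }
    totalSeconds += s.durationSeconds;
  }

  // The duration recommendation applies to all samples combined.
  if (totalSeconds < 60 || totalSeconds > 180) {
    warnings.push(`total duration ${totalSeconds}s is outside 1-3 minutes`);
  }
  return warnings;
}
```

Content and noise quality are not covered here; those still need a listening check.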

Using someone’s voice without permission may violate their rights. Only clone voices you have rights to use.

Text-to-Speech

Generate Speech

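const response = await fetch("https://cloud.milady.ai/api/elevenlabs/tts", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text: "Welcome to elizaOS Cloud!",
    voice_id: "voice_abc123",
    model_id: "eleven_multilingual_v2",
    voice_settings: {
      stability: 0.5,
      similarity_boost: 0.75,
    },
  }),
});

// Fail early on HTTP errors instead of treating an error body as audio.
if (!response.ok) {
  throw new Error(`TTS request failed with status ${response.status}`);
}

// The response body is the generated audio; wrap it in an object URL for playback.
const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);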

Voice Settings

| Setting | Range | Description |
| --- | --- | --- |
| stability | 0-1 | Higher = more consistent, lower = more expressive |
| similarity_boost | 0-1 | How closely to match the original voice |
| style | 0-1 | Style exaggeration (v2 models only) |
| use_speaker_boost | bool | Enhance speaker similarity |
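Since the three numeric settings share the 0-1 range, it can help to normalize user input before sending it as `voice_settings`. The sketch below is one way to do that. `normalizeVoiceSettings` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) is a design choice of this example, not documented API behavior.

```javascript
// Hypothetical helper: keep voice_settings inside the ranges in the table above.
function normalizeVoiceSettings(input = {}) {
  const clamp01 = (v) => Math.min(1, Math.max(0, v));
  const out = {};

  for (const key of ["stability", "similarity_boost", "style"]) {
    if (input[key] === undefined) continue; // leave unset values to the API defaults
    const n = Number(input[key]);
    if (Number.isNaN(n)) throw new TypeError(`${key} must be a number`);
    out[key] = clamp01(n);
  }

  if (input.use_speaker_boost !== undefined) {
    out.use_speaker_boost = Boolean(input.use_speaker_boost);
  }
  return out;
}

// Usage: pass the result as voice_settings in the TTS request body.
// normalizeVoiceSettings({ stability: 1.4, similarity_boost: 0.75 })
```

Settings that are omitted stay omitted, so the service can apply whatever defaults it uses.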

Available Models

| Model | Languages | Quality | Speed |
| --- | --- | --- | --- |
| eleven_multilingual_v2 | 29 | Highest | Medium |
| eleven_monolingual_v1 | English | High | Fast |
| eleven_turbo_v2 | English | Good | Fastest |
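The trade-offs in the table can be turned into a simple selection rule. This is an illustrative sketch only: `pickModel` is a hypothetical helper, the language tags (such as "en-US") are an assumption of this example, and it does not verify that a non-English language is among the 29 that eleven_multilingual_v2 supports.

```javascript
// Hypothetical model picker based on the comparison table above.
function pickModel({ language = "en", prioritizeSpeed = false } = {}) {
  const isEnglish = language.toLowerCase().startsWith("en");
  // Only the multilingual model covers non-English text.
  if (!isEnglish) return "eleven_multilingual_v2";
  // For English, trade the highest quality for the fastest model on request.
  return prioritizeSpeed ? "eleven_turbo_v2" : "eleven_multilingual_v2";
}

// Usage:
// pickModel({ language: "en-US", prioritizeSpeed: true })
```

eleven_monolingual_v1 sits between the two English options on both quality and speed, so it may be worth adding as a middle tier.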

Pre-built Voices

elizaOS Cloud provides pre-built voices:

curl -X GET "https://cloud.milady.ai/api/elevenlabs/voices" \
  -H "Authorization: Bearer YOUR_API_KEY"

{
  "voices": [
    {
      "voice_id": "21m00Tcm4TlvDq8ikWAM",
      "name": "Rachel",
      "labels": { "accent": "american", "age": "young" },
      "preview_url": "https://..."
    },
    {
      "voice_id": "AZnzlk1XvdvUeBnXmlld",
      "name": "Domi",
      "labels": { "accent": "american", "age": "young" }
    }
  ]
}
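Once the list is fetched, the `labels` object makes it easy to narrow down candidates. The sketch below filters a parsed response of the shape shown above. `findVoices` is a hypothetical helper, and only the `accent` and `age` labels appear in the example response, so other label keys are not guaranteed to exist.

```javascript
// Hypothetical helper: select voices whose labels match every requested value.
function findVoices(listResponse, wantedLabels) {
  return listResponse.voices.filter((voice) =>
    Object.entries(wantedLabels).every(
      ([key, value]) => voice.labels && voice.labels[key] === value
    )
  );
}

// Usage, with `body` being the parsed JSON from GET /api/elevenlabs/voices:
// const matches = findVoices(body, { accent: "american", age: "young" });
// const voiceId = matches.length > 0 ? matches[0].voice_id : undefined;
```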

Voice Management

Get Voice Details

curl -X GET "https://cloud.milady.ai/api/elevenlabs/voices/voice_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"

Delete Voice

curl -X DELETE "https://cloud.milady.ai/api/elevenlabs/voices/voice_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"

Check Clone Status

curl -X GET "https://cloud.milady.ai/api/elevenlabs/voices/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY"
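Because cloning is processed asynchronously, a client will usually poll this endpoint until the job finishes. The response format of the jobs endpoint is not shown in this document, so the sketch below takes both the fetch function and the "done" test as parameters instead of guessing field names. `pollJobs` is a hypothetical helper, and the `jobs` and `status` fields in the usage comment are assumptions to be checked against a real response.

```javascript
// Hypothetical polling loop: call fetchJobs until isDone(result) is true.
async function pollJobs(fetchJobs, isDone, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchJobs();
    if (isDone(result)) return result;
    // Wait between checks, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`clone still pending after ${maxAttempts} checks`);
}

// Usage (inside an async function):
// const fetchJobs = () =>
//   fetch("https://cloud.milady.ai/api/elevenlabs/voices/jobs", {
//     headers: { Authorization: "Bearer YOUR_API_KEY" },
//   }).then((res) => res.json());
// // ASSUMPTION: "jobs" and "status" are placeholders, not documented fields.
// const isDone = (body) => (body.jobs || []).every((job) => job.status !== "processing");
// const finalState = await pollJobs(fetchJobs, isDone);
```

Capping the number of attempts keeps a stuck job from polling forever.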

Agent Integration

Use cloned voices with your agents:

{
  "name": "Voice Assistant",
  "bio": ["Helpful AI assistant with custom voice"],
  "settings": {
    "voice": {
      "provider": "elevenlabs",
      "voiceId": "voice_abc123",
      "model": "eleven_multilingual_v2"
    }
  }
}

Speech-to-Text

Convert audio to text:

curl -X POST "https://cloud.milady.ai/api/elevenlabs/stt" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@audio.mp3"

{
  "text": "This is the transcribed text from the audio.",
  "confidence": 0.95,
  "words": [
    { "word": "This", "start": 0.0, "end": 0.2, "confidence": 0.98 },
    { "word": "is", "start": 0.2, "end": 0.3, "confidence": 0.99 }
  ]
}
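The per-word `confidence` values in the response can be used to flag parts of a transcript for manual review. This is a small sketch against the response shape shown above. `lowConfidenceWords` is a hypothetical helper, and the default threshold of 0.9 is an arbitrary choice for illustration.

```javascript
// Hypothetical helper: list words whose confidence falls below a threshold.
function lowConfidenceWords(transcript, threshold = 0.9) {
  return (transcript.words || [])
    .filter((w) => w.confidence < threshold)
    .map((w) => ({ word: w.word, start: w.start, end: w.end }));
}

// Usage, with `result` being the parsed JSON from POST /api/elevenlabs/stt:
// for (const w of lowConfidenceWords(result)) {
//   console.log(`review "${w.word}" at ${w.start}s-${w.end}s`);
// }
```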

Pricing

See Billing & Credits for current pricing. Voice cloning costs 5 credits; TTS and STT are usage-based.

Monitor your voice usage in the billing dashboard.
