AACFlow

Text-to-Speech

Convert text to speech using AI voices

Usage Instructions

Generate natural-sounding speech from text using state-of-the-art AI voices from OpenAI, Deepgram, ElevenLabs, Cartesia, Google Cloud, Azure, and PlayHT. Supports multiple voices, languages, and audio formats.

Tools

tts_openai

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| model | string | No | OpenAI TTS model identifier (e.g., "tts-1", "tts-1-hd", "gpt-4o-mini-tts") |
| voice | string | No | OpenAI voice identifier (e.g., "alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer") |
| responseFormat | string | No | No description |
| speed | number | No | Speech speed multiplier from 0.25 to 4.0 (e.g., 0.5 for slower, 1.0 for normal, 2.0 for faster) |
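
As a minimal sketch of how these inputs might be assembled, the helper below builds a tts_openai parameter dictionary, leaves out optional fields that were not set, and rejects a speed outside the documented 0.25 to 4.0 range. The function name `build_openai_tts_input` and the use of a plain dictionary are assumptions for illustration; this page does not describe how AACFlow receives tool inputs.

```python
from typing import Any, Dict, Optional


def build_openai_tts_input(
    text: str,
    api_key: str,
    model: Optional[str] = None,
    voice: Optional[str] = None,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble tts_openai inputs (hypothetical helper, not part of AACFlow)."""
    if not text:
        raise ValueError("text is required")
    if not api_key:
        raise ValueError("apiKey is required")
    # The table documents speed as a multiplier from 0.25 to 4.0.
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    params = {
        "text": text,
        "apiKey": api_key,
        "model": model,
        "voice": voice,
        "responseFormat": response_format,
        "speed": speed,
    }
    # Drop optional fields the caller did not set.
    return {key: value for key, value in params.items() if value is not None}
```

Calling it with only `text`, `api_key`, and `voice="alloy"` would yield a dictionary with just those three keys, leaving any defaults to the tool.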

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |
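
Every tool on this page lists the same output fields, so one small wrapper could serve all of them. The sketch below is a hypothetical `TtsOutput` dataclass that reads the documented fields from a dictionary. It assumes the result arrives as a dictionary keyed by the parameter names above, and it skips audioFile because the shape of that file object is not specified here.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class TtsOutput:
    """Documented output fields, minus audioFile (hypothetical wrapper)."""

    audio_url: str
    duration: float
    character_count: int
    format: str
    provider: str

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TtsOutput":
        # Keys follow the Parameter column of the Output table.
        return cls(
            audio_url=data["audioUrl"],
            duration=float(data["duration"]),
            character_count=int(data["characterCount"]),
            format=data["format"],
            provider=data["provider"],
        )

    def characters_per_second(self) -> float:
        # Guard against a zero-length clip.
        if self.duration <= 0:
            return 0.0
        return self.character_count / self.duration
```

A derived figure such as `characters_per_second()` can be handy when comparing how fast different voices or speed settings read the same text.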

tts_deepgram

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| model | string | No | Deepgram model/voice identifier (e.g., "aura-asteria-en", "aura-luna-en", "aura-2-luna-en") |
| voice | string | No | Deepgram voice identifier, alternative to model param (e.g., "aura-asteria-en", "aura-orion-en") |
| encoding | string | No | No description |
| sampleRate | number | No | No description |
| bitRate | number | No | No description |
| container | string | No | No description |
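
Because voice is described as an alternative to model, a caller may want to settle on one identifier before invoking the tool. The sketch below is one cautious way to do that. The helper `resolve_deepgram_voice` is hypothetical, and since this page does not say which parameter wins when both are set, the sketch refuses conflicting values rather than guessing a precedence.

```python
from typing import Optional


def resolve_deepgram_voice(
    model: Optional[str] = None,
    voice: Optional[str] = None,
) -> Optional[str]:
    """Pick the single identifier to send (hypothetical helper)."""
    # Precedence between model and voice is undocumented, so conflicting
    # values are treated as a caller error instead of silently choosing one.
    if model and voice and model != voice:
        raise ValueError("model and voice disagree; set only one")
    # Returns None when neither is set, leaving the default to the tool.
    return model or voice
```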

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |

tts_elevenlabs

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| voiceId | string | Yes | ElevenLabs voice identifier (e.g., "21m00Tcm4TlvDq8ikWAM", "AZnzlk1XvdvUeBnXmlld") |
| apiKey | string | Yes | No description |
| modelId | string | No | ElevenLabs model identifier (e.g., "eleven_turbo_v2_5", "eleven_flash_v2_5", "eleven_multilingual_v2") |
| stability | number | No | No description |
| similarityBoost | number | No | No description |
| style | number | No | No description |
| useSpeakerBoost | boolean | No | No description |
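
Unlike most tools on this page, tts_elevenlabs requires a voice identifier in addition to text and apiKey. A pre-flight check along the following lines could catch a missing one before a call is made. The function `missing_required` and the dictionary-based input are assumptions for illustration.

```python
from typing import Any, Dict, List

# Parameters marked "Yes" in the Required column of the tts_elevenlabs table.
ELEVENLABS_REQUIRED = ("text", "voiceId", "apiKey")


def missing_required(params: Dict[str, Any]) -> List[str]:
    """Return required parameter names that are absent or empty."""
    return [name for name in ELEVENLABS_REQUIRED if not params.get(name)]
```

An empty list means all three required parameters are present; the optional voice settings are left unchecked because this page gives no ranges for them.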

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |

tts_cartesia

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| modelId | string | No | Cartesia model identifier (e.g., "sonic", "sonic-2", "sonic-3", "sonic-multilingual") |
| voice | string | No | Cartesia voice identifier or embedding (e.g., "a0e99841-438c-4a64-b679-ae501e7d6091") |
| language | string | No | Language code for speech synthesis (e.g., "en", "es", "fr", "de", "it", "pt") |
| outputFormat | json | No | No description |
| speed | number | No | No description |
| emotion | array | No | Emotion tags for Sonic-3 (e.g., ['positivity:high']) |
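
The emotion parameter takes an array of tag strings. Going only by the single example given, 'positivity:high', each tag appears to pair a name with a level separated by a colon. The sketch below builds such an array from pairs. Both the `emotion_tags` helper and the name:level reading are assumptions drawn from that one example; the accepted names and levels are not listed on this page.

```python
from typing import List, Tuple


def emotion_tags(pairs: List[Tuple[str, str]]) -> List[str]:
    """Join (name, level) pairs into 'name:level' strings.

    The tag format is inferred from the documented example
    'positivity:high' and is an assumption, not a specification.
    """
    tags = []
    for name, level in pairs:
        # A colon inside either part would make the tag ambiguous.
        if not name or not level or ":" in name or ":" in level:
            raise ValueError(f"invalid emotion pair: {(name, level)!r}")
        tags.append(f"{name}:{level}")
    return tags
```

For instance, `emotion_tags([("positivity", "high")])` reproduces the array shown in the table.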

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |

tts_google

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| voiceId | string | No | Google Cloud voice identifier (e.g., "en-US-Neural2-A", "en-US-Wavenet-D", "en-GB-Neural2-B") |
| languageCode | string | Yes | BCP-47 language code for speech synthesis (e.g., "en-US", "es-ES", "fr-FR", "de-DE") |
| gender | string | No | No description |
| audioEncoding | string | No | No description |
| speakingRate | number | No | Speaking rate multiplier from 0.25 to 2.0 (e.g., 0.5 for slower, 1.0 for normal, 1.5 for faster) |
| pitch | number | No | No description |
| volumeGainDb | number | No | No description |
| sampleRateHertz | number | No | No description |
| effectsProfileId | array | No | Effects profile (e.g., ['headphone-class-device']) |
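
For tts_google, three details are easy to miss: languageCode is required, speakingRate tops out at 2.0, and effectsProfileId is an array rather than a single string. The sketch below checks those three points while assembling a subset of the inputs. The helper `build_google_tts_input` is hypothetical, and the parameters without descriptions are left out rather than guessed at.

```python
from typing import Any, Dict, List, Optional


def build_google_tts_input(
    text: str,
    api_key: str,
    language_code: str,
    voice_id: Optional[str] = None,
    speaking_rate: Optional[float] = None,
    effects_profile_id: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble a subset of tts_google inputs (hypothetical helper)."""
    # text, apiKey, and languageCode are the parameters marked required.
    required = (("text", text), ("apiKey", api_key), ("languageCode", language_code))
    for name, value in required:
        if not value:
            raise ValueError(f"{name} is required")
    # speakingRate is documented as a multiplier from 0.25 to 2.0.
    if speaking_rate is not None and not 0.25 <= speaking_rate <= 2.0:
        raise ValueError("speakingRate must be between 0.25 and 2.0")
    # effectsProfileId is typed as an array, so a bare string is rejected.
    if effects_profile_id is not None and not isinstance(effects_profile_id, list):
        raise ValueError("effectsProfileId must be a list")

    params = {
        "text": text,
        "apiKey": api_key,
        "languageCode": language_code,
        "voiceId": voice_id,
        "speakingRate": speaking_rate,
        "effectsProfileId": effects_profile_id,
    }
    # Drop optional fields the caller did not set.
    return {key: value for key, value in params.items() if value is not None}
```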

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |

tts_azure

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| voiceId | string | No | Azure voice identifier (e.g., "en-US-JennyNeural", "en-US-GuyNeural", "en-GB-SoniaNeural") |
| region | string | No | No description |
| outputFormat | string | No | No description |
| rate | string | No | No description |
| pitch | string | No | No description |
| style | string | No | No description |
| styleDegree | number | No | No description |
| role | string | No | No description |
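
One detail worth noticing in this table is that rate and pitch are typed as strings here, whereas the speed-like parameters of the other tools are numbers. A simple type check driven by the Type column could catch a number passed where a string is expected. The sketch below is illustrative only: `AZURE_PARAM_TYPES` and `type_errors` are hypothetical names, and the check covers types, not the values each parameter accepts, which this page does not document.

```python
from typing import Any, Dict, List

# Python types matching the Type column of the tts_azure input table.
AZURE_PARAM_TYPES = {
    "text": str,
    "apiKey": str,
    "voiceId": str,
    "region": str,
    "outputFormat": str,
    "rate": str,
    "pitch": str,
    "style": str,
    "styleDegree": (int, float),
    "role": str,
}


def type_errors(params: Dict[str, Any]) -> List[str]:
    """List parameters that are unknown or have the wrong Python type."""
    errors = []
    for name, value in params.items():
        expected = AZURE_PARAM_TYPES.get(name)
        if expected is None:
            errors.append(f"{name}: unknown parameter")
        # bool is a subclass of int in Python, so exclude it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: wrong type")
    return errors
```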

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |

tts_playht

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to convert to speech (e.g., "Hello, welcome to our service!") |
| apiKey | string | Yes | No description |
| userId | string | Yes | No description |
| voice | string | No | PlayHT voice identifier or manifest URL (e.g., "s3://voice-cloning-zero-shot/...") |
| quality | string | No | No description |
| outputFormat | string | No | No description |
| speed | number | No | Speech speed multiplier from 0.5 to 2.0 (e.g., 0.5 for slower, 1.0 for normal, 1.5 for faster) |
| temperature | number | No | No description |
| voiceGuidance | number | No | No description |
| textGuidance | number | No | No description |
| sampleRate | number | No | No description |
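
The documented speed range differs by tool: 0.25 to 4.0 for tts_openai, 0.25 to 2.0 for the speakingRate of tts_google, and 0.5 to 2.0 for tts_playht. When one workflow switches between providers, a small lookup such as the one sketched below could keep a requested speed within the range of whichever tool is in use. The names `SPEED_RANGES` and `clamp_speed` are hypothetical, and clamping rather than rejecting out-of-range values is a design choice of this sketch, not documented behavior.

```python
from typing import Dict, Tuple

# (parameter name, minimum, maximum), as documented on this page.
SPEED_RANGES: Dict[str, Tuple[str, float, float]] = {
    "tts_openai": ("speed", 0.25, 4.0),
    "tts_google": ("speakingRate", 0.25, 2.0),
    "tts_playht": ("speed", 0.5, 2.0),
}


def clamp_speed(tool: str, value: float) -> float:
    """Clamp a speed multiplier into the documented range for a tool."""
    if tool not in SPEED_RANGES:
        raise KeyError(f"no documented speed range for {tool}")
    _, low, high = SPEED_RANGES[tool]
    return max(low, min(high, value))
```

With these ranges, a requested multiplier of 0.25 is kept as-is for tts_openai but raised to 0.5 for tts_playht.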

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioUrl | string | URL to the generated audio file |
| audioFile | file | Generated audio file object |
| duration | number | Audio duration in seconds |
| characterCount | number | Number of characters processed |
| format | string | Audio format |
| provider | string | TTS provider used |
