Yandex SpeechKit

Synthesize speech and recognize audio via Yandex SpeechKit

Yandex SpeechKit is Yandex Cloud's speech technology platform offering text-to-speech (TTS) synthesis and speech-to-text (STT) recognition for Russian and other languages.

With the Yandex SpeechKit integration in AACFlow, you can:

  • Text-to-Speech (Synthesize): Convert text to audio in various voices and formats
  • Speech-to-Text (Recognize Short): Transcribe short audio files (under 1 minute) to text

This integration enables automated voice response systems, audio content generation, and speech-driven workflow triggers.

Usage Instructions

Integrate Yandex SpeechKit into the workflow to add voice synthesis or audio transcription. Both tools require a Yandex Cloud IAM token, which you can obtain with the Yandex Cloud IAM block in your workflow.

Tools

yandex_speechkit_tts

Text-to-speech synthesis

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| iamToken | string | Yes | Yandex Cloud IAM token |
| text | string | Yes | Text to convert to speech |
| voice | string | No | Voice name (e.g., oksana, alena, filipp) |
| speed | number | No | Speech speed (0.1–3.0, default 1.0) |
| format | string | No | Audio format: ogg_opus, lpcm, mp3 |
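
The constraints on these inputs can be checked before the tool runs. The following is a minimal sketch in Python, assuming you assemble the input as a plain dictionary; the helper name `build_tts_input` is hypothetical and not part of AACFlow. Only the parameter names, the speed range, and the format values come from the table above.

```python
from typing import Optional

# Format values listed in the input table above.
ALLOWED_FORMATS = {"ogg_opus", "lpcm", "mp3"}


def build_tts_input(
    iam_token: str,
    text: str,
    voice: Optional[str] = None,
    speed: Optional[float] = None,
    audio_format: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble and validate yandex_speechkit_tts input."""
    if not iam_token:
        raise ValueError("iamToken is required")
    if not text:
        raise ValueError("text is required")

    payload = {"iamToken": iam_token, "text": text}

    # Optional parameters are only included when explicitly set.
    if voice is not None:
        payload["voice"] = voice
    if speed is not None:
        if not 0.1 <= speed <= 3.0:
            raise ValueError("speed must be between 0.1 and 3.0")
        payload["speed"] = speed
    if audio_format is not None:
        if audio_format not in ALLOWED_FORMATS:
            raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
        payload["format"] = audio_format
    return payload
```

Calling `build_tts_input(token, "Hello", voice="alena", speed=1.2, audio_format="mp3")` returns a dictionary keyed by the parameter names in the table, while an out-of-range speed such as `5.0` raises `ValueError` before any request is made.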

Output

| Parameter | Type | Description |
| --- | --- | --- |
| audioData | string | Base64-encoded audio data |
| mimeType | string | Audio MIME type |
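
Because `audioData` is Base64-encoded, it has to be decoded before it can be played or stored. The sketch below shows one way to do that in Python. The function names are hypothetical, and the MIME-type-to-extension mapping is an assumption: this page does not list the exact `mimeType` values the tool returns, so unknown types fall back to `.bin`.

```python
import base64
from pathlib import Path

# Assumed mapping; the actual mimeType values are not documented on this page.
EXTENSIONS = {"audio/ogg": ".ogg", "audio/mpeg": ".mp3"}


def decode_tts_output(output: dict) -> tuple:
    """Return (raw audio bytes, file extension) for a TTS output dictionary."""
    # validate=True rejects input containing non-Base64 characters.
    audio = base64.b64decode(output["audioData"], validate=True)
    extension = EXTENSIONS.get(output.get("mimeType", ""), ".bin")
    return audio, extension


def save_tts_output(output: dict, stem: str) -> Path:
    """Write the decoded audio to '<stem><extension>' and return the path."""
    audio, extension = decode_tts_output(output)
    path = Path(stem + extension)
    path.write_bytes(audio)
    return path
```

For example, `save_tts_output(result, "greeting")` would write `greeting.ogg` when `mimeType` is `audio/ogg` under the assumed mapping.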

yandex_speechkit_stt

Speech-to-text recognition (short audio)

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| iamToken | string | Yes | Yandex Cloud IAM token |
| audioData | string | Yes | Base64-encoded audio data |
| language | string | No | Language code (ru-RU, en-US) |
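
The `audioData` input expects a Base64 string rather than raw bytes. As a rough sketch, assuming the input is assembled as a plain dictionary, the encoding step could look like this in Python; `build_stt_input` is a hypothetical helper name, not an AACFlow function.

```python
import base64
from typing import Optional


def build_stt_input(
    iam_token: str,
    audio_bytes: bytes,
    language: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble yandex_speechkit_stt input from raw audio."""
    if not iam_token:
        raise ValueError("iamToken is required")
    if not audio_bytes:
        raise ValueError("audioData is required")

    payload = {
        "iamToken": iam_token,
        # Base64-encode the raw bytes and convert them to a plain ASCII string.
        "audioData": base64.b64encode(audio_bytes).decode("ascii"),
    }
    if language is not None:
        payload["language"] = language
    return payload
```

A typical call reads the file first, for example `build_stt_input(token, Path("question.ogg").read_bytes(), "ru-RU")`, where `question.ogg` is a placeholder file name. The tool is intended for short recordings (under 1 minute), so longer audio should be split or trimmed beforehand.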

Output

| Parameter | Type | Description |
| --- | --- | --- |
| text | string | Recognized text |
| confidence | number | Recognition confidence score |
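
A downstream step can use `confidence` to decide whether to trust the transcript. The sketch below is one possible approach in Python. It assumes the score lies between 0 and 1, which this page does not state, and the function name and the `0.8` threshold are illustrative choices only.

```python
from typing import Optional


def accept_transcript(output: dict, min_confidence: float = 0.8) -> Optional[str]:
    """Return the recognized text, or None if it is empty or low-confidence.

    Assumes confidence is a 0..1 score; adjust the threshold if the actual
    scale differs.
    """
    text = output.get("text", "").strip()
    if not text:
        return None

    confidence = output.get("confidence")
    # A missing score is treated as acceptable; only explicit low scores are rejected.
    if confidence is not None and confidence < min_confidence:
        return None
    return text
```

Returning `None` lets the workflow branch explicitly, for example by asking the speaker to repeat the phrase instead of acting on an unreliable transcript.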
