Speech-to-Text

Convert speech to text using AI

Usage Instructions

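Transcribe audio and video files to text using one of five providers: Whisper, Deepgram, ElevenLabs, AssemblyAI, or Gemini. Every tool accepts a language code and a timestamps option; speaker diarization is available in `stt_deepgram` and `stt_assemblyai`.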
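
As a quick orientation, the optional parameters that differ between the tools can be summarized in a lookup table. The sketch below is built only from the Input tables on this page; the names `EXTRA_PARAMS` and `tools_supporting` are illustrative and are not part of the platform.

```python
# Optional parameters each tool lists beyond the shared set
# (provider, apiKey, model, audioFile, audioFileReference, audioUrl,
# language, timestamps), taken from the Input tables on this page.
EXTRA_PARAMS = {
    "stt_whisper": {"translateToEnglish", "prompt", "temperature", "responseFormat"},
    "stt_deepgram": {"diarization"},
    "stt_elevenlabs": set(),
    "stt_assemblyai": {
        "diarization", "sentiment", "entityDetection",
        "piiRedaction", "summarization",
    },
    "stt_gemini": set(),
}


def tools_supporting(param: str) -> list[str]:
    """Return the tools whose Input table lists `param`, sorted by name."""
    return sorted(tool for tool, extras in EXTRA_PARAMS.items() if param in extras)


print(tools_supporting("diarization"))  # ['stt_assemblyai', 'stt_deepgram']
```

A lookup like this can help pick a tool before filling in its inputs, for example when a workflow needs PII redaction or an SRT file.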

Tools

`stt_whisper`

Input

Parameter	Type	Required	Description
`provider`	string	Yes	No description
`apiKey`	string	Yes	No description
`model`	string	No	No description
`audioFile`	file	No	No description
`audioFileReference`	file	No	No description
`audioUrl`	string	No	No description
`language`	string	No	Language code (e.g., "en", "es", "fr") or "auto" for auto-detection
`timestamps`	string	No	No description
`translateToEnglish`	boolean	No	No description
`prompt`	string	No	Optional text to guide the model's style or continue a previous audio segment. Helps with proper nouns and context.
`temperature`	number	No	Sampling temperature between 0 and 1. Higher values make output more random, lower values more focused and deterministic.
`responseFormat`	string	No	Output format for the transcription (e.g., "json", "text", "srt", "verbose_json", "vtt")
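
To illustrate how the `stt_whisper` inputs fit together, here is a minimal sketch that assembles an input dictionary and checks the two documented constraints (the `temperature` range and the example `responseFormat` values). The helper name `build_whisper_input`, the `"whisper"` value for `provider`, the placeholder API key, and the example URL are all assumptions; the table above does not describe them.

```python
import json

# Formats named as examples in the table above; the full set may be larger.
KNOWN_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_whisper_input(api_key, audio_url, *, language="auto", prompt=None,
                        temperature=None, response_format=None):
    """Assemble a stt_whisper input dict, checking the documented ranges."""
    params = {
        "provider": "whisper",  # assumed value; the table gives no description
        "apiKey": api_key,
        "audioUrl": audio_url,
        "language": language,
    }
    if prompt is not None:
        params["prompt"] = prompt
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        params["temperature"] = temperature
    if response_format is not None:
        if response_format not in KNOWN_FORMATS:
            raise ValueError(f"unrecognized responseFormat: {response_format!r}")
        params["responseFormat"] = response_format
    return params


payload = build_whisper_input(
    "YOUR_API_KEY",                      # placeholder, not a real key
    "https://example.com/meeting.mp3",   # hypothetical audio URL
    language="en",
    prompt="Speakers mention Kubernetes and PostgreSQL.",
    temperature=0.2,
    response_format="srt",
)
print(json.dumps(payload, indent=2))
```

A low `temperature` combined with a `prompt` that lists expected proper nouns is a reasonable starting point when the transcript needs to be reproducible.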

Output

This tool does not produce any outputs.

`stt_deepgram`

Input

Parameter	Type	Required	Description
`provider`	string	Yes	No description
`apiKey`	string	Yes	No description
`model`	string	No	No description
`audioFile`	file	No	No description
`audioFileReference`	file	No	No description
`audioUrl`	string	No	No description
`language`	string	No	Language code (e.g., "en", "es", "fr") or "auto" for auto-detection
`timestamps`	string	No	No description
`diarization`	boolean	No	No description
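
The `stt_deepgram` inputs can also be modeled as a small typed record. This is only a sketch: the `DeepgramInput` class and its `to_params` method are hypothetical, the `"deepgram"` value for `provider` is an assumption, and the two file-typed inputs (`audioFile`, `audioFileReference`) and `timestamps` are left out because the table does not describe their values.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DeepgramInput:
    """Hypothetical container for the string and boolean stt_deepgram inputs."""
    apiKey: str
    provider: str = "deepgram"  # assumed value; the table gives no description
    model: Optional[str] = None
    audioUrl: Optional[str] = None
    language: str = "auto"
    diarization: bool = False

    def to_params(self) -> dict:
        """Drop unset optional fields so only meaningful keys remain."""
        return {key: value for key, value in asdict(self).items() if value is not None}


params = DeepgramInput(
    apiKey="YOUR_API_KEY",                   # placeholder, not a real key
    audioUrl="https://example.com/call.wav", # hypothetical audio URL
    diarization=True,
).to_params()
print(params)
```

Keeping `diarization` as an explicit boolean with a `False` default makes it obvious in the resulting dictionary whether speaker labels were requested.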

Output

This tool does not produce any outputs.

`stt_elevenlabs`

Input

Parameter	Type	Required	Description
`provider`	string	Yes	No description
`apiKey`	string	Yes	No description
`model`	string	No	No description
`audioFile`	file	No	No description
`audioFileReference`	file	No	No description
`audioUrl`	string	No	No description
`language`	string	No	Language code (e.g., "en", "es", "fr") or "auto" for auto-detection
`timestamps`	string	No	No description

Output

This tool does not produce any outputs.

`stt_assemblyai`

Input

Parameter	Type	Required	Description
`provider`	string	Yes	No description
`apiKey`	string	Yes	No description
`model`	string	No	No description
`audioFile`	file	No	No description
`audioFileReference`	file	No	No description
`audioUrl`	string	No	No description
`language`	string	No	Language code (e.g., "en", "es", "fr") or "auto" for auto-detection
`timestamps`	string	No	No description
`diarization`	boolean	No	No description
`sentiment`	boolean	No	No description
`entityDetection`	boolean	No	No description
`piiRedaction`	boolean	No	No description
`summarization`	boolean	No	No description
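
`stt_assemblyai` adds five boolean switches to the shared inputs. One way to set them without typos is a helper that only accepts the flag names listed in the table above. The helper name `assemblyai_flags` is illustrative, and whether the tool treats an omitted flag the same as `False` is not stated on this page, so the sketch sets every flag explicitly.

```python
# The five boolean inputs listed in the stt_assemblyai table.
ASSEMBLYAI_FLAGS = (
    "diarization",
    "sentiment",
    "entityDetection",
    "piiRedaction",
    "summarization",
)


def assemblyai_flags(*enabled: str) -> dict[str, bool]:
    """Return every flag with an explicit value; reject unknown names."""
    unknown = set(enabled) - set(ASSEMBLYAI_FLAGS)
    if unknown:
        raise ValueError(f"not a stt_assemblyai flag: {sorted(unknown)}")
    return {flag: flag in enabled for flag in ASSEMBLYAI_FLAGS}


flags = assemblyai_flags("diarization", "piiRedaction")
print(flags)
```

The returned dictionary can be merged into the rest of the inputs (`provider`, `apiKey`, an audio source, and so on) before the tool is invoked.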

Output

This tool does not produce any outputs.

`stt_gemini`

Input

Parameter	Type	Required	Description
`provider`	string	Yes	No description
`apiKey`	string	Yes	No description
`model`	string	No	No description
`audioFile`	file	No	No description
`audioFileReference`	file	No	No description
`audioUrl`	string	No	No description
`language`	string	No	Language code (e.g., "en", "es", "fr") or "auto" for auto-detection
`timestamps`	string	No	No description

Output

This tool does not produce any outputs.
