Audio Tools

Enable your agents to speak and listen with text-to-speech (TTS) and speech-to-text (STT) tools.

Quick Start

Bot CLI

# Enable TTS tool for your bot
praisonai bot telegram --token $TOKEN --tts

# Enable both TTS and STT
praisonai bot telegram --token $TOKEN --tts --stt

# Auto-convert all responses to speech
praisonai bot telegram --token $TOKEN --auto-tts

Python

from praisonai.tools.audio import create_tts_tool, create_stt_tool
from praisonaiagents import Agent

# Create agent with audio tools
agent = Agent(
    name="voice-assistant",
    instructions="You can speak and listen.",
    tools=[create_tts_tool(), create_stt_tool()]
)

# Agent can now use tts() and stt() tools
response = agent.chat("Say hello in audio format")

TTS Tool

Convert text to speech and get an audio file.

Usage

from praisonai.tools.audio import tts_tool

result = tts_tool("Hello world!", voice="alloy")

if result["success"]:
    print(result["audio_path"])   # /tmp/tts_abc123.mp3
    print(result["media_line"])   # MEDIA:/tmp/tts_abc123.mp3
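
The result dictionary can be consumed by a small helper. The sketch below assumes the `MEDIA:` prefix shown in the example output is a plain string prefix and that a failed call only guarantees a falsy `success` key; `media_path` is a hypothetical helper name, not part of the library.

```python
from pathlib import Path
from typing import Optional

MEDIA_PREFIX = "MEDIA:"  # prefix as it appears in the example output above


def media_path(result: dict) -> Optional[Path]:
    """Return the audio path from a tts_tool-style result, or None on failure."""
    if not result.get("success"):
        return None
    line = result.get("media_line", "")
    if line.startswith(MEDIA_PREFIX):
        return Path(line[len(MEDIA_PREFIX):])
    # Fall back to the plain audio_path key if no media line is present.
    path = result.get("audio_path")
    return Path(path) if path else None


# Hand-written result dict for illustration; no API call is made here.
sample = {
    "success": True,
    "audio_path": "/tmp/tts_abc123.mp3",
    "media_line": "MEDIA:/tmp/tts_abc123.mp3",
}
print(media_path(sample))
```

Returning `None` on failure keeps the caller's check to a single `if`, which mirrors the `result["success"]` test in the usage example.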

Options

Parameter	Type	Default	Description
`text`	`str`	required	Text to convert
`voice`	`str`	`"alloy"`	Voice: alloy, echo, fable, onyx, nova, shimmer
`model`	`str`	`"openai/tts-1"`	TTS model
`output_format`	`str`	`"mp3"`	Format: mp3, opus, aac, flac, wav
`output_dir`	`str`	temp dir	Directory to save audio
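
As a rough illustration of the options above, the following sketch checks values against the table before they are passed on. `build_tts_kwargs` is a hypothetical helper; whether the library validates these values itself is not stated here, and the voice check is applied only to `openai/` models because the listed voices are documented for OpenAI TTS.

```python
OPENAI_TTS_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
TTS_FORMATS = {"mp3", "opus", "aac", "flac", "wav"}


def build_tts_kwargs(voice="alloy", model="openai/tts-1",
                     output_format="mp3", output_dir=None):
    """Validate options against the table and return keyword arguments."""
    if output_format not in TTS_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    # Voices for other providers are not listed, so they are not checked.
    if model.startswith("openai/") and voice not in OPENAI_TTS_VOICES:
        raise ValueError(f"unknown OpenAI voice: {voice!r}")
    kwargs = {"voice": voice, "model": model, "output_format": output_format}
    if output_dir is not None:
        kwargs["output_dir"] = output_dir  # omit to use the temp dir default
    return kwargs


print(build_tts_kwargs(voice="nova", output_format="wav"))
```

The returned dictionary is intended to be unpacked into the call, for example `tts_tool("Hello world!", **build_tts_kwargs(voice="nova"))`.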

STT Tool

Transcribe audio files to text.

Usage

from praisonai.tools.audio import stt_tool

result = stt_tool("recording.mp3", language="en")

if result["success"]:
    print(result["text"])  # Transcribed text
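
For several recordings, the same success check can be applied in a loop. This is a minimal sketch: `transcribe_all` and `fake_stt` are hypothetical names, and the transcriber is passed in as a parameter so the example runs offline. In real use, `stt_tool` would presumably be passed instead of the stand-in.

```python
from typing import Callable, Dict, Iterable


def transcribe_all(paths: Iterable[str],
                   stt: Callable[..., dict],
                   language: str = "en") -> Dict[str, str]:
    """Run an stt_tool-style callable over several files, keeping successes."""
    transcripts = {}
    for path in paths:
        result = stt(path, language=language)
        if result.get("success"):
            transcripts[path] = result["text"]
    return transcripts


def fake_stt(audio_path, language="en"):
    """Stand-in for stt_tool that never touches the network."""
    if audio_path.endswith(".mp3"):
        return {"success": True, "text": f"transcript of {audio_path}"}
    return {"success": False}  # assumed failure shape: only "success" is set


print(transcribe_all(["a.mp3", "notes.txt"], fake_stt))
```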

Options

Parameter	Type	Default	Description
`audio_path`	`str`	required	Path to audio file
`language`	`str`	auto	Language code (en, es, fr, etc.)
`model`	`str`	`"openai/whisper-1"`	STT model

Bot CLI Options

Enable audio tools when starting bots:

Option	Description
`--tts`	Enable TTS tool
`--tts-voice VOICE`	Voice (alloy, echo, fable, onyx, nova, shimmer)
`--tts-model MODEL`	TTS model (default: openai/tts-1)
`--auto-tts`	Auto-convert all responses to speech
`--stt`	Enable STT tool
`--stt-model MODEL`	STT model (default: openai/whisper-1)

Examples

# Basic TTS
praisonai bot telegram --token $TOKEN --tts

# Custom voice
praisonai bot telegram --token $TOKEN --tts --tts-voice nova

# Auto-TTS mode (all responses become audio)
praisonai bot telegram --token $TOKEN --auto-tts

# Full audio capabilities
praisonai bot telegram --token $TOKEN --tts --stt --auto-tts
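
When the bot is launched from a script, the flags above can be assembled programmatically. The sketch below uses only the options listed in the table; `bot_command` is a hypothetical helper, and `telegram` is the only platform shown on this page, so other values for `platform` are an assumption.

```python
from typing import List, Optional


def bot_command(platform: str = "telegram", token: str = "$TOKEN",
                tts: bool = False, stt: bool = False, auto_tts: bool = False,
                tts_voice: Optional[str] = None,
                tts_model: Optional[str] = None,
                stt_model: Optional[str] = None) -> List[str]:
    """Build the argument list for a `praisonai bot` invocation."""
    argv = ["praisonai", "bot", platform, "--token", token]
    if tts:
        argv.append("--tts")
    if tts_voice:
        argv += ["--tts-voice", tts_voice]
    if tts_model:
        argv += ["--tts-model", tts_model]
    if stt:
        argv.append("--stt")
    if stt_model:
        argv += ["--stt-model", stt_model]
    if auto_tts:
        argv.append("--auto-tts")
    return argv


# Joined with spaces for display only.
print(" ".join(bot_command(tts=True, tts_voice="nova")))
```

To actually run the command, pass the list to `subprocess.run` with a real token value rather than the `$TOKEN` placeholder, since a list argument is not expanded by a shell.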

Supported Providers

Audio tools use the core AudioAgent, which supports multiple providers via LiteLLM:

TTS Providers

Provider	Model	Notes
OpenAI	`openai/tts-1`, `openai/tts-1-hd`	Default, high quality
Azure	`azure/tts-1`	Enterprise
ElevenLabs	`elevenlabs/eleven_multilingual_v2`	Premium voices
Gemini	`gemini/gemini-2.5-flash-preview-tts`	Google

STT Providers

Provider	Model	Notes
OpenAI	`openai/whisper-1`	Default, accurate
Azure	`azure/whisper`	Enterprise
Groq	`groq/whisper-large-v3`	Fast
Deepgram	`deepgram/nova-2`	Real-time
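
Every model identifier in the tables above follows a `provider/model` pattern. As a small illustration, the sketch below splits such an identifier; `split_model_id` is a hypothetical helper and does not reflect how LiteLLM parses model names internally.

```python
from typing import Tuple


def split_model_id(model: str) -> Tuple[str, str]:
    """Split 'provider/model' into its two parts at the first slash."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


for model_id in ("openai/tts-1-hd", "groq/whisper-large-v3"):
    provider, name = split_model_id(model_id)
    print(f"{provider}: {name}")
```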

Voice Options

Available voices for OpenAI TTS:

Voice	Description
`alloy`	Neutral, balanced (default)
`echo`	Warm, conversational
`fable`	Expressive, storytelling
`onyx`	Deep, authoritative
`nova`	Friendly, upbeat
`shimmer`	Clear, professional

Architecture

Audio tools live in the wrapper layer (`praisonai`), not the core SDK. They wrap the core AudioAgent so that agents and bots can call it as an ordinary tool.
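
The wrapping idea can be sketched generically. Everything below is an assumption made for illustration: `StubAudioAgent` is a stand-in and not the real AudioAgent class, its `speech` method and the `error` key are hypothetical, and `make_tts_tool` is not the library's `create_tts_tool`. The sketch only shows how a thin wrapper can turn a lower-level call into the result dictionary described in the TTS Tool section.

```python
import os
import tempfile
import uuid


class StubAudioAgent:
    """Hypothetical stand-in for a core audio agent; writes placeholder bytes."""

    def speech(self, text: str, output: str) -> str:
        with open(output, "wb") as fh:
            fh.write(text.encode("utf-8"))  # not real audio data
        return output


def make_tts_tool(agent, output_dir=None):
    """Wrap an agent-like object in a function that returns a result dict."""
    out_dir = output_dir or tempfile.gettempdir()

    def tts(text: str, output_format: str = "mp3") -> dict:
        name = f"tts_{uuid.uuid4().hex[:6]}.{output_format}"
        path = os.path.join(out_dir, name)
        try:
            agent.speech(text, output=path)
        except Exception as exc:
            # Report failures as data instead of raising (assumed shape).
            return {"success": False, "error": str(exc)}
        return {"success": True, "audio_path": path,
                "media_line": f"MEDIA:{path}"}

    return tts


tool = make_tts_tool(StubAudioAgent())
result = tool("Hello world!")
print(result["success"], result["media_line"])
os.remove(result["audio_path"])  # clean up the placeholder file
```

Keeping the wrapper this thin means the lower layer stays free of tool-specific conventions such as the result dictionary.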

Related

Bot CLI: Full bot CLI reference

AudioAgent: Core AudioAgent class
