AudioAgent

Defined in the Audio Agent module.

A specialized agent for audio processing using AI models. Provides:

Text-to-Speech (TTS): Convert text to spoken audio
Speech-to-Text (STT): Transcribe audio to text

TTS Providers:

OpenAI: openai/tts-1, openai/tts-1-hd
Azure: azure/tts-1
Gemini: gemini/gemini-2.5-flash-preview-tts
Vertex AI: vertex_ai/gemini-2.5-flash-preview-tts
ElevenLabs: elevenlabs/eleven_multilingual_v2
MiniMax: minimax/speech-01

STT Providers:

OpenAI: openai/whisper-1
Azure: azure/whisper
Groq: groq/whisper-large-v3
Deepgram: deepgram/nova-2
Gemini: gemini/gemini-2.0-flash
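
The identifiers above all follow a `provider/model` pattern. As a minimal sketch, a caller could split an identifier and check whether it is listed here under TTS or STT before constructing an agent. The helpers `split_model_id` and `capability` are hypothetical names for illustration and are not part of praisonaiagents; the two sets contain only the identifiers listed on this page.

```python
# Illustrative helpers only; these are NOT part of praisonaiagents.
# The sets below copy the identifiers listed in this reference.
TTS_MODELS = {
    "openai/tts-1",
    "openai/tts-1-hd",
    "azure/tts-1",
    "gemini/gemini-2.5-flash-preview-tts",
    "vertex_ai/gemini-2.5-flash-preview-tts",
    "elevenlabs/eleven_multilingual_v2",
    "minimax/speech-01",
}
STT_MODELS = {
    "openai/whisper-1",
    "azure/whisper",
    "groq/whisper-large-v3",
    "deepgram/nova-2",
    "gemini/gemini-2.0-flash",
}


def split_model_id(model_id):
    """Split 'provider/model' into (provider, model); raise if malformed."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


def capability(model_id):
    """Return 'tts', 'stt', or None if the identifier is not listed here."""
    if model_id in TTS_MODELS:
        return "tts"
    if model_id in STT_MODELS:
        return "stt"
    return None
```

A check like this is only as current as the hard-coded sets, so an identifier that returns `None` may still be valid in the library.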

Constructor

name

Optional

No description available.

instructions

Optional

No description available.

llm

Optional

No description available.

model

Optional

No description available.

base_url

Optional

No description available.

api_key

Optional

No description available.

audio

Optional

No description available.

verbose

Union

default:"True"

No description available.

Methods

console()

Lazily initialize Rich Console.

litellm()

Lazy load litellm module when needed.

speech()

Convert text to speech.

aspeech()

Async version of speech().

transcribe()

Transcribe audio to text.

atranscribe()

Async version of transcribe().

say()

Quick TTS - convert text and save to file.

asay()

Async version of say().

listen()

Quick STT - transcribe audio file.

alisten()

Async version of listen().
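
The async variants make it possible to run several requests concurrently. The sketch below synthesizes a batch of lines with `asyncio.gather`. It assumes that `aspeech()` accepts the same `(text, output=...)` arguments that the Usage example passes to `speech()`, which this page does not confirm. `narrate_all` and its `prefix` parameter are hypothetical names.

```python
import asyncio


async def narrate_all(agent, lines, prefix="line"):
    """Synthesize every line concurrently and return the output paths in order.

    `agent` is any object exposing an async `aspeech` method.
    Assumption: aspeech(text, output=path) mirrors speech(text, output=path).
    """
    paths = [f"{prefix}_{i}.mp3" for i in range(len(lines))]
    await asyncio.gather(
        *(agent.aspeech(text, output=path) for text, path in zip(lines, paths))
    )
    return paths
```

With a real agent this might be driven as `asyncio.run(narrate_all(AudioAgent(llm="openai/tts-1"), ["Hello", "Goodbye"]))`, provided the assumed signature holds.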

Usage

    from praisonaiagents import AudioAgent

    # Text-to-Speech
    agent = AudioAgent(llm="openai/tts-1")
    agent.speech("Hello world!", output="hello.mp3")

    # Speech-to-Text
    agent = AudioAgent(llm="openai/whisper-1")
    text = agent.transcribe("audio.mp3")
    print(text)
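
The two calls above can be chained into a round trip: synthesize text to a file, then transcribe that file. The helper below is a sketch that uses only the `speech(text, output=...)` and `transcribe(path)` calls shown in the example. `round_trip` and its default path are hypothetical, and the type of the value returned by `transcribe()` is not specified on this page, so it is passed through unchanged.

```python
def round_trip(tts_agent, stt_agent, text, path="round_trip.mp3"):
    """Speak `text` into `path` with one agent, then transcribe it with another.

    Both agents are duck-typed: `tts_agent` needs speech(text, output=...),
    and `stt_agent` needs transcribe(path), as in the Usage example.
    """
    tts_agent.speech(text, output=path)
    return stt_agent.transcribe(path)
```

Two separate agents are used because the example configures one model per agent (`openai/tts-1` for speech, `openai/whisper-1` for transcription).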

Source

View on GitHub

praisonaiagents/agent/audio_agent.py at line 67

Related Documentation

Agents Concept

Single Agent Guide

Multi-Agent Guide

Agent Configuration

Auto Agents
