Audio Overview - PraisonAI

Building a chatbot that transcribes voice notes on Telegram, Slack, or WhatsApp? See Voice Notes (Speech-to-Text), the gateway feature that wraps AudioAgent for you.

Text-to-Speech

Basic

from praisonaiagents import AudioAgent

agent = AudioAgent(llm="openai/tts-1")
agent.say("Hello!", output="hello.mp3")

Advanced

from praisonaiagents import AudioAgent

agent = AudioAgent(llm="openai/tts-1-hd")
agent.speech("Hello!", voice="nova", speed=1.2, output="hello.mp3")

# Voices: alloy, echo, fable, onyx, nova, shimmer
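The advanced call accepts a voice name and a speed, so it can help to validate both before making a request. The following is a minimal sketch, not part of PraisonAI: `build_speech_kwargs` and `speak` are hypothetical helpers, the voice list is copied from the comment above, and the 0.25 to 4.0 speed range is an assumption based on the OpenAI speech endpoint.

```python
# Voice names taken from the list in this page.
OPENAI_VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")

# Assumed speed bounds (OpenAI speech endpoint); adjust for other providers.
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_kwargs(voice="alloy", speed=1.0, output="speech.mp3"):
    """Validate TTS options and return them as keyword arguments."""
    if voice not in OPENAI_VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {OPENAI_VOICES}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {"voice": voice, "speed": speed, "output": output}


def speak(text, model="openai/tts-1-hd", **options):
    """Hypothetical wrapper: validate options, then call AudioAgent.speech."""
    # Imported lazily so the validation helper works without the package installed.
    from praisonaiagents import AudioAgent

    agent = AudioAgent(llm=model)
    return agent.speech(text, **build_speech_kwargs(**options))
```

Calling `speak("Hello!", voice="nova", speed=1.2, output="hello.mp3")` would then mirror the advanced example, while a typo such as `voice="nvoa"` fails locally before any request is sent.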

Speech-to-Text

Basic

from praisonaiagents import AudioAgent

agent = AudioAgent(llm="openai/whisper-1")
text = agent.listen("audio.mp3")
print(text)

Advanced

from praisonaiagents import AudioAgent

agent = AudioAgent(llm="groq/whisper-large-v3")  # 10x faster
text = agent.transcribe("audio.mp3", language="en")
print(text)
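When there are several recordings to process, the single-file call above can be wrapped in a loop. This is a rough sketch under stated assumptions: `find_audio_files` and `transcribe_folder` are hypothetical helpers, the list of file suffixes is illustrative rather than a statement of what any provider accepts, and the `transcribe` argument is expected to behave like `agent.transcribe(path, language=...)` as shown above.

```python
from pathlib import Path

# Illustrative set of suffixes; check your provider for the formats it accepts.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".ogg"}


def find_audio_files(folder):
    """Return audio files directly inside `folder`, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
    )


def transcribe_folder(folder, transcribe, language="en"):
    """Map each audio file name to the text returned by `transcribe`."""
    return {
        path.name: transcribe(str(path), language=language)
        for path in find_audio_files(folder)
    }


# Example wiring (requires praisonaiagents and provider credentials):
# from praisonaiagents import AudioAgent
# agent = AudioAgent(llm="openai/whisper-1")
# transcripts = transcribe_folder("voice_notes", agent.transcribe)
```

Passing the transcription function in as an argument keeps the helper independent of any one provider, so swapping `openai/whisper-1` for `groq/whisper-large-v3` only changes how the agent is constructed.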

Providers

OpenAI: TTS + STT
Groq: Fast STT
ElevenLabs: Premium TTS
Deepgram: STT
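The provider and task together determine the model string passed to `AudioAgent(llm=...)`. The sketch below shows one way to keep that choice in a single place. `pick_model` is a hypothetical helper, and the table only contains the model strings that appear in the examples on this page. ElevenLabs and Deepgram are left out because their model identifiers are not shown here.

```python
# (provider, task) -> model string, limited to the strings used on this page.
AUDIO_MODELS = {
    ("openai", "tts"): "openai/tts-1",
    ("openai", "stt"): "openai/whisper-1",
    ("groq", "stt"): "groq/whisper-large-v3",
}


def pick_model(provider, task):
    """Return the model string for a provider/task pair, or raise KeyError."""
    key = (provider.lower(), task.lower())
    if key not in AUDIO_MODELS:
        known = ", ".join(f"{p}/{t}" for p, t in sorted(AUDIO_MODELS))
        raise KeyError(f"no model configured for {provider}/{task}; known: {known}")
    return AUDIO_MODELS[key]


# Example wiring (requires praisonaiagents and provider credentials):
# from praisonaiagents import AudioAgent
# agent = AudioAgent(llm=pick_model("groq", "stt"))
```

Adding another provider is then a one-line change to the table once its model string is known.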
