Voice-Enabled Agents
Give your Agents the ability to speak and listen. Build voice assistants, phone bots, and audio interfaces with OpenAI TTS/Whisper and ElevenLabs.
Voice Agent - Complete Example
import { Agent, createOpenAIVoice } from 'praisonai';
import { readFileSync, writeFileSync } from 'fs';
const voice = createOpenAIVoice();
const agent = new Agent({
name: 'Voice Assistant',
instructions: 'You are a friendly voice assistant. Keep responses concise for natural speech.'
});
// Complete voice interaction loop
async function voiceChat(audioInput: Buffer): Promise<Buffer> {
// 1. Listen: Convert user's speech to text
const userMessage = await voice.listen(audioInput);
console.log('👤 User:', userMessage);
// 2. Think: Agent processes the message
const response = await agent.chat(userMessage);
console.log('🤖 Agent:', response);
// 3. Speak: Convert Agent's response to audio
const audioResponse = await voice.speak(response, { voice: 'nova' });
return audioResponse;
}
// Usage
const userAudio = readFileSync('user-question.mp3');
const agentAudio = await voiceChat(userAudio);
writeFileSync('agent-response.mp3', agentAudio);
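The loop above depends only on `listen`, `chat`, and `speak`, so it can be exercised without network calls by passing in stand-ins. The following is a minimal sketch of that idea, not part of the praisonai API: `SpeechIO`, `Chatter`, `runVoiceTurn`, and `FALLBACK_REPLY` are hypothetical names, and the types are generic over the audio type so that a `Buffer` works in production and plain strings work in tests. It also shows one possible way to handle an empty transcript (silence or noise).

```typescript
// Hypothetical structural types, generic over the audio type A
// (Buffer in production, plain strings in the stand-ins below).
interface SpeechIO<A> {
  listen(audio: A): Promise<string>;
  speak(text: string, options?: { voice?: string }): Promise<A>;
}
interface Chatter {
  chat(message: string): Promise<string>;
}

const FALLBACK_REPLY = "Sorry, I didn't catch that. Could you say it again?";

// One listen -> think -> speak turn with its dependencies injected.
async function runVoiceTurn<A>(
  voice: SpeechIO<A>,
  agent: Chatter,
  audioInput: A,
  voiceId = 'nova'
): Promise<{ transcript: string; reply: string; audio: A }> {
  const transcript = (await voice.listen(audioInput)).trim();
  // Skip the agent call when nothing was transcribed.
  const reply = transcript === '' ? FALLBACK_REPLY : await agent.chat(transcript);
  const audio = await voice.speak(reply, { voice: voiceId });
  return { transcript, reply, audio };
}

// Stand-ins for offline testing: "audio" is just the text itself.
const fakeVoice: SpeechIO<string> = {
  listen: async (audio) => audio,
  speak: async (text, options) => `[${options?.voice ?? 'default'}] ${text}`,
};
const echoAgent: Chatter = {
  chat: async (message) => `You said: ${message}`,
};
```

Assuming the objects returned by `createOpenAIVoice()` and `new Agent(...)` expose the methods used in the example above, they could be passed to `runVoiceTurn` in place of the stand-ins.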
Multi-Agent Voice System
Different Agents with different voices:
import { Agent, Agents, createOpenAIVoice, createElevenLabsVoice } from 'praisonai';
const openaiVoice = createOpenAIVoice();
const elevenLabsVoice = createElevenLabsVoice({ apiKey: process.env.ELEVENLABS_API_KEY });
// Agent 1: Greeter with friendly voice
const greeterAgent = new Agent({
name: 'Greeter',
instructions: 'Warmly greet users and understand their needs.',
voice: { provider: openaiVoice, voiceId: 'nova' }
});
// Agent 2: Expert with professional voice
const expertAgent = new Agent({
name: 'Expert',
instructions: 'Provide detailed technical explanations.',
voice: { provider: openaiVoice, voiceId: 'onyx' }
});
// Agent 3: Custom ElevenLabs voice
const brandAgent = new Agent({
name: 'Brand Voice',
instructions: 'Represent the company brand.',
voice: { provider: elevenLabsVoice, voiceId: 'custom-voice-id' }
});
async function multiVoiceConversation(userAudio: Buffer) {
const userText = await openaiVoice.listen(userAudio);
// Greeter handles initial interaction
const greeting = await greeterAgent.chat(userText);
const greetingAudio = await openaiVoice.speak(greeting, { voice: 'nova' });
// Expert provides detailed answer
const explanation = await expertAgent.chat(`Explain in detail: ${userText}`);
const explanationAudio = await openaiVoice.speak(explanation, { voice: 'onyx' });
return { greetingAudio, explanationAudio };
}
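In the example above the voice IDs appear twice: once in each Agent's `voice` option and again in each `speak` call. One way to keep them in sync is a single lookup keyed by agent name. This is a sketch under assumptions: `VoiceAssignment`, `voiceAssignments`, `voiceFor`, and the `alloy` default are hypothetical and not part of praisonai.

```typescript
type ProviderName = 'openai' | 'elevenlabs';

interface VoiceAssignment {
  provider: ProviderName;
  voiceId: string;
}

// Hypothetical registry: one entry per agent name used in the example.
const voiceAssignments = new Map<string, VoiceAssignment>([
  ['Greeter', { provider: 'openai', voiceId: 'nova' }],
  ['Expert', { provider: 'openai', voiceId: 'onyx' }],
  ['Brand Voice', { provider: 'elevenlabs', voiceId: 'custom-voice-id' }],
]);

// Assumed fallback for agents without an explicit assignment.
const DEFAULT_VOICE: VoiceAssignment = { provider: 'openai', voiceId: 'alloy' };

// Look up the voice for an agent, falling back to the default.
function voiceFor(agentName: string): VoiceAssignment {
  return voiceAssignments.get(agentName) ?? DEFAULT_VOICE;
}
```

A call such as `openaiVoice.speak(explanation, { voice: voiceFor('Expert').voiceId })` would then pick up any change made in the registry.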
Agent with Voice Tools
Give your Agent tools to control voice output:
import { Agent, createTool, createOpenAIVoice } from 'praisonai';
const voice = createOpenAIVoice();
// Tool to speak with specific emotion/style
const speakTool = createTool({
name: 'speak_aloud',
description: 'Speak text aloud with a specific voice style',
parameters: {
type: 'object',
properties: {
text: { type: 'string', description: 'Text to speak' },
voice: { type: 'string', enum: ['alloy', 'echo', 'nova', 'onyx'], description: 'Voice style' },
speed: { type: 'number', description: 'Speed (0.5-2.0)' }
},
required: ['text']
},
execute: async ({ text, voice: voiceId = 'nova', speed = 1.0 }) => {
const audio = await voice.speak(text, { voice: voiceId, speed });
// In real app: play audio or send to client
return `Spoke: "${text}" with ${voiceId} voice`;
}
});
// Tool to listen for user input
const listenTool = createTool({
name: 'listen_for_input',
description: 'Listen for voice input from the user',
parameters: { type: 'object', properties: {} },
execute: async () => {
// In real app: capture audio from microphone.
// captureAudio() is not defined in this snippet; supply your own implementation.
const audioBuffer = await captureAudio();
const transcript = await voice.listen(audioBuffer);
return transcript;
}
});
const voiceAgent = new Agent({
name: 'Interactive Voice Agent',
instructions: `You can speak aloud and listen to users.
Use speak_aloud to respond verbally. Use listen_for_input to hear the user.`,
tools: [speakTool, listenTool]
});
await voiceAgent.chat('Greet the user and ask how you can help');
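Tool arguments come from the model, so they may fall outside the ranges the schema describes. The sketch below shows one way to validate them before calling `voice.speak`, using the voice list and the 0.5-2.0 speed range from the `speak_aloud` schema above. `normalizeSpeakArgs` and its fallback values are hypothetical, not something praisonai provides.

```typescript
// Voices and speed range taken from the speak_aloud schema above.
const ALLOWED_VOICES = ['alloy', 'echo', 'nova', 'onyx'] as const;
type AllowedVoice = (typeof ALLOWED_VOICES)[number];

interface SpeakArgs {
  text: string;
  voice?: string;
  speed?: number;
}

function isAllowedVoice(value: unknown): value is AllowedVoice {
  return typeof value === 'string' && (ALLOWED_VOICES as readonly string[]).includes(value);
}

// Returns safe arguments: unknown voices fall back to 'nova',
// and speed is clamped into [0.5, 2.0] (default 1.0).
function normalizeSpeakArgs(args: SpeakArgs): { text: string; voice: AllowedVoice; speed: number } {
  const text = args.text.trim();
  if (text === '') {
    throw new Error('speak_aloud: text must not be empty');
  }
  const voice: AllowedVoice = isAllowedVoice(args.voice) ? args.voice : 'nova';
  const rawSpeed =
    typeof args.speed === 'number' && Number.isFinite(args.speed) ? args.speed : 1.0;
  const speed = Math.min(2.0, Math.max(0.5, rawSpeed));
  return { text, voice, speed };
}
```

Inside the tool's `execute` function, the normalized values could be passed on as `voice.speak(text, { voice, speed })`.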
Phone/IVR Agent
Build an automated phone system:
import { Agent, createOpenAIVoice } from 'praisonai';
const voice = createOpenAIVoice();
const phoneAgent = new Agent({
name: 'Phone Support',
instructions: `You are a phone support agent.
- Keep responses under 30 words for natural phone conversation
- Ask one question at a time
- Confirm important details by repeating them back`
});
class PhoneSession {
private history: { role: string; content: string }[] = [];
async handleCall(audioInput: Buffer): Promise<Buffer> {
// Transcribe caller's speech
const callerMessage = await voice.listen(audioInput);
this.history.push({ role: 'user', content: callerMessage });
// Build context
const context = this.history.map(m => `${m.role}: ${m.content}`).join('\n');
// Get Agent response
const response = await phoneAgent.chat(context);
this.history.push({ role: 'assistant', content: response });
// Convert to speech
return await voice.speak(response, {
voice: 'nova',
speed: 0.9 // Slightly slower for phone clarity
});
}
async startCall(): Promise<Buffer> {
const greeting = await phoneAgent.chat('Start the call with a greeting');
this.history.push({ role: 'assistant', content: greeting });
return await voice.speak(greeting, { voice: 'nova' });
}
}
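`handleCall` sends the full history on every turn, so the prompt grows with the length of the call. One option, sketched here as an assumption rather than a praisonai feature, is to cap the context at the most recent messages. `buildContext` and `maxMessages` are hypothetical names, and the default of 10 is arbitrary.

```typescript
interface Message {
  role: string;
  content: string;
}

// Format only the most recent messages, in the same "role: content"
// layout that handleCall uses.
function buildContext(history: Message[], maxMessages = 10): string {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? history.slice(-maxMessages) : [];
  return recent.map((m) => `${m.role}: ${m.content}`).join('\n');
}
```

Within `PhoneSession`, the line that builds `context` could call `buildContext(this.history)` instead. Dropping older turns trades recall of early details for a shorter prompt, so the right limit depends on how long your calls run.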
Podcast/Content Agent
Agent that creates audio content:
import { Agent, createOpenAIVoice } from 'praisonai';
import { writeFileSync } from 'fs';
const voice = createOpenAIVoice();
const scriptWriter = new Agent({
name: 'Script Writer',
instructions: 'Write engaging podcast scripts with natural conversational flow.'
});
const narrator = new Agent({
name: 'Narrator',
instructions: 'You narrate content. Add appropriate pauses with "..." for dramatic effect.'
});
async function createPodcastEpisode(topic: string) {
// Agent writes the script
const script = await scriptWriter.chat(`Write a 2-minute podcast intro about: ${topic}`);
// Agent refines for narration
const narrationScript = await narrator.chat(`Prepare this for voice narration: ${script}`);
// Generate audio
const audio = await voice.speak(narrationScript, {
voice: 'fable', // Expressive voice for narration
speed: 0.95
});
// Compute the file name once so the returned path matches the file written
const audioPath = `podcast-${Date.now()}.mp3`;
writeFileSync(audioPath, audio);
return { script: narrationScript, audioPath };
}
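Longer scripts may exceed what a single TTS request accepts. OpenAI's speech endpoint documents a 4096-character cap on input text (check the current API reference), and whether `voice.speak` splits long text for you is not stated here. The sketch below splits a script at sentence boundaries so each chunk stays under a limit. `splitForSpeech` is a hypothetical helper, not part of praisonai.

```typescript
// Split text into chunks of at most maxChars, breaking at sentence ends
// where possible. 4096 is an assumed default based on OpenAI's documented limit.
function splitForSpeech(text: string, maxChars = 4096): string[] {
  if (maxChars <= 0) {
    throw new Error('maxChars must be positive');
  }
  // A sentence is text up to and including a run of . ! or ?,
  // or any trailing text without terminal punctuation.
  const matches = text.match(/[^.!?]*[.!?]+|[^.!?]+$/g) ?? [];
  const sentences = matches.map((s) => s.trim()).filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    // Hard-split any single sentence that is longer than the limit.
    for (let i = 0; i < sentence.length; i += maxChars) {
      const piece = sentence.slice(i, i + maxChars);
      const candidate = current === '' ? piece : `${current} ${piece}`;
      if (candidate.length <= maxChars) {
        current = candidate;
      } else {
        chunks.push(current);
        current = piece;
      }
    }
  }
  if (current !== '') {
    chunks.push(current);
  }
  return chunks;
}
```

Each chunk can then be passed to `voice.speak` in turn. How you join the resulting audio depends on the format and the player.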
Supported Providers
| Provider | TTS | STT | Best For |
|---|---|---|---|
| OpenAI | ✅ | ✅ (Whisper) | General purpose Agents |
| ElevenLabs | ✅ | ❌ | Custom brand voices |
OpenAI Voices
| Voice | Style | Best For |
|---|---|---|
| nova | Female, friendly | Customer service Agents |
| alloy | Neutral | General purpose |
| onyx | Male, deep | Authority/expertise |
| shimmer | Female, clear | Instructions/tutorials |
| echo | Male, warm | Conversational |
| fable | Expressive | Storytelling/content |
Environment Variables
OPENAI_API_KEY=sk-...
ELEVENLABS_API_KEY=...
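A missing key typically only surfaces on the first API call. A small startup check can report it earlier. This sketch assumes the providers read the variables listed above; `missingEnvVars` is a hypothetical helper.

```typescript
// Return the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}

// Example usage in Node (ELEVENLABS_API_KEY is only needed if you use ElevenLabs):
//   const missing = missingEnvVars(process.env, ['OPENAI_API_KEY']);
//   if (missing.length > 0) throw new Error(`Missing: ${missing.join(', ')}`);
```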

