The `profile` command provides detailed cProfile-based profiling of query execution, showing per-function and per-file timing, call graphs, and latency metrics.
## Quick Start

```bash
# Profile a query with detailed timing breakdown
praisonai profile query "What is 2+2?"

# Profile with file grouping
praisonai profile query "Hello" --show-files --limit 20

# Profile startup time
praisonai profile startup

# Profile import times
praisonai profile imports
```
## Subcommands

### Query

Profile a query execution with detailed timing breakdown.

```bash
praisonai profile query "Your prompt here"
```
Options:

| Option | Description |
|---|---|
| `--model`, `-m` | Model to use |
| `--stream`/`--no-stream` | Use streaming mode |
| `--deep` | Enable deep call tracing (higher overhead) |
| `--limit`, `-n` | Top N functions to show (default: 30) |
| `--sort`, `-s` | Sort by `cumulative` or `tottime` |
| `--show-files` | Group timing by file/module |
| `--show-callers` | Show caller functions |
| `--show-callees` | Show callee functions |
| `--importtime` | Show module import times |
| `--first-token` | Track time to first token (streaming) |
| `--save` | Save artifacts to path (`.prof`, `.txt`) |
| `--format`, `-f` | Output format: `text` or `json` |
Example:

```bash
praisonai profile query "Write a poem about AI" --show-files --limit 15
```
Output:

```
======================================================================
PraisonAI Profile Report
======================================================================
## System Information
Timestamp: 2025-12-31T17:37:46.662247Z
Python Version: 3.12.11
Platform: macOS-15.7.4-arm64-arm-64bit
PraisonAI: 2.9.2
Model: default

## Timing Breakdown
CLI Parse:          0.00 ms
Imports:          867.21 ms
Agent Construct:    0.06 ms
Model Init:         0.00 ms
Total Run:       2302.64 ms

## Per-Function Timing (Top Functions)
----------------------------------------------------------------------
Function             Calls    Cumulative (ms)    Self (ms)
----------------------------------------------------------------------
start                    1            2302.57         0.03
chat                     1            2302.54         0.03
_chat_completion         1            2302.45         0.02
...
======================================================================
```
### Imports

Profile module import times to identify slow imports.

```bash
praisonai profile imports
```
Output:

```
======================================================================
Import Time Analysis
======================================================================
Module                         Self (μs)    Cumul (μs)
----------------------------------------------------------------------
praisonaiagents                    12345        123456
praisonaiagents.agent               5432         98765
...
----------------------------------------------------------------------
Total import time: 123.45 ms
```
### Startup

Profile CLI startup time.

```bash
praisonai profile startup
```
Output:

```
==================================================
Startup Time Analysis
==================================================
Cold Start: 60.81 ms
Warm Start: 61.40 ms
==================================================
```
## Advanced Usage

### Deep Call Tracing

Enable deep call tracing for detailed call graph analysis:

```bash
praisonai profile query "Test" --deep --show-callers --show-callees
```

Deep call tracing adds significant overhead. Use it only for detailed debugging.
### Save Artifacts

Save profiling artifacts for later analysis:

```bash
praisonai profile query "Test" --save ./profile_results
```

This creates:

- `profile_results.prof` - Binary cProfile data (can be loaded with `pstats`)
- `profile_results.txt` - Human-readable report
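Since the `.prof` file is standard cProfile output, it can be explored later with Python's built-in `pstats` module. A minimal sketch:

```python
import pstats

# Load the binary profile saved by --save ./profile_results
stats = pstats.Stats("./profile_results.prof")

# Print the 20 most expensive functions by cumulative time
stats.sort_stats("cumulative").print_stats(20)

# Show which callers account for the hot functions
stats.print_callers(10)
```

Visualizers that understand cProfile data (e.g. snakeviz) can open the same file.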
### JSON Output

Get machine-readable output for CI/CD integration:

```bash
praisonai profile query "Test" --format json > profile.json
```
### Streaming with First Token Tracking

Track time to first token in streaming mode:

```bash
praisonai profile query "Test" --stream --first-token
```
## Python API

You can also use the profiler programmatically:

```python
from praisonai.cli.features.profiler import (
    ProfilerConfig,
    QueryProfiler,
    format_profile_report,
)

# Configure profiler
config = ProfilerConfig(
    deep=False,
    limit=20,
    show_files=True,
)

# Run profiled query
profiler = QueryProfiler(config)
result = profiler.profile_query("What is 2+2?", model="gpt-4o-mini")

# Print report
print(format_profile_report(result, config))

# Access timing data
print(f"Total time: {result.timing.total_run_ms:.2f} ms")
print(f"Imports: {result.timing.imports_ms:.2f} ms")
```
## Use Cases

### Identify Slow Imports

```bash
# Find which modules are slowing down startup
praisonai profile imports

# Profile with file grouping to find hotspots
praisonai profile query "Complex task" --show-files --limit 30
```
### Debug Latency Issues

```bash
# Track time to first token for streaming
praisonai profile query "Test" --stream --first-token
```
### CI/CD Integration

```bash
# Export JSON for automated analysis
praisonai profile query "Test" --format json --save ./ci_profile
```
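A CI job can then fail the build when timings drift past a budget. A minimal sketch; the key path used below (`timing` / `total_run_ms`) is an assumption modeled on the Python API's `result.timing` fields, so adjust it to the actual JSON schema:

```python
import json
import sys

# Assumed schema: a top-level "timing" object mirroring result.timing
with open("profile.json") as f:
    data = json.load(f)

BUDGET_MS = 5000  # example latency budget for this query
total_ms = data["timing"]["total_run_ms"]

if total_ms > BUDGET_MS:
    print(f"FAIL: query took {total_ms:.0f} ms (budget {BUDGET_MS} ms)")
    sys.exit(1)
print(f"OK: {total_ms:.0f} ms within budget")
```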
## Safety Notes

- Secrets (API keys, tokens) are redacted from profiling output
- Deep tracing is opt-in due to its overhead
- Prompts are not logged unless explicitly saved
- Safe by default: minimal overhead in normal mode
## Profile Suite

Run comprehensive profiling across multiple scenarios:

```bash
# Full suite (4 scenarios, 3 iterations each)
praisonai profile suite

# Quick mode (2 scenarios, 1 iteration)
praisonai profile suite --quick

# Custom output directory
praisonai profile suite --output ./my_profile_results

# More iterations for statistical significance
praisonai profile suite --iterations 5
```

Output Files:

- `suite_results.json` - Machine-readable JSON with all timing data
- `suite_report.txt` - Human-readable summary report
Scenarios Tested:

- `simple_non_stream` - Simple prompt, non-streaming
- `simple_stream` - Simple prompt, streaming
- `medium_non_stream` - Medium prompt, non-streaming
- `medium_stream` - Medium prompt, streaming
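`suite_results.json` lends itself to dashboards or trend tracking. A hedged sketch; the layout assumed here (scenario name mapped to a list of per-iteration totals) is illustrative only, so adapt the keys to the file's actual structure:

```python
import json
from statistics import mean

# Assumed layout: {"scenarios": {"simple_stream": [ms, ms, ms], ...}}
with open("./my_profile_results/suite_results.json") as f:
    suite = json.load(f)

for name, runs_ms in suite["scenarios"].items():
    print(f"{name}: mean {mean(runs_ms):.1f} ms over {len(runs_ms)} runs")
```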
## Observed Timing Breakdown

Based on profiling runs, here’s where time is spent:

| Phase | Time (ms) | % of Total |
|---|---|---|
| CLI Startup | 60-400 | 1-8% |
| Import `praisonaiagents` | 1300-2800 | 25-55% |
| Agent Construction | 0.1-1 | 0.1% |
| Model API Call | 2000-5000 | 40-70% |
| Total | 2300-7000 | 100% |
## Import Time Hotspots

Top modules by import time:

| Module | Time (ms) | Notes |
|---|---|---|
| `praisonaiagents` | 2700-3500 | Root import |
| `openai` | 1300-1400 | OpenAI SDK |
| `openai.types` | 1100-1200 | Type definitions |
| `openai.types.batch` | 600-700 | Batch types |
| `openai._models` | 250-650 | Pydantic models |
## Function Time Hotspots

Top functions by cumulative time:

| Function | File | Time (ms) |
|---|---|---|
| `start` | agent.py | 2300-6900 |
| `chat` | agent.py | 2300-6900 |
| `_chat_completion` | agent.py | 2300-6900 |
| `create` | completions.py | 2000-4200 |
| `send` | _client.py | 2000-4200 |
## Root Causes

- Heavy OpenAI SDK imports (~1.3s)
  - Pydantic model validation at import time
  - Type definitions loaded eagerly
- Network latency (~2-5s)
  - API round-trip dominates total time
  - Cannot be optimized locally
- Streaming vs non-streaming
  - Streaming shows faster time-to-first-token
  - Total time similar or slightly better
## Optimization Opportunities

Tier 0 (Safe, Fast Wins):

- Lazy import the OpenAI SDK only when needed (see the sketch after this list)
- Cache provider resolution

Tier 1 (Medium Effort):

- Preload common providers in the background
- Connection pooling for repeated calls

Tier 2 (Architectural):

- Optional “lite” mode without full type checking
- Async initialization pipeline
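To illustrate the Tier 0 lazy-import idea: the heavy SDK import is deferred until a code path actually needs the client, so invocations that never call the API skip the cost entirely. A generic sketch of the pattern, not PraisonAI's actual implementation:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def _openai_module():
    # The ~1.3s OpenAI SDK import is paid on first use, then cached.
    import openai
    return openai

def make_client(**kwargs):
    # Commands that never reach this line never pay the import cost.
    return _openai_module().OpenAI(**kwargs)
```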
## Performance Snapshots

Create baseline snapshots and compare against them to detect regressions:

```bash
# Create a baseline snapshot
praisonai profile snapshot --baseline

# Later, compare against baseline
praisonai profile snapshot current --compare

# Save snapshot with custom name
praisonai profile snapshot v2.0

# Get JSON output
praisonai profile snapshot --format json
```
Output (comparison):

```
======================================================================
Performance Comparison Report
======================================================================
Baseline: baseline (2025-01-01T00:00:00Z)
Current:  current (2025-01-02T00:00:00Z)
----------------------------------------------------------------------
Metric                 Baseline    Current       Diff         %
----------------------------------------------------------------------
Startup Cold (ms)        100.00     105.00      +5.00     +5.0%
Import Time (ms)         500.00     520.00     +20.00     +4.0%
Query Time (ms)         2000.00    2100.00    +100.00     +5.0%
----------------------------------------------------------------------
✅ No significant regression
======================================================================
```
## Performance Optimizations

Configure opt-in performance optimizations:

```bash
# Show current optimization status
praisonai profile optimize --show

# Enable provider pre-warming
praisonai profile optimize --prewarm

# Show lite mode configuration
praisonai profile optimize --lite
```
### Environment Variables

Enable optimizations via environment variables:

| Variable | Description |
|---|---|
| `PRAISONAI_LITE_MODE=1` | Enable lite mode (skip heavy validation) |
| `PRAISONAI_SKIP_TYPE_VALIDATION=1` | Skip type validation |
| `PRAISONAI_MINIMAL_IMPORTS=1` | Use minimal imports |
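When driving PraisonAI from Python rather than a shell, the same flags can be set in-process. A sketch, assuming the flags are read at import time and therefore must be set before the first `praisonai` import:

```python
import os

# Set optimization flags before any praisonai import so that
# import-time code paths can observe them.
os.environ["PRAISONAI_LITE_MODE"] = "1"
os.environ["PRAISONAI_MINIMAL_IMPORTS"] = "1"

import praisonai  # noqa: E402  (imported intentionally after env setup)
```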
### Optimization Tiers

Tier 0 (Always Safe):

- Provider/model resolution caching
- Lazy imports for heavy modules
- CLI startup path optimization

Tier 1 (Opt-in):

- Connection pooling for repeated API calls
- Provider pre-warming (background initialization)

Tier 2 (Opt-in, Architectural):

- Lite mode (skip expensive validation)
- Performance snapshot baselines