The profile command provides detailed cProfile-based profiling of query execution, showing per-function and per-file timing, call graphs, and latency metrics.

Quick Start

# Profile a query with detailed timing breakdown
praisonai profile query "What is 2+2?"

# Profile with file grouping
praisonai profile query "Hello" --show-files --limit 20

# Profile startup time
praisonai profile startup

# Profile import times
praisonai profile imports

Subcommands

Query

Profile a query execution with detailed timing breakdown.
praisonai profile query "Your prompt here"
Options:
Option                 Description
----------------------------------------------------------------------
--model, -m            Model to use
--stream/--no-stream   Use streaming mode
--deep                 Enable deep call tracing (higher overhead)
--limit, -n            Top N functions to show (default: 30)
--sort, -s             Sort by: cumulative or tottime
--show-files           Group timing by file/module
--show-callers         Show caller functions
--show-callees         Show callee functions
--importtime           Show module import times
--first-token          Track time to first token (streaming)
--save                 Save artifacts to path (.prof, .txt)
--format, -f           Output format: text or json
Example:
praisonai profile query "Write a poem about AI" --show-files --limit 15
Output:
======================================================================
PraisonAI Profile Report
======================================================================

## System Information
  Timestamp:        2025-12-31T17:37:46.662247Z
  Python Version:   3.12.11
  Platform:         macOS-15.7.4-arm64-arm-64bit
  PraisonAI:        2.9.2
  Model:            default

## Timing Breakdown
  CLI Parse:              0.00 ms
  Imports:              867.21 ms
  Agent Construct:        0.06 ms
  Model Init:             0.00 ms
  Total Run:           2302.64 ms

## Per-Function Timing (Top Functions)
----------------------------------------------------------------------
Function                               Calls Cumulative (ms)    Self (ms)
----------------------------------------------------------------------
start                                      1      2302.57         0.03
chat                                       1      2302.54         0.03
_chat_completion                           1      2302.45         0.02
...
======================================================================

Imports

Profile module import times to identify slow imports.
praisonai profile imports
Output:
======================================================================
Import Time Analysis
======================================================================
Module                                        Self (μs)  Cumul (μs)
----------------------------------------------------------------------
praisonaiagents                                   12345      123456
praisonaiagents.agent                              5432       98765
...
----------------------------------------------------------------------
Total import time: 123.45 ms
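
For a raw, tool-independent view of the same data, Python's built-in -X importtime flag (a standard CPython feature, not part of PraisonAI) produces a similar per-module breakdown on stderr:
python -X importtime -c "import praisonaiagents" 2> import_times.log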

Startup

Profile CLI startup time.
praisonai profile startup
Output:
==================================================
Startup Time Analysis
==================================================
Cold Start:       60.81 ms
Warm Start:       61.40 ms
==================================================

Advanced Usage

Deep Call Tracing

Enable deep call tracing for detailed call graph analysis:
praisonai profile query "Test" --deep --show-callers --show-callees
Deep call tracing adds significant overhead. Use only for detailed debugging.

Save Artifacts

Save profiling artifacts for later analysis:
praisonai profile query "Test" --save ./profile_results
This creates:
  • profile_results.prof - Binary cProfile data (can be loaded with pstats)
  • profile_results.txt - Human-readable report
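Because the .prof file is standard cProfile output, it can be inspected later with the stdlib pstats module. A minimal sketch:
# Load the saved binary profile and print the top entries.
import pstats

stats = pstats.Stats("profile_results.prof")
stats.sort_stats("cumulative").print_stats(20)  # top 20 by cumulative time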

JSON Output

Get machine-readable output for CI/CD integration:
praisonai profile query "Test" --format json > profile.json

Streaming with First Token Tracking

Track time to first token in streaming mode:
praisonai profile query "Test" --stream --first-token

Python API

You can also use the profiler programmatically:
from praisonai.cli.features.profiler import (
    ProfilerConfig,
    QueryProfiler,
    format_profile_report,
)

# Configure profiler
config = ProfilerConfig(
    deep=False,
    limit=20,
    show_files=True,
)

# Run profiled query
profiler = QueryProfiler(config)
result = profiler.profile_query("What is 2+2?", model="gpt-4o-mini")

# Print report
print(format_profile_report(result, config))

# Access timing data
print(f"Total time: {result.timing.total_run_ms:.2f} ms")
print(f"Imports: {result.timing.imports_ms:.2f} ms")

Use Cases

Identify Slow Imports

# Find which modules are slowing down startup
praisonai profile imports

Optimize Agent Performance

# Profile with file grouping to find hotspots
praisonai profile query "Complex task" --show-files --limit 30

Debug Latency Issues

# Track time to first token for streaming
praisonai profile query "Test" --stream --first-token

CI/CD Integration

# Export JSON for automated analysis
praisonai profile query "Test" --format json --save ./ci_profile

Safety Notes

  • Secrets are redacted from profiling output (API keys, tokens)
  • Deep tracing is opt-in due to overhead
  • No prompt logging unless explicitly saved
  • Safe by default - minimal overhead in normal mode

Profile Suite

Run comprehensive profiling across multiple scenarios:
# Full suite (4 scenarios, 3 iterations each)
praisonai profile suite

# Quick mode (2 scenarios, 1 iteration)
praisonai profile suite --quick

# Custom output directory
praisonai profile suite --output ./my_profile_results

# More iterations for statistical significance
praisonai profile suite --iterations 5
Output Files:
  • suite_results.json - Machine-readable JSON with all timing data (see the post-processing sketch below)
  • suite_report.txt - Human-readable summary report
Scenarios Tested:
  • simple_non_stream - Simple prompt, non-streaming
  • simple_stream - Simple prompt, streaming
  • medium_non_stream - Medium prompt, non-streaming
  • medium_stream - Medium prompt, streaming
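
To aggregate the suite output across iterations, something like the following works; the JSON layout shown here (a "scenarios" mapping of names to per-iteration records with a total_run_ms key) is an assumption, so adapt it to the actual file:
# Hypothetical post-processing of suite_results.json across iterations.
# NOTE: the "scenarios" / "total_run_ms" layout is assumed, not documented.
import json
import statistics

with open("suite_results.json") as f:
    suite = json.load(f)

for name, runs in suite["scenarios"].items():  # assumed layout
    times = [run["total_run_ms"] for run in runs]  # assumed key
    spread = statistics.stdev(times) if len(times) > 1 else 0.0
    print(f"{name}: mean={statistics.mean(times):.1f} ms, stdev={spread:.1f} ms")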

Performance Analysis Report

Observed Timing Breakdown

Based on profiling runs, here’s where time is spent:
Phase                    Time (ms)    % of Total
----------------------------------------------------------------------
CLI Startup              60-400       1-8%
Import praisonaiagents   1300-2800    25-55%
Agent Construction       0.1-1        0.1%
Model API Call           2000-5000    40-70%
Total                    2300-7000    100%

Import Time Hotspots

Top modules by import time:
Module               Time (ms)    Notes
----------------------------------------------------------------------
praisonaiagents      2700-3500    Root import
openai               1300-1400    OpenAI SDK
openai.types         1100-1200    Type definitions
openai.types.batch   600-700      Batch types
openai._models       250-650      Pydantic models

Function Time Hotspots

Top functions by cumulative time:
Function           File             Time (ms)
----------------------------------------------------------------------
start              agent.py         2300-6900
chat               agent.py         2300-6900
_chat_completion   agent.py         2300-6900
create             completions.py   2000-4200
send               _client.py       2000-4200

Root Causes

  1. Heavy OpenAI SDK imports (~1.3s)
    • Pydantic model validation at import time
    • Type definitions loaded eagerly
  2. Network latency (~2-5s)
    • API round-trip dominates total time
    • Cannot be optimized locally
  3. Streaming vs Non-streaming
    • Streaming delivers a faster time to first token
    • Total time is similar or slightly lower

Optimization Opportunities

Tier 0 (Safe, Fast Wins):
  • Lazy import the OpenAI SDK only when needed (see the sketch below)
  • Cache provider resolution
Tier 1 (Medium Effort):
  • Preload common providers in background
  • Connection pooling for repeated calls
Tier 2 (Architectural):
  • Optional “lite” mode without full type checking
  • Async initialization pipeline
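
As an illustration of the Tier 0 lazy-import idea, here is the generic pattern (not PraisonAI's actual implementation):
# Generic lazy-import pattern: defer a heavy SDK import until first use.
# Illustrative only; not PraisonAI's actual implementation.
import importlib

_openai = None

def get_openai():
    """Import the openai package on first call and cache the module."""
    global _openai
    if _openai is None:
        _openai = importlib.import_module("openai")  # pays the import cost once
    return _openai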

Performance Snapshots

Create baseline snapshots and compare against them to detect regressions:
# Create a baseline snapshot
praisonai profile snapshot --baseline

# Later, compare against baseline
praisonai profile snapshot current --compare

# Save snapshot with custom name
praisonai profile snapshot v2.0

# Get JSON output
praisonai profile snapshot --format json
Output (comparison):
======================================================================
Performance Comparison Report
======================================================================

Baseline: baseline (2025-01-01T00:00:00Z)
Current:  current (2025-01-02T00:00:00Z)

----------------------------------------------------------------------
Metric                    Baseline      Current         Diff        %
----------------------------------------------------------------------
Startup Cold (ms)           100.00       105.00        +5.00     +5.0%
Import Time (ms)            500.00       520.00       +20.00     +4.0%
Query Time (ms)            2000.00      2100.00      +100.00     +5.0%
----------------------------------------------------------------------

✅ No significant regression
======================================================================
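The comparison arithmetic is simple: diff = current - baseline and percent = 100 * diff / baseline. A sketch of the same check (the 10% regression threshold here is an assumed example, not the CLI's documented default):
# Illustrative regression check mirroring the report's diff/percent math.
def compare(baseline_ms: float, current_ms: float, threshold_pct: float = 10.0):
    diff = current_ms - baseline_ms
    pct = 100.0 * diff / baseline_ms
    return diff, pct, pct > threshold_pct  # threshold is an assumed example

diff, pct, regressed = compare(2000.00, 2100.00)
print(f"{diff:+.2f} ms ({pct:+.1f}%)", "regression" if regressed else "ok")
# -> +100.00 ms (+5.0%) ok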

Performance Optimizations

Configure opt-in performance optimizations:
# Show current optimization status
praisonai profile optimize --show

# Enable provider pre-warming
praisonai profile optimize --prewarm

# Show lite mode configuration
praisonai profile optimize --lite

Environment Variables

Enable optimizations via environment variables:
Variable                           Description
----------------------------------------------------------------------
PRAISONAI_LITE_MODE=1              Enable lite mode (skip heavy validation)
PRAISONAI_SKIP_TYPE_VALIDATION=1   Skip type validation
PRAISONAI_MINIMAL_IMPORTS=1        Use minimal imports
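For example, to run a single query in lite mode:
PRAISONAI_LITE_MODE=1 praisonai profile query "Test"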

Optimization Tiers

Tier 0 (Always Safe):
  • Provider/model resolution caching
  • Lazy imports for heavy modules
  • CLI startup path optimization
Tier 1 (Opt-in):
  • Connection pooling for repeated API calls
  • Provider pre-warming (background initialization)
Tier 2 (Opt-in, Architectural):
  • Lite mode (skip expensive validation)
  • Performance snapshot baselines