MCP Sampling

Sampling lets an MCP server request LLM completions from the client it is connected to. The server can use the client’s model for text generation, including tool calling, instead of calling an LLM provider itself.

Protocol Version

This feature implements MCP Protocol Version 2025-11-25.

Tool Choice Modes

Mode	Description
`auto`	Model decides whether to use tools
`none`	Model should not use tools
`any`	Model must use at least one tool
`tool`	Model must use a specific tool
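
To make the four modes concrete, here is a minimal sketch of how a client could narrow the tool list it offers the model for each mode. The helper `tools_for_choice` and its return shape are hypothetical and are not part of PraisonAI or the MCP specification. The `{"mode": ..., "name": ...}` dictionaries follow the shape produced by `ToolChoice.to_dict()` on this page.

```python
from typing import Any


def tools_for_choice(
    tools: list[dict[str, Any]], tool_choice: dict[str, str]
) -> tuple[list[dict[str, Any]], bool]:
    """Return (tools the model may call, whether a tool call is required).

    Hypothetical helper, for illustration only.
    """
    mode = tool_choice.get("mode", "auto")
    if mode == "auto":
        return tools, False      # model may or may not call a tool
    if mode == "none":
        return [], False         # hide every tool from the model
    if mode == "any":
        if not tools:
            raise ValueError("mode 'any' needs at least one tool")
        return tools, True       # some tool call is mandatory
    if mode == "tool":
        name = tool_choice.get("name")
        selected = [t for t in tools if t["name"] == name]
        if not selected:
            raise ValueError(f"unknown tool: {name!r}")
        return selected, True    # only the named tool, and it must be called
    raise ValueError(f"unsupported tool choice mode: {mode!r}")


tools = [{"name": "web_search"}, {"name": "calculator"}]
allowed, required = tools_for_choice(tools, {"mode": "tool", "name": "web_search"})
print([t["name"] for t in allowed], required)  # ['web_search'] True
```

Raising on an unknown tool name or an empty tool list is a design choice of this sketch. A real client might instead return a protocol error to the server.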

Python API

Basic Sampling

import asyncio
from praisonai.mcp_server.sampling import (
    SamplingHandler,
    SamplingRequest,
    SamplingMessage,
    create_sampling_request,
)

async def main():
    handler = SamplingHandler(default_model="gpt-4o-mini")
    
    # Create simple request
    request = create_sampling_request(
        prompt="What is the capital of France?",
        system_prompt="You are a helpful geography assistant.",
        max_tokens=100,
    )
    
    response = await handler.create_message(request)
    print(f"Response: {response.content}")
    print(f"Model: {response.model}")

asyncio.run(main())

Sampling with Tools

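import asyncio
from praisonai.mcp_server.sampling import (
    SamplingHandler,
    ToolChoice,
    ToolDefinition,
    create_sampling_request,
)

async def main():
    handler = SamplingHandler(default_model="gpt-4o-mini")

    # Offer the model one tool, described with a JSON Schema for its input
    request = create_sampling_request(
        prompt="Search for the latest AI news",
        tools=[{
            "name": "web_search",
            "description": "Search the web",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]
            }
        }],
        tool_choice="auto",  # or "none", "any", or a specific tool name
    )

    response = await handler.create_message(request)
    # tool_calls is non-empty when the model asks to use a tool
    if response.tool_calls:
        print(f"Tool calls: {response.tool_calls}")

asyncio.run(main())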

Tool Choice Factory Methods

from praisonai.mcp_server.sampling import ToolChoice

# Model decides
tc = ToolChoice.auto()
print(tc.to_dict())  # {"mode": "auto"}

# No tools
tc = ToolChoice.none()
print(tc.to_dict())  # {"mode": "none"}

# Must use any tool
tc = ToolChoice.any()
print(tc.to_dict())  # {"mode": "any"}

# Must use specific tool
tc = ToolChoice.tool("web_search")
print(tc.to_dict())  # {"mode": "tool", "name": "web_search"}

Model Preferences

from praisonai.mcp_server.sampling import (
    ModelPreferences,
    SamplingMessage,
    SamplingRequest,
)

prefs = ModelPreferences(
    hints=[{"name": "claude-3-sonnet"}, {"name": "gpt-4"}],  # tried in order
    cost_priority=0.3,          # 0-1, higher = low cost matters more
    speed_priority=0.5,         # 0-1, higher = low latency matters more
    intelligence_priority=0.8,  # 0-1, higher = model capability matters more
)

request = SamplingRequest(
    messages=[SamplingMessage(role="user", content="Hello!")],
    model_preferences=prefs,
    max_tokens=500,
)
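
How a client turns these preferences into a concrete model is left to the client. In the MCP specification, hints are advisory and a higher priority value means the trait matters more. The sketch below shows one possible policy: try the hints in order as substring matches, then fall back to a weighted sum. The `Candidate` class, its normalized trait scores, and `pick_model` are hypothetical and do not describe how PraisonAI or any particular client selects a model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical catalog entry; each trait is normalized to 0-1."""
    name: str
    cheapness: float      # 1.0 = cheapest available model
    speed: float          # 1.0 = fastest
    intelligence: float   # 1.0 = most capable


def pick_model(candidates, hints, cost_priority, speed_priority, intelligence_priority):
    # Hints are tried first, in order, as substring matches on the model name.
    for hint in hints:
        for candidate in candidates:
            if hint["name"] in candidate.name:
                return candidate.name

    # Otherwise fall back to a weighted sum of the three priorities.
    def score(c):
        return (cost_priority * c.cheapness
                + speed_priority * c.speed
                + intelligence_priority * c.intelligence)

    return max(candidates, key=score).name


catalog = [
    Candidate("small-fast", cheapness=0.9, speed=0.9, intelligence=0.4),
    Candidate("large-smart", cheapness=0.3, speed=0.5, intelligence=1.0),
]
chosen = pick_model(catalog, hints=[], cost_priority=0.3,
                    speed_priority=0.5, intelligence_priority=0.8)
print(chosen)  # large-smart
```

With the priorities from the example above (0.3, 0.5, 0.8), the heavier weight on intelligence outweighs the smaller model's cost and speed advantage in this made-up catalog.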

Custom Callback

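from praisonai.mcp_server.sampling import SamplingHandler, SamplingResponse

async def my_llm_callback(request):
    """Custom LLM integration: receives the sampling request, returns a response."""
    # Call your LLM here and map its output onto a SamplingResponse
    return SamplingResponse(
        role="assistant",
        content="Custom response",
        model="my-model",
        stop_reason="end_turn",
    )

# Route every sampling request through the callback
handler = SamplingHandler()
handler.set_callback(my_llm_callback)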

MCP Protocol Messages

Sampling Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {"type": "text", "text": "Hello!"}
      }
    ],
    "maxTokens": 100,
    "systemPrompt": "You are helpful.",
    "modelPreferences": {
      "hints": [{"name": "claude-3-sonnet"}],
      "costPriority": 0.3,
      "speedPriority": 0.5,
      "intelligencePriority": 0.8
    },
    "tools": [
      {
        "name": "search",
        "description": "Search the web",
        "inputSchema": {"type": "object"}
      }
    ],
    "toolChoice": {"mode": "auto"}
  }
}
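
The request above can also be assembled programmatically. The following sketch builds the same JSON-RPC envelope from plain Python values using only the standard library. The function `build_sampling_request` and its module-level id counter are hypothetical helpers for illustration; the field names are taken from the message shown above.

```python
import json
from itertools import count
from typing import Any, Optional

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def build_sampling_request(
    prompt: str,
    max_tokens: int,
    system_prompt: Optional[str] = None,
    tools: Optional[list[dict[str, Any]]] = None,
    tool_choice: str = "auto",
) -> dict[str, Any]:
    """Build a sampling/createMessage request as a plain dict (illustrative)."""
    params: dict[str, Any] = {
        "messages": [
            {"role": "user", "content": {"type": "text", "text": prompt}}
        ],
        "maxTokens": max_tokens,
    }
    if system_prompt is not None:
        params["systemPrompt"] = system_prompt
    if tools:
        # Only send toolChoice when there are tools to choose from.
        params["tools"] = tools
        params["toolChoice"] = {"mode": tool_choice}
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "sampling/createMessage",
        "params": params,
    }


message = build_sampling_request("Hello!", 100, system_prompt="You are helpful.")
print(json.dumps(message, indent=2))
```

Optional fields are omitted rather than sent as `null`, which keeps the payload close to the hand-written example.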

Sampling Response

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {"type": "text", "text": "Hello! How can I help?"},
    "model": "claude-3-sonnet",
    "stopReason": "end_turn"
  }
}

Response with Tool Use

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {"type": "text", "text": ""},
    "model": "claude-3-sonnet",
    "stopReason": "toolUse",
    "toolCalls": [
      {
        "id": "call_123",
        "name": "search",
        "arguments": {"query": "AI news"}
      }
    ]
  }
}
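
Once a response like this arrives, the receiver has to run the requested tools. Below is a minimal sketch of that dispatch step, assuming the `toolCalls` layout shown above. The `run_tool_calls` function, the registry of plain Python callables, and the `{"id", "output"}` / `{"id", "error"}` result shape are all hypothetical choices for illustration, not a PraisonAI or MCP API.

```python
from typing import Any, Callable


def run_tool_calls(
    result: dict[str, Any], registry: dict[str, Callable[..., Any]]
) -> list[dict[str, Any]]:
    """Execute each requested tool and collect outputs keyed by call id."""
    outputs = []
    for call in result.get("toolCalls", []):
        func = registry.get(call["name"])
        if func is None:
            # Report unknown tools instead of raising, so other calls still run.
            outputs.append({"id": call["id"], "error": f"unknown tool: {call['name']}"})
            continue
        outputs.append({"id": call["id"], "output": func(**call["arguments"])})
    return outputs


def search(query: str) -> str:
    # Stand-in implementation; a real tool would query a search backend.
    return f"results for {query}"


result = {
    "role": "assistant",
    "content": {"type": "text", "text": ""},
    "model": "claude-3-sonnet",
    "stopReason": "toolUse",
    "toolCalls": [
        {"id": "call_123", "name": "search", "arguments": {"query": "AI news"}}
    ],
}
outputs = run_tool_calls(result, {"search": search})
print(outputs)  # [{'id': 'call_123', 'output': 'results for AI news'}]
```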

Stop Reasons

Reason	Description
`end_turn`	Model finished naturally
`max_tokens`	Hit token limit
`toolUse`	Model wants to use a tool
`error`	An error occurred
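
A caller typically branches on the stop reason before using the response. The sketch below classifies a result dictionary using the four values from the table. The function `classify_result` and its outcome labels (`final`, `truncated`, `tool_calls`, `error`, `unknown`) are hypothetical names chosen for this example.

```python
from typing import Any


def classify_result(result: dict[str, Any]) -> tuple[str, Any]:
    """Map a sampling result to (outcome label, payload to act on)."""
    reason = result.get("stopReason")
    text = result.get("content", {}).get("text", "")
    if reason == "end_turn":
        return "final", text                      # complete answer
    if reason == "max_tokens":
        return "truncated", text                  # answer was cut off
    if reason == "toolUse":
        return "tool_calls", result.get("toolCalls", [])
    if reason == "error":
        return "error", text
    return "unknown", reason                      # value not in the table


outcome, payload = classify_result({
    "role": "assistant",
    "content": {"type": "text", "text": "Hello! How can I help?"},
    "model": "claude-3-sonnet",
    "stopReason": "end_turn",
})
print(outcome, payload)  # final Hello! How can I help?
```

Returning an explicit `unknown` outcome, rather than raising, lets the caller decide how strictly to treat stop reasons it does not recognize.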

Best Practices

Set an appropriate max_tokens - Cap response length to avoid unnecessary token usage
Use model preferences - Guide model selection with hints and priorities
Handle tool calls - Check each response for tool calls and respond to them
Provide system prompts - Set context for better responses

Related

MCP Tasks API - Long-running operations
MCP Elicitation - User input
PraisonAI MCP Server - Full documentation
