# Sampling

> Enable tools to request LLM completions

Sampling lets MCP tools request LLM completions while they execute. A server can use it to generate content, make decisions, or process data with an AI model in the middle of a tool call.

<Info>
  **What is Sampling?** Sampling is when an MCP server calls back to the client
  requesting an LLM completion. This enables tools to leverage AI capabilities
  without directly integrating with LLM providers.
</Info>

## Understanding Sampling

Sampling creates a callback mechanism where:

1. Client calls a tool on the server
2. Server requests an LLM completion from the client
3. Client executes the LLM call and returns results
4. Server continues execution with the LLM response

```mermaid theme={null}
sequenceDiagram
    participant Client
    participant Server
    participant LLM

    Client->>Server: 1. Call tool
    activate Server
    Server->>Client: 2. Request LLM completion (sampling)
    deactivate Server
    activate Client
    Client->>LLM: 3. Execute LLM call
    activate LLM
    LLM-->>Client: LLM response
    deactivate LLM
    Client-->>Server: 4. Return results
    deactivate Client
    activate Server
    Server->>Server: Continue execution with response
    Server-->>Client: Tool result
    deactivate Server
```
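
The round trip above can be sketched with plain in-memory functions. This is only an illustration of the control flow: `callTool`, `summarizeTool`, `clientSamplingHandler`, and the `trace` array are hypothetical names, not part of mcp-use, and a canned string stands in for a real LLM.

```typescript
// Hypothetical in-memory stand-ins; none of these names come from mcp-use.
type SamplingRequest = { prompt: string };
type SamplingHandler = (req: SamplingRequest) => Promise<string>;

// Records the order in which the four steps run.
const trace: string[] = [];

// Step 3: the client owns the LLM call. A canned reply stands in for a real model.
const clientSamplingHandler: SamplingHandler = async (req) => {
  trace.push("3. client executes LLM call");
  return `summary of: ${req.prompt}`;
};

// Steps 2 and 4: the server-side tool asks the client for a completion,
// waits for it, then continues with the response.
async function summarizeTool(
  input: string,
  requestSampling: SamplingHandler
): Promise<string> {
  trace.push("2. server requests completion");
  const completion = await requestSampling({ prompt: input });
  trace.push("4. server continues with response");
  return completion.toUpperCase();
}

// Step 1: the client calls the tool and makes its sampling handler available.
async function callTool(input: string): Promise<string> {
  trace.push("1. client calls tool");
  return summarizeTool(input, clientSamplingHandler);
}
```

Awaiting `callTool("hello")` and then inspecting `trace` shows the steps in the same order as the diagram. The server never talks to the LLM directly; it only sees what the client's handler returns.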

This pattern is useful for:

* Content generation within tools
* Dynamic decision making
* Data transformation and analysis
* Interactive workflows

## Configuration

To enable sampling, provide an `onSampling` function when initializing the `MCPClient`. The `OnSamplingCallback`, `CreateMessageRequestParams`, and `CreateMessageResult` types are exported from `mcp-use`, so you don't need to import them from the MCP SDK.

<Info>
  **Deprecated:** The `samplingCallback` name is still supported for backward
  compatibility but will be removed in a future version. Use `onSampling`
  instead.
</Info>

Minimal example:

```typescript theme={null}
import { MCPClient, type OnSamplingCallback } from "mcp-use";

// `yourLLM` and `config` are placeholders for your own LLM client and server config.
const onSampling: OnSamplingCallback = async (params) => {
  // Read the text of the most recent message; fall back to "" if it has none.
  const lastMessage = params.messages?.[params.messages.length - 1];
  const text = typeof lastMessage?.content === "object" && lastMessage?.content && "text" in lastMessage.content
    ? (lastMessage.content as { text?: string }).text ?? ""
    : "";
  return {
    role: "assistant",
    content: { type: "text", text: (await yourLLM.complete(text)) ?? "" },
    model: "your-model",
    stopReason: "endTurn",
  };
};

const client = new MCPClient(config, { onSampling });
```

For a full runnable example, see [sampling-client.ts](https://github.com/mcp-use/mcp-use/blob/main/libraries/typescript/packages/mcp-use/examples/client/node/communication/sampling-client.ts).
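
If several callbacks need the last message's text, the extraction step can be pulled into a small helper. The sketch below is one possible shape, not an mcp-use export: `lastMessageText`, `ContentBlock`, and `SamplingMessage` are hypothetical local names that mirror the message shapes used on this page, and it assumes content may arrive either as a single block or as an array of blocks.

```typescript
// Local stand-in types mirroring the message shapes described on this page.
type ContentBlock = { type: string; text?: string };
type SamplingMessage = {
  role: "user" | "assistant";
  content: ContentBlock | ContentBlock[];
};

// Return the text of the last message, or "" when there is no text to read.
function lastMessageText(messages: SamplingMessage[] | undefined): string {
  if (!messages || messages.length === 0) return "";
  const last = messages[messages.length - 1];
  if (!last) return "";

  // Normalize to an array so single blocks and block lists share one code path.
  const blocks = Array.isArray(last.content) ? last.content : [last.content];

  // Keep only text blocks and join them with newlines; other block types are skipped.
  return blocks
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("\n");
}
```

Inside a callback this would be used as `const text = lastMessageText(params.messages);`. Non-text blocks such as images are ignored here, which may or may not be what your LLM integration needs.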

## Per-Server Sampling Callbacks

When connecting to multiple MCP servers, each server can use a different sampling callback. Define `onSampling` directly in a server's config to override the global default set in the second MCPClient argument.

```typescript theme={null}
import { MCPClient, type OnSamplingCallback } from "mcp-use";

const claudeSampling: OnSamplingCallback = async (params) => {
  const response = await claude.complete(params);
  return { role: "assistant", content: { type: "text", text: response }, model: "claude", stopReason: "endTurn" };
};

const gptSampling: OnSamplingCallback = async (params) => {
  const response = await gpt.complete(params);
  return { role: "assistant", content: { type: "text", text: response }, model: "gpt-4", stopReason: "endTurn" };
};

const fallbackSampling: OnSamplingCallback = async (params) => {
  const response = await defaultLlm.complete(params);
  return { role: "assistant", content: { type: "text", text: response }, model: "default", stopReason: "endTurn" };
};

const client = new MCPClient(
  {
    mcpServers: {
      codeServer: {
        url: "https://code.example.com/mcp",
        onSampling: claudeSampling,
      },
      creativeServer: {
        url: "https://creative.example.com/mcp",
        onSampling: gptSampling,
      },
      utilityServer: {
        url: "https://util.example.com/mcp",
      },
    },
  },
  { onSampling: fallbackSampling }
);
```

**Precedence order** (first match wins):

| Priority | Source                                     | Example                                               |
| -------- | ------------------------------------------ | ----------------------------------------------------- |
| 1        | Per-server `onSampling`                    | `mcpServers.myServer.onSampling`                      |
| 2        | Per-server `samplingCallback` (deprecated) | `mcpServers.myServer.samplingCallback`                |
| 3        | Global `onSampling`                        | Second arg to `new MCPClient(config, { onSampling })` |
| 4        | Global `samplingCallback` (deprecated)     | Second arg `samplingCallback`                         |


A callback has the same shape whether it is registered per server or globally. This fuller version forwards the optional request parameters to your LLM:

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { MCPClient } from 'mcp-use'
  import type { CreateMessageRequestParams, CreateMessageResult } from 'mcp-use'

  async function onSampling(
    params: CreateMessageRequestParams
  ): Promise<CreateMessageResult> {
    // Integrate with your LLM of choice (OpenAI, Anthropic, etc.)
    // Extract the last message content
    const lastMessage = params.messages[params.messages.length - 1]
    const content = Array.isArray(lastMessage.content)
      ? lastMessage.content[0]
      : lastMessage.content

    // Call your LLM with optional parameters
    const response = await yourLlm.complete({
      text: content.text,
      systemPrompt: params.systemPrompt,
      maxTokens: params.maxTokens,
      temperature: params.temperature,
      stopSequences: params.stopSequences,
    })

    return {
      role: 'assistant',
      content: { type: 'text', text: response },
      model: 'your-model-name',
      stopReason: 'endTurn'
    }
  }

  const client = new MCPClient(config, {
    onSampling
  })
  ```
</CodeGroup>

## Type Definitions

### CreateMessageRequestParams

The params object passed to your sampling callback includes:

```typescript theme={null}
interface CreateMessageRequestParams {
  messages: Array<{
    role: 'user' | 'assistant';
    content: {
      type: 'text' | 'image';
      text?: string;
      data?: string;
      mimeType?: string;
    };
  }>;

  // Optional model preferences
  modelPreferences?: {
    hints?: Array<{ name?: string }>;
    costPriority?: number;       // 0.0 to 1.0
    speedPriority?: number;       // 0.0 to 1.0
    intelligencePriority?: number; // 0.0 to 1.0
  };

  // Required by the MCP specification: upper bound on tokens to sample
  maxTokens: number;

  // Optional parameters
  systemPrompt?: string;
  temperature?: number;
  stopSequences?: string[];
  includeContext?: 'none' | 'thisServer' | 'allServers';
  metadata?: Record<string, unknown>;
}
```
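
Many LLM clients expect a flat list of role/content pairs rather than this structure. The sketch below shows one way to convert, assuming a chat-style target format; `toChatMessages` and `ChatMessage` are hypothetical names, the local types mirror only the fields used here, and replacing images with a text placeholder is a simplification rather than a recommendation.

```typescript
// Local stand-ins mirroring the fields of CreateMessageRequestParams used below.
type SamplingContent = {
  type: "text" | "image";
  text?: string;
  data?: string;
  mimeType?: string;
};

interface SamplingParams {
  messages: Array<{ role: "user" | "assistant"; content: SamplingContent }>;
  systemPrompt?: string;
}

// Hypothetical target format: one string of content per message.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function toChatMessages(params: SamplingParams): ChatMessage[] {
  const chat: ChatMessage[] = [];

  // The system prompt, when present, becomes the first message.
  if (params.systemPrompt) {
    chat.push({ role: "system", content: params.systemPrompt });
  }

  for (const message of params.messages) {
    const { content } = message;
    // Text passes through; images are reduced to a placeholder naming their MIME type.
    const text =
      content.type === "text"
        ? content.text ?? ""
        : `[image: ${content.mimeType ?? "unknown type"}]`;
    chat.push({ role: message.role, content: text });
  }

  return chat;
}
```

A multimodal integration would pass `data` and `mimeType` through to the provider instead of dropping the image.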

### CreateMessageResult

Your callback must return this structure:

```typescript theme={null}
interface CreateMessageResult {
  role: "assistant";
  content: {
    type: "text" | "image";
    text?: string; // For text content
    data?: string; // For image content (base64)
    mimeType?: string; // For image content
  };
  model: string;
  stopReason?: "endTurn" | "maxTokens" | "stopSequence";
}
```
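
The examples on this page always return `stopReason: "endTurn"`. If your LLM client reports why generation stopped, you may want to translate that into the values above. The following is a minimal sketch: `textResult` and `toStopReason` are hypothetical helpers, and the provider finish reasons (`"stop"`, `"length"`, `"stop_sequence"`) are assumed for illustration and will differ between providers.

```typescript
type StopReason = "endTurn" | "maxTokens" | "stopSequence";

// Local stand-in for the text variant of CreateMessageResult.
interface TextResult {
  role: "assistant";
  content: { type: "text"; text: string };
  model: string;
  stopReason?: StopReason;
}

// Map an assumed provider finish reason onto the documented stop reasons.
// Unknown or missing reasons map to undefined so the field can be omitted.
function toStopReason(finishReason: string | undefined): StopReason | undefined {
  switch (finishReason) {
    case "stop":
      return "endTurn";
    case "length":
      return "maxTokens";
    case "stop_sequence":
      return "stopSequence";
    default:
      return undefined;
  }
}

// Build a text result, including stopReason only when it is known.
function textResult(text: string, model: string, finishReason?: string): TextResult {
  const stopReason = toStopReason(finishReason);
  return {
    role: "assistant",
    content: { type: "text", text },
    model,
    ...(stopReason ? { stopReason } : {}),
  };
}
```

A callback could then end with `return textResult(response.text, modelName, response.finishReason);`, where the `response` fields depend on your LLM client.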

### Example with Model Preferences

```typescript theme={null}
async function onSampling(
  params: CreateMessageRequestParams
): Promise<CreateMessageResult> {
  // Handle model preferences if provided
  let modelName = "default-model";

  if (params.modelPreferences?.hints?.[0]?.name) {
    modelName = params.modelPreferences.hints[0].name;
  } else if ((params.modelPreferences?.intelligencePriority ?? 0) > 0.8) {
    modelName = "gpt-4";
  } else if ((params.modelPreferences?.speedPriority ?? 0) > 0.8) {
    modelName = "gpt-5.5-mini";
  }

  const lastMessage = params.messages[params.messages.length - 1];
  const content = Array.isArray(lastMessage.content)
    ? lastMessage.content[0]
    : lastMessage.content;

  const response = await yourLlm.complete({
    model: modelName,
    text: content.text,
    systemPrompt: params.systemPrompt,
    maxTokens: params.maxTokens,
    temperature: params.temperature,
  });

  return {
    role: "assistant",
    content: { type: "text", text: response },
    model: modelName,
    stopReason: "endTurn",
  };
}
```
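
The example above uses the first hint's name as the model name directly. The MCP specification describes hints as substrings of model names, evaluated in order, so a client may prefer to match them against the models it actually has. A minimal sketch of that matching, with a hypothetical `pickModel` helper and made-up model names:

```typescript
// Local stand-in mirroring the modelPreferences shape shown earlier.
interface ModelPreferences {
  hints?: Array<{ name?: string }>;
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

// Made-up model names, for illustration only.
const AVAILABLE_MODELS = ["big-model-2024", "small-fast-model"];

// Return the first available model whose name contains a hint, checking
// hints in order. Fall back to `fallback` when nothing matches.
function pickModel(
  prefs: ModelPreferences | undefined,
  available: string[],
  fallback: string
): string {
  for (const hint of prefs?.hints ?? []) {
    const name = hint.name;
    if (!name) continue; // skip hints without a name
    const match = available.find((model) => model.includes(name));
    if (match) return match;
  }
  return fallback;
}
```

For example, `pickModel({ hints: [{ name: "small" }] }, AVAILABLE_MODELS, "default-model")` resolves to `"small-fast-model"`. The priority fields could be consulted in the fallback branch, as the example above does.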

<Tip>
  To learn how to create sampling-enabled tools, see the [Server Tools
  Guide](../server/sampling).
</Tip>

For runnable client examples, see the [client examples](https://github.com/mcp-use/mcp-use/tree/main/libraries/typescript/packages/mcp-use/examples/client) (e.g. `node/communication/sampling-client.ts`, `node/full-features-example.ts`).
