Skip to content

Sampling

Sampling lets your tool handler request LLM inference from the client. The client decides which model to use — your server provides the prompt.

When to Use

Your MCP server doesn't have its own model access, but a tool needs LLM reasoning:

  • Sentiment analysis on user-provided text
  • Summarizing long content before returning it
  • Classifying or categorizing data
  • Generating descriptions from structured data

The key point: the client controls the model. Your model option is a hint, not a command.

Simple API

c.sample(prompt, options?) sends text, gets text back.

ts
server.tool(
  "sentiment",
  {
    description: "Analyze text sentiment",
    input: z.object({ text: z.string() }),
  },
  async (args, c) => {
    const sentiment = await c.sample(args.text, {
      system: "Respond with exactly one word: positive, negative, or neutral.",
      maxTokens: 10,
    });
    return c.text(`Sentiment: ${sentiment}`);
  },
);

Options

OptionTypeDefaultDescription
maxTokensnumber1024Maximum tokens in the response
modelstringModel hint (client decides)
systemstringSystem prompt
temperaturenumberSampling temperature
stopSequencesstring[]Stop sequences

Raw API

c.sample.raw(params) gives full control over the SDK's CreateMessageRequestParams. Use it for multi-turn messages, image content, or when you need the full response metadata.

ts
server.tool(
  "analyze",
  { description: "Analyze with context" },
  async (_args, c) => {
    const result = await c.sample.raw({
      messages: [
        { role: "user", content: { type: "text", text: "Summarize this data" } },
        { role: "assistant", content: { type: "text", text: "I need more context." } },
        { role: "user", content: { type: "text", text: "Here is the full dataset..." } },
      ],
      maxTokens: 256,
    });

    // result.content, result.model, result.stopReason
    return c.text(result.content.type === "text" ? result.content.text : "");
  },
);

Simple vs Raw

c.sample()c.sample.raw()
InputString promptFull SDK CreateMessageRequestParamsBase
Outputstring (text content)CreateMessageResult (full response)
Multi-turnNoYes
Image contentNoYes
Response metadataNoYes (model, stopReason)

Availability

Handlerc.sample
Tool handlerYes
Task handlerYes
Resource handlerNo

Resources are read-only data lookups — no interactive capabilities like sampling or elicitation.

Under the hood

c.sample() wraps the SDK's server.createMessage(). It constructs a single user message from your prompt string and applies your options (system prompt, temperature, etc.). If the response content type is not text, it returns an empty string. c.sample.raw() passes your params directly to the SDK with no transformation.

What's Next