LLM Fixtures

Configure match rules and responses for your LLM mock APIs. Define fixtures to control exactly how your mock responds to different chat completion requests.

How Fixtures Work

A fixture pairs a set of match rules with a response. When a chat completion request arrives, the mock evaluates its fixtures in order and returns the response of the first fixture that matches.

Evaluation Process

1. Ordered Fixture List

Each LLM API has an ordered list of fixtures, sorted by priority number (lower number = higher priority). Disabled fixtures are skipped during evaluation.

2. Evaluate Match Rules

When a request comes in, fixtures are evaluated in priority order. All specified match fields on a fixture must pass (AND logic).

3. First Match Wins

The first fixture where all match rules pass returns its configured response.

4. Fallback Response

If no fixture matches, a default 404-style response is returned indicating that no matching fixture was found.
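
The evaluation steps above can be sketched roughly as follows. This is an illustration only, not the mock's actual implementation: the Fixture class, the resolve function, and the shape of the NO_MATCH fallback body are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Fixture:
    # Hypothetical in-memory shape of a fixture, for illustration only.
    name: str
    priority: int                      # lower number = higher priority
    enabled: bool
    matches: Callable[[dict], bool]    # True when every defined rule passes
    response: dict


# Assumed fallback body; the real 404-style payload may look different.
NO_MATCH = {"status": 404, "error": {"message": "No matching fixture found"}}


def resolve(fixtures: List[Fixture], request: dict) -> dict:
    """Return the response of the first enabled fixture that matches."""
    for fixture in sorted(fixtures, key=lambda f: f.priority):
        if not fixture.enabled:
            continue                   # disabled fixtures are skipped
        if fixture.matches(request):
            return fixture.response    # first match wins
    return NO_MATCH                    # nothing matched
```

Because the loop returns on the first match, a catch-all fixture only takes effect when every higher-priority fixture has failed to match.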

Match Rules

Each fixture can specify one or more match rules. Only the fields you define are checked; omitted fields are ignored (wildcard behavior).

User Message
userMessage

Matches against the last user message in the conversation. Supports substring matching (case-insensitive) and regular expressions.

Substring Match (case-insensitive)

hello
Matches "Hello, how are you?", "Say hello world", etc.

Regex Pattern

/^tell me about .+$/
Matches "tell me about dogs", "tell me about quantum physics"
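
As a rough sketch of how a userMessage rule could be evaluated: the helper names below are hypothetical, and the code assumes that a pattern wrapped in slashes is treated as a regular expression (as in the example above) while anything else is a case-insensitive substring. Whether regex patterns are themselves case-insensitive is not stated here, so this sketch applies them as written.

```python
import re


def last_user_message(messages):
    """Return the content of the most recent message with role 'user'."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content") or ""
    return ""


def matches_text(pattern, text):
    """Assumed rule: /.../ means regex, otherwise case-insensitive substring."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return re.search(pattern[1:-1], text) is not None
    return pattern.lower() in text.lower()


messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the weather?"},
    {"role": "assistant", "content": "Sunny."},
    {"role": "user", "content": "Say hello world"},
]
matched = matches_text("hello", last_user_message(messages))
```

Only the last user message is inspected, so earlier turns in the conversation do not affect the outcome.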

Model
model

Matches against the model name in the request. Supports case-insensitive substring matching, glob patterns, and regex.

Substring Match (case-insensitive)

my-model
Matches "my-model", "my-model-v2", "my-model-fast"

Glob Pattern

my-model*
Matches "my-model", "my-model-v2", "my-model-fast", "my-model-mini"
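
The difference between substring and glob matching can be illustrated with Python's standard fnmatch module. This is a sketch under assumptions: matches_model is a hypothetical helper, a pattern is treated as a glob only when it contains a wildcard character, and glob matching is assumed to be case-insensitive like substring matching.

```python
from fnmatch import fnmatchcase


def matches_model(pattern, model):
    """Glob match when the pattern has wildcards, else substring match."""
    p, m = pattern.lower(), model.lower()   # assumed: both modes ignore case
    if any(ch in p for ch in "*?["):
        return fnmatchcase(m, p)            # glob must match the whole name
    return p in m                           # substring may match anywhere
```

Note the practical difference: under these assumptions the glob my-model* is anchored at the start of the model name, whereas the substring my-model would also match a name such as other-my-model-v2.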

System Prompt
systemPrompt

Matches against the system message in the request. Like userMessage, it supports substring matching (case-insensitive) and regex.

Substring Match

you are a helpful assistant
Matches any system prompt containing this phrase

Regex Pattern

/^You are a (customer support|sales) agent$/
Matches specific role-based system prompts

Tool Name
toolName

Matches if any tool definition in the request has this name. Case-insensitive, supports substring, glob, and regex.

Substring Match (case-insensitive)

get_weather
Matches when the request includes a tool named "get_weather"

Temperature
temperature

Exact float match against the temperature value in the request.

Exact Float Match

0.7
Matches only when temperature is exactly 0.7

Custom
custom

JSONPath expression evaluated against the raw request body. Use it to match any field not covered by the built-in rules.

JSONPath Expression

$.messages[0].role == "system"
Matches when the first message has a system role

Response Types

Each fixture defines a response that is returned when all match rules pass. Three response types are supported.

Text Response

Return a plain text response as the assistant message content. Configure the finish reason and optionally set token usage statistics.

Options:
content
finishReason: stop | length | tool_calls | content_filter
usage.promptTokens
usage.completionTokens
{
  "id": "chatcmpl-mock-abc123",
  "object": "chat.completion",
  "model": "your-model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
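
To show how the text response options map onto the response body above, here is a minimal sketch. The text_response function is hypothetical, and it assumes that total_tokens is simply the sum of the prompt and completion counts, which agrees with the sample (12 + 9 = 21).

```python
import uuid


def text_response(model, content, finish_reason="stop",
                  prompt_tokens=0, completion_tokens=0):
    """Build a chat completion body from the text response options."""
    return {
        "id": f"chatcmpl-mock-{uuid.uuid4().hex[:6]}",  # illustrative id format
        "object": "chat.completion",
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": finish_reason,
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            # Assumed: the total is derived rather than configured.
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }


body = text_response("your-model", "Hello! How can I help you today?",
                     prompt_tokens=12, completion_tokens=9)
```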

Tool Calls Response

Return a tool call response. Define one or more tool calls, each with a name and arguments (JSON string). Set finishReason to tool_calls.

Options:
toolCalls[].name
toolCalls[].arguments (JSON string)
{
  "id": "chatcmpl-mock-def456",
  "object": "chat.completion",
  "model": "your-model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\":\"San Francisco\",\"unit\":\"celsius\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 48,
    "completion_tokens": 24,
    "total_tokens": 72
  }
}
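
Because arguments is a JSON string rather than a nested object, it has to be serialized when the tool call is built and parsed again by the client. The sketch below illustrates that round trip with Python's json module; the tool_call helper is a hypothetical name used only for this example.

```python
import json


def tool_call(call_id, name, arguments):
    """Build one tool call entry; arguments is serialized to a JSON string."""
    return {
        "id": call_id,
        "type": "function",
        "function": {
            "name": name,
            # Compact separators keep the string free of extra whitespace.
            "arguments": json.dumps(arguments, separators=(",", ":")),
        },
    }


call = tool_call("call_abc123", "get_weather",
                 {"location": "San Francisco", "unit": "celsius"})

# A client under test would parse the string back into an object.
args = json.loads(call["function"]["arguments"])
```

Serializing with json.dumps also takes care of escaping the inner quotes, which is easy to get wrong when writing the string by hand.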

Error Response

Return an error response with a specific HTTP status code, error message, and error type. Useful for testing how your application handles LLM API errors.

Options:
error.status
error.message
error.type
// HTTP status: 429
{
  "error": {
    "message": "Rate limit exceeded. Please retry after 20s.",
    "type": "tokens"
  }
}
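
On the application side, an error fixture like this one can be used to exercise retry logic. The following is a sketch of what such client code might look like, not part of the mock itself: classify_error is a hypothetical helper, and treating 429 and 5xx statuses as retryable is an assumption about the application under test.

```python
def classify_error(status, body):
    """Summarize an error response so the caller can decide whether to retry."""
    error = body.get("error", {})
    # Assumed policy: rate limits and server errors are worth retrying.
    retryable = status == 429 or 500 <= status < 600
    return {
        "retryable": retryable,
        "type": error.get("type"),
        "message": error.get("message"),
    }


info = classify_error(429, {
    "error": {
        "message": "Rate limit exceeded. Please retry after 20s.",
        "type": "tokens",
    }
})
```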

Priority & Ordering

Control the evaluation order of your fixtures to ensure the correct response is returned for each request.

How Priority Works

  1. Lowest Number First

    Fixtures are sorted by priority number in ascending order. A fixture with priority 1 is evaluated before priority 10.

  2. Reordering

    Reorder fixtures via the API or fixture editor. Priority numbers are automatically reassigned.

  3. Specific Before General

    Place fixtures with specific match rules at high priority (low numbers), and catch-all fixtures at low priority (high numbers).
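
The automatic reassignment of priority numbers after a reorder might work along these lines. This is a sketch under assumptions: the reorder function is hypothetical, and it assumes priorities are renumbered contiguously from 1, which the text above does not spell out.

```python
def reorder(fixture_ids, moved_id, new_index):
    """Move one fixture to new_index and renumber priorities from 1."""
    ordered = [fid for fid in fixture_ids if fid != moved_id]
    ordered.insert(new_index, moved_id)
    # Assumed: priorities are reassigned as 1, 2, 3, ... in list order.
    return {fid: position + 1 for position, fid in enumerate(ordered)}


priorities = reorder(["greeting", "weather", "catch-all"], "catch-all", 0)
```

In this example the catch-all fixture ends up at priority 1, which would shadow the more specific fixtures below it, the situation the "Specific Before General" guideline is meant to avoid.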

Protocol

Each fixture specifies which transport it applies to. This controls whether the fixture is evaluated for HTTP/SSE requests, WebSocket connections, or both.

Protocol Options

http
Fixture is only evaluated for HTTP and SSE requests
ws
Fixture is only evaluated for WebSocket connections
both
Fixture is evaluated for all transport types (default)
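
The protocol options can be read as a simple filter applied before match rules are evaluated. The sketch below is an illustration only: applies_to is a hypothetical helper, and the transport labels "http", "sse", and "ws" are assumed names for the incoming connection type.

```python
def applies_to(fixture_protocol, transport):
    """Return True if a fixture should be evaluated for this transport."""
    if fixture_protocol == "both":
        return True
    if fixture_protocol == "http":
        return transport in ("http", "sse")   # SSE streams ride on HTTP
    if fixture_protocol == "ws":
        return transport == "ws"
    raise ValueError(f"unknown protocol: {fixture_protocol!r}")
```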

Example Configurations

Practical fixture configurations for common LLM mocking scenarios.

Greeting Handler
Priority: 1

Matches any user message containing "hello" and returns a friendly greeting.

Match Rules

userMessage: hello

Response (Text)

{
  "content": "Hello! I'm your AI assistant. How can I help you today?",
  "finishReason": "stop"
}

Weather Tool
Priority: 2

Matches requests that include a get_weather tool definition and returns a tool call response.

Match Rules

toolName: get_weather

Response (Tool Call)

{
  "toolCallName": "get_weather",
  "toolCallArguments": "{\"location\":\"San Francisco\",\"unit\":\"celsius\"}"
}

Rate Limit Error
Priority: 3

Matches requests whose model name matches my-model* with temperature 0 and returns a 429 rate limit error. Useful for testing error handling and retry logic.

Match Rules

model: my-model*
temperature: 0

Response (Error)

{
  "statusCode": 429,
  "errorMessage": "Rate limit exceeded. Please retry after 20s.",
  "errorType": "rate_limit_exceeded"
}