LLM Mocking
Mock AI and LLM API endpoints using the OpenAI-compatible format. Test chat completions, streaming responses, and WebSocket connections without spending a single token.
What is LLM Mocking?
Simulate AI model responses for deterministic, cost-free development and testing.
dotMock can mock LLM and AI model APIs using the OpenAI-compatible format. Instead of calling a real model, you define fixtures -- each fixture contains match rules (conditions that incoming requests are tested against) and a response (the mock output to return when matched).
When a request arrives, the mock server evaluates your fixtures in order and returns the response from the first match. This first-match-wins approach gives you full control over which responses are served for specific prompts, models, or parameters.
Example: A fixture matching model: "your-model" with a message containing "Hello" returns a predefined assistant response -- instantly, with zero latency and zero cost.
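dotMock's exact fixture schema isn't reproduced here, but conceptually a fixture pairs match rules with a canned response. A hypothetical sketch of that pairing (the field names are illustrative assumptions, not dotMock's actual schema; the response body follows the OpenAI chat-completion shape):

```python
# Hypothetical fixture shape -- field names are illustrative, not dotMock's
# actual schema. A fixture pairs match rules with the response to return.
fixture = {
    "match": {
        "model": "your-model",          # exact match on the request's model field
        "message_contains": "Hello",    # substring match on message content
    },
    "response": {
        "id": "chatcmpl-mock-1",
        "object": "chat.completion",
        "model": "your-model",
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }],
    },
}
```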
Supported Endpoints
Four endpoints covering the full range of OpenAI-compatible interactions.
Chat Completions
POST /v1/chat/completions
OpenAI Chat Completions API. Supports both synchronous JSON responses and server-sent event (SSE) streaming.
Responses API
POST /v1/responses
OpenAI Responses API for structured output. Returns complete response objects with streaming support.
Responses WebSocket
WS /v1/responses
WebSocket transport for the Responses API. Push events to connected clients in real time.
Bidirectional Chat
WS /v1/chat
Full-duplex WebSocket chat. Send and receive messages over a persistent connection for real-time AI interactions.
Getting Started
Create your first LLM mock API in four steps.
Click "New API" and select "LLM API"
Choose the LLM API type from the creation dialog to get pre-configured OpenAI-compatible endpoints.
Enter a name and subdomain
Pick a descriptive name and a unique subdomain. Your mock will be available at your-subdomain.mock.rest.
Create your first fixture with match rules
Define which requests to intercept (by model, message content, or other fields) and what response to return.
Test with curl or the built-in test panel
Send a request to your mock endpoint and verify the response matches your fixture configuration.
Quick Test
Try your LLM mock with these curl examples.
Non-Streaming Request
curl -X POST https://your-api.mock.rest/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "your-model",
"messages": [
{"role": "user", "content": "Hello"}
]
}'
Streaming Request (SSE)
curl -X POST https://your-api.mock.rest/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "your-model",
"stream": true,
"messages": [
{"role": "user", "content": "Hello"}
]
}'
How It Works
The request flow from client to mock response.
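The evaluation step can be pictured as a simple first-match-wins loop over your fixtures. The sketch below is illustrative, not dotMock's implementation, and the `match` field names are assumptions:

```python
def matches(rules: dict, request: dict) -> bool:
    """Return True if every rule holds for the incoming request (illustrative)."""
    if "model" in rules and request.get("model") != rules["model"]:
        return False
    if "message_contains" in rules:
        needle = rules["message_contains"]
        if not any(needle in m.get("content", "")
                   for m in request.get("messages", [])):
            return False
    return True

def evaluate(fixtures: list[dict], request: dict) -> dict:
    """Walk fixtures top-to-bottom and return the first matching response."""
    for fixture in fixtures:
        if matches(fixture["match"], request):
            return fixture["response"]
    return {"error": {"message": "no fixture matched", "type": "mock_no_match"}}

fixtures = [
    {"match": {"model": "your-model", "message_contains": "Hello"},
     "response": {"choices": [{"message": {"role": "assistant", "content": "Hi!"}}]}},
    {"match": {},  # catch-all: matches anything, so it belongs last
     "response": {"choices": [{"message": {"role": "assistant", "content": "Default"}}]}},
]

request = {"model": "your-model",
           "messages": [{"role": "user", "content": "Hello"}]}
print(evaluate(fixtures, request)["choices"][0]["message"]["content"])  # Hi!
```

The catch-all fixture at the bottom is what makes ordering matter: if it sat first, no specific rule would ever be reached.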
Use Cases
Common scenarios where LLM mocking accelerates development.
Frontend Development
Build and iterate on AI-powered UIs without real API costs. Get instant, predictable responses while designing chat interfaces, streaming indicators, and error states.
CI/CD Testing
Run integration tests with deterministic LLM responses. Eliminate flaky tests caused by non-deterministic model output and avoid per-token charges in your test pipeline.
Demo & Prototyping
Showcase AI features in demos and prototypes before your backend integration is complete. Control exactly what the "AI" says for a polished, repeatable presentation.
Error Simulation
Test how your application handles timeouts, rate-limit (429) responses, malformed responses, and partial streaming failures -- all without waiting for real failures to occur.
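For instance, a rate-limit fixture could return an OpenAI-style error body with a 429 status. The `status` field name is an assumption about the fixture schema; the error envelope mirrors the shape OpenAI clients expect:

```python
# Hypothetical error fixture -- "status" is an assumed field name; the body
# mirrors the OpenAI error envelope so client retry logic can be exercised.
rate_limit_fixture = {
    "match": {"model": "rate-limited-model"},
    "status": 429,
    "response": {
        "error": {
            "message": "Rate limit reached. Please try again later.",
            "type": "rate_limit_error",
            "code": "rate_limit_exceeded",
        }
    },
}
```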
Tips
Drop-in replacement: Point your OpenAI SDK base URL to your dotMock subdomain -- no other code changes needed.
Fixture ordering matters: The mock server checks fixtures top-to-bottom and returns the first match. Put more specific rules above general catch-all rules.
Streaming fidelity: SSE mock responses use the same chunked event format as real OpenAI streams, so your client-side parsing code works unchanged.
WebSocket testing: Use the built-in test panel to send WebSocket frames and verify the event sequence your client will receive.
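Because the mock stream uses the standard `data:` event framing terminated by `data: [DONE]`, an OpenAI-style SSE parser can be exercised against it directly. A minimal stdlib-only sketch of such a parser (the chunk payloads shown mimic OpenAI's chat-completion chunk shape):

```python
import json

def collect_sse_text(raw_stream: str) -> str:
    """Assemble assistant text from OpenAI-style SSE chat-completion chunks."""
    parts = []
    for line in raw_stream.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # stream-termination sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # first delta may carry only the role
    return "".join(parts)

# Example stream in the shape a mock (or real) endpoint emits:
stream = (
    'data: {"choices": [{"delta": {"role": "assistant"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "Hello"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": ", world"}}]}\n\n'
    "data: [DONE]\n\n"
)
print(collect_sse_text(stream))  # Hello, world
```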