
# Responses API

> OpenAI Responses API via Sudo

Sudo exposes the [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses/create) at the `v1/responses` endpoint.\
If you're already using OpenAI's Responses API, you can usually **swap the base URL, provide a Sudo API key, and you're done**.

**Base URL**: `https://sudoapp.dev/api`\
**Responses endpoint**: `/v1/responses`

<Note>
  This interface **only supports OpenAI models** (for example `gpt-4.1`, `gpt-4.1-mini`, and `o3-mini`).\
  Sudo **forwards requests directly to OpenAI's Responses API** with one-to-one parity, which makes it well suited to **agent-style workflows, tool-heavy orchestration, and newer OpenAI-specific features** that go beyond classic chat completions.
</Note>

**Rate limiting and error handling are managed by Sudo's AI router** and behave the same way as on the `/v1/chat/completions` endpoint.

***

## Requests

You `POST` to `/v1/responses` with the same shape OpenAI expects.\
Below is a condensed, non-exhaustive TypeScript definition that highlights the most commonly used fields:

<CodeGroup>
  ```typescript title="Request Schema" theme={null}
  export type ResponsesRequest = {
    /** Required: OpenAI model identifier, e.g. `gpt-4.1-mini` */
    model: string;

    /**
     * Core input to the model.
     * Can be a simple string, or an array of structured items (messages, tool calls, etc.).
     */
    input?: ResponsesInput;

    /** High-level instructions / system prompt for the model. */
    instructions?: string | ResponseInstructionItem[];

    /** Controls whether, and which, tools the model may call. */
    tool_choice?: 'none' | 'auto' | 'required' | ToolChoiceObject;

    /** Tools (functions, file search, code interpreter, etc.) available to the model. */
    tools?: Tool[];

    /** Common generation params */
    temperature?: number;
    top_p?: number;
    max_output_tokens?: number;

    /** Agent / orchestration features */
    reasoning?: unknown;
    service_tier?: 'auto' | 'default' | string;
    metadata?: Record<string, unknown>;
    store?: boolean;
    conversation?: string | ConversationConfig;
    previous_response_id?: string;

    /** Streaming */
    stream?: boolean;
    stream_options?: unknown;

    /** Other advanced options from OpenAI are also supported 1:1 */
    truncation?: string;
    max_tool_calls?: number;
    top_logprobs?: number;
    prompt_cache_key?: string;
  };

  export type ResponsesInput =
    | string
    | InputItem[];

  export type InputItem =
    | {
        role: 'user' | 'assistant' | 'system' | 'developer';
        content: MessageContent;
      }
    | StructuredItem
    | ItemReference;

  export type MessageContent =
    | string
    | Array<
        | { type: 'input_text'; text: string }
        | {
            type: 'input_image';
            image_url?: string;
            file_id?: string;
            detail?: 'low' | 'high' | 'auto';
          }
        | {
            type: 'input_file';
            file_id?: string;
            file_url?: string;
            filename?: string;
          }
        | {
            type: 'input_audio';
            input_audio: { data: string; format: string };
          }
      >;
  ```
</CodeGroup>

For a fully detailed schema (including the richer `Item` types for tools, MCP, computer use, etc.), see the official [OpenAI Responses API reference](https://platform.openai.com/docs/api-reference/responses/create).

### Headers

Only two headers are required:

| Header          | Value                   |
| --------------- | ----------------------- |
| `Authorization` | `Bearer <SUDO_API_KEY>` |
| `Content-Type`  | `application/json`      |
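
As a rough sketch, the base URL, endpoint path, and the two required headers can be assembled into a single request description. The helper name `buildResponsesCall` and its return shape are illustrative assumptions, not part of any Sudo SDK; only the URL and header values come from this page.

```typescript
// Base URL and path as documented on this page.
const SUDO_BASE_URL = "https://sudoapp.dev/api";
const RESPONSES_PATH = "/v1/responses";

// Hypothetical shape: a URL plus an init object that fetch() accepts.
type ResponsesCall = {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
};

// Builds the request without sending it, so it can be inspected or logged first.
function buildResponsesCall(
  apiKey: string,
  body: Record<string, unknown>
): ResponsesCall {
  return {
    url: `${SUDO_BASE_URL}${RESPONSES_PATH}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

In a runtime with a global `fetch`, the result could then be sent with `fetch(call.url, call.init)`.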

### Minimal JSON example

The smallest useful request can be just a model and a string input:

```json title="Minimal Request" theme={null}
{
  "model": "gpt-4.1-mini",
  "input": "Say hello from the Sudo Responses API."
}
```

For richer agent-style prompts you typically send an array of input items:

```json title="Structured Input Request" theme={null}
{
  "model": "gpt-4.1-mini",
  "input": [
    {
      "role": "system",
      "content": [
        {
          "type": "input_text",
          "text": "You are an assistant that answers in JSON."
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Return a todo list for moving apartments."
        }
      ]
    }
  ],
  "tool_choice": "auto",
  "tools": [
    {
      "type": "function",
      "name": "create_task",
      "description": "Create a task in the user's task manager",
      "parameters": {
        "type": "object",
        "properties": {
          "title": { "type": "string" },
          "due_date": { "type": "string", "format": "date-time" }
        },
        "required": ["title"]
      }
    }
  ]
}
```
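
If you build structured input in code, a small helper can cut down on the nesting. The sketch below only covers `input_text` parts for `system` and `user` messages; `textMessage` and `buildStructuredRequest` are hypothetical names chosen for illustration.

```typescript
type InputTextPart = { type: "input_text"; text: string };
type TextMessage = { role: "system" | "user"; content: InputTextPart[] };

// Wraps a plain string in the message shape used by the structured example above.
function textMessage(role: "system" | "user", text: string): TextMessage {
  return { role, content: [{ type: "input_text", text }] };
}

// Assembles a request body with one system message followed by one user message.
function buildStructuredRequest(model: string, system: string, user: string) {
  return {
    model,
    input: [textMessage("system", system), textMessage("user", user)],
  };
}
```

Fields such as `tools` and `tool_choice` can be spread into the returned object as needed.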

### Example: non‑streaming `curl`

```shell title="curl" theme={null}
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -d '{
    "model": "gpt-4.1-mini",
    "input": "Say hello from the Sudo Responses API."
  }'
```

***

## Responses

Sudo **returns the same schema as the OpenAI Responses API**, including the `response` object and `output` items.

```typescript title="Response Schema (simplified)" theme={null}
export type Response = {
  id: string;
  object: 'response';
  created_at: number;
  model: string;

  /** Overall status of the response */
  status:
    | 'in_progress'
    | 'completed'
    | 'failed'
    | 'cancelled'
    | 'incomplete';

  /** Generated output items (messages, tool calls, reasoning, etc.) */
  output?: Item[];

  /** Conversation information (if stored) */
  conversation?: { id: string };

  /** Token usage */
  usage?: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };

  /** Optional configuration echoes and metadata */
  metadata?: Record<string, unknown>;
  background?: boolean;
  instructions?: string | unknown[];
  max_output_tokens?: number;
  max_tool_calls?: number;
  parallel_tool_calls?: boolean;
  previous_response_id?: string;
  prompt?: unknown;
  prompt_cache_key?: string;
  reasoning?: unknown;
  safety_identifier?: string;
  service_tier?: string;
  temperature?: number;
  text?: unknown;
  tool_choice?: unknown;
  tools?: unknown[];
  truncation?: string;
  top_p?: number;
  top_logprobs?: number;
};

export type Item =
  | OutputMessageItem
  | ToolCallItem
  | ToolCallOutputItem
  | ReasoningItem
  | FileSearchItem
  | CodeInterpreterItem
  | MCPItem
  | CustomToolItem;

export type OutputMessageItem = {
  id: string;
  type: 'message';
  role: 'assistant';
  status: string;
  content: Array<
    | {
        type: 'output_text';
        text: string;
        annotations: unknown[];
        logprobs?: unknown[];
      }
    | { type: 'refusal'; refusal: string }
  >;
};
```

### Example response

```json title="Example Response" theme={null}
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1710000000,
  "model": "gpt-4.1-mini",
  "status": "completed",
  "output": [
    {
      "id": "msg_1",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello from the Sudo Responses API!",
          "annotations": [],
          "logprobs": null
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}
```
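
Because the generated text sits two levels deep (`output[].content[]`), most clients flatten it before use. The following is a minimal sketch of that step, assuming you only want `output_text` parts and are happy to skip refusals and non-message items; `collectOutputText` is an illustrative name, not an SDK function.

```typescript
type OutputPart =
  | { type: "output_text"; text: string }
  | { type: "refusal"; refusal: string };

// Only the fields this helper reads; real items carry more.
type OutputItemLike = { type: string; content?: OutputPart[] };
type ResponseLike = { status: string; output?: OutputItemLike[] };

// Concatenates every output_text part, in order, across all output items.
function collectOutputText(response: ResponseLike): string {
  const pieces: string[] = [];
  for (const item of response.output ?? []) {
    // Items without a content array (tool calls, reasoning, etc.) are skipped.
    for (const part of item.content ?? []) {
      if (part.type === "output_text") {
        pieces.push(part.text);
      }
    }
  }
  return pieces.join("");
}
```

Applied to the example response above, this returns `Hello from the Sudo Responses API!`.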

### Errors

Error envelopes are **identical** to those used by the rest of the Sudo AI router (including `/v1/chat/completions`), for example:

```json title="Error" theme={null}
{
  "error": {
    "message": "Invalid API key",
    "type": "api_error"
  }
}
```

You can expect OpenAI-originated errors (for example invalid model, bad tools schema) to be surfaced through the same structure with appropriate HTTP status codes.
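
One way to handle this on the client is to check a non-2xx body against the envelope shape before reading it. The sketch below assumes the envelope always carries string `message` and `type` fields, as in the example above; `isErrorEnvelope` and `describeFailure` are hypothetical helper names.

```typescript
type ErrorEnvelope = { error: { message: string; type: string } };

// Type guard: true only if the payload matches the envelope shown above.
function isErrorEnvelope(value: unknown): value is ErrorEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const error = (value as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  const { message, type } = error as { message?: unknown; type?: unknown };
  return typeof message === "string" && typeof type === "string";
}

// Produces a log-friendly description of a failed call.
function describeFailure(httpStatus: number, payload: unknown): string {
  return isErrorEnvelope(payload)
    ? `HTTP ${httpStatus} (${payload.error.type}): ${payload.error.message}`
    : `HTTP ${httpStatus}: unrecognized error body`;
}
```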

***

## Streaming

When you set `"stream": true` in the request body, Sudo will stream **Server‑Sent Events (SSE)** that mirror the [OpenAI Responses streaming format](https://platform.openai.com/docs/api-reference/responses-streaming/response).

Unlike `/v1/chat/completions` streaming, which sends **chunks containing `choices[].delta`**, the Responses API streams **typed events** such as:

* **`response.created`**: the response object has been created.
* **`response.output_text.delta`**: incremental text tokens being generated.
* **`response.output_text.done`**: the text of an output content part is complete; the event carries the final text.
* **`response.completed`**: the entire response has finished.
* **`error`**: an error occurred while generating the response.

The `data` payload of each SSE event is a JSON object with a top-level `type` field (the closing `[DONE]` sentinel is the exception). A typical stream might look like this:

```text title="SSE events (truncated)" theme={null}
data: {"type":"response.created","response":{"id":"resp_abc123","status":"in_progress", ...}}

data: {"type":"response.output_text.delta","delta":"Hello from the ", ...}

data: {"type":"response.output_text.done", ...}

data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed", ...}}

data: [DONE]
```

### Example: streaming `curl`

```shell title="curl (streaming)" theme={null}
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -N \
  -d '{
    "model": "gpt-4.1-mini",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "Write a short haiku about the Sudo Responses API."
          }
        ]
      }
    ],
    "stream": true
  }'
```

The SSE stream can be consumed using any EventSource/SSE client or your own HTTP streaming logic.
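
If you write your own streaming logic, the core of it is splitting the body into events and folding the text deltas together. The sketch below assumes each `response.output_text.delta` event carries its text as a plain string in `delta`, as in OpenAI's streaming reference, and that the stream may end with a `[DONE]` sentinel. It works on text that is already fully buffered; a real client would also need to handle events split across network chunks. `parseSseEvents` and `accumulateText` are illustrative names.

```typescript
type StreamEvent = { type: string; delta?: unknown; [key: string]: unknown };

// Parses buffered SSE text into JSON events. Events are separated by blank lines.
function parseSseEvents(raw: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    // Keep only data lines; strip the "data:" prefix and one optional space.
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data === "" || data === "[DONE]") continue;
    events.push(JSON.parse(data) as StreamEvent);
  }
  return events;
}

// Concatenates the string deltas of all output_text delta events.
function accumulateText(events: StreamEvent[]): string {
  let text = "";
  for (const event of events) {
    if (
      event.type === "response.output_text.delta" &&
      typeof event.delta === "string"
    ) {
      text += event.delta;
    }
  }
  return text;
}
```

Other event types pass through `parseSseEvents` untouched, so the same list can drive status updates (`response.created`, `response.completed`) or error handling.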

***

## Rate limiting & behavior

* **Rate limits**: The Responses API is subject to the same Sudo-side rate limiting as `/v1/chat/completions` (per‑app, per‑developer, and provider‑level throttling).
* **Error handling**: Errors are normalized and returned using the same `ErrorResponse` shape as the rest of the AI router.
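
Clients that expect to hit these limits often wrap calls in a retry loop. This page does not state which status code Sudo returns when throttling, so the sketch below assumes the common convention of HTTP 429 for rate limits and 5xx for transient upstream failures; the function names and default delays are likewise illustrative assumptions.

```typescript
// Assumption: 429 signals throttling and 5xx signals a transient failure.
function shouldRetry(httpStatus: number): boolean {
  return httpStatus === 429 || httpStatus >= 500;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
// `attempt` is zero-based.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

With the defaults, the first four waits would be 500, 1000, 2000, and 4000 ms, and every later wait would be 8000 ms.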

Because Sudo simply forwards to OpenAI's Responses API under the hood, you automatically benefit from the **latest OpenAI agent and orchestration features**, while keeping a single Sudo integration (API key, billing, and monitoring) across all your AI usage.