Sudo exposes the OpenAI Responses API at the /v1/responses endpoint.
If you’re already using OpenAI’s Responses API you can usually swap the base URL, provide a Sudo API key, and you’re done.
Base URL: https://sudoapp.dev/api
Responses endpoint: /v1/responses
This interface supports only OpenAI models (for example gpt-4.1, gpt-4.1-mini, and o3-mini).
Sudo forwards requests directly to OpenAI’s Responses API with one-to-one parity, which makes it ideal for agent-style workflows, tools-heavy orchestration, and modern OpenAI-first features that go beyond classic chat completions.
Rate limiting and error handling are managed by Sudo’s AI router and behave exactly as they do for /v1/chat/completions.
Requests
You POST to /v1/responses with the same shape OpenAI expects.
Below is a condensed TypeScript definition (non-exhaustive) that highlights the most commonly used fields:
```typescript
export type ResponsesRequest = {
  /** Required: OpenAI model identifier, e.g. `gpt-4.1-mini` */
  model: string;
  /**
   * Core input to the model.
   * Can be a simple string, or an array of structured items (messages, tool calls, etc.).
   */
  input?: ResponsesInput;
  /** High-level instructions / system prompt for the model. */
  instructions?: string | ResponseInstructionItem[];
  /** Enable or disable automatic tool calling. */
  tool_choice?: 'none' | 'auto' | 'required' | ToolChoiceObject;
  /** Tools (functions, file search, code interpreter, etc.) available to the model. */
  tools?: Tool[];
  /** Common generation params */
  temperature?: number;
  top_p?: number;
  max_output_tokens?: number;
  /** Agent / orchestration features */
  reasoning?: unknown;
  service_tier?: 'auto' | 'default' | string;
  metadata?: Record<string, unknown>;
  store?: boolean;
  conversation?: string | ConversationConfig;
  previous_response_id?: string;
  /** Streaming */
  stream?: boolean;
  stream_options?: unknown;
  /** Other advanced options from OpenAI are also supported 1:1 */
  truncation?: string;
  max_tool_calls?: number;
  top_logprobs?: number;
  prompt_cache_key?: string;
};

export type ResponsesInput =
  | string
  | InputItem[];

export type InputItem =
  | {
      role: 'user' | 'assistant' | 'system';
      content: MessageContent;
    }
  | StructuredItem
  | ItemReference;

export type MessageContent =
  | string
  | Array<
      | { type: 'input_text'; text: string }
      | {
          type: 'input_image';
          image_url?: string;
          file_id?: string;
          detail?: 'low' | 'high' | 'auto';
        }
      | {
          type: 'input_file';
          file_id?: string;
          file_url?: string;
          filename?: string;
        }
      | {
          type: 'input_audio';
          input_audio: { data: string; format: string };
        }
    >;
```
For a fully detailed schema (including the richer Item types for tools, MCP, computer use, etc.), see the official OpenAI Responses API reference.
Only two headers are required:
| Header | Value |
|---|---|
| `Authorization` | `Bearer <SUDO_API_KEY>` |
| `Content-Type` | `application/json` |
Minimal JSON example
The smallest useful request can be just a model and a string input:
```json
{
  "model": "gpt-4.1-mini",
  "input": "Say hello from the Sudo Responses API."
}
```
For richer agent-style prompts you typically send an array of input items:
```json
{
  "model": "gpt-4.1-mini",
  "input": [
    {
      "role": "system",
      "content": [
        {
          "type": "input_text",
          "text": "You are an assistant that answers in JSON."
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Return a todo list for moving apartments."
        }
      ]
    }
  ],
  "tool_choice": "auto",
  "tools": [
    {
      "type": "function",
      "name": "create_task",
      "description": "Create a task in the user's task manager",
      "parameters": {
        "type": "object",
        "properties": {
          "title": { "type": "string" },
          "due_date": { "type": "string", "format": "date-time" }
        },
        "required": ["title"]
      }
    }
  ]
}
```
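When the model decides to call a function tool such as create_task, its output contains a function call item, and you send the result back as a follow-up request. The sketch below builds that follow-up; the field names (`function_call`, `call_id`, `function_call_output`, `previous_response_id`) follow OpenAI’s Responses API conventions, and `buildToolResultRequest` is an illustrative helper, not part of Sudo’s SDK:

```typescript
// Sketch: answering a function call from the model's output. Field names
// follow OpenAI's Responses API; verify against the official reference.

type FunctionCallItem = {
  type: "function_call";
  call_id: string;   // correlates the tool result with the call
  name: string;      // e.g. "create_task"
  arguments: string; // JSON-encoded arguments from the model
};

// Build the follow-up request that feeds the tool result back to the model.
function buildToolResultRequest(
  model: string,
  previousResponseId: string,
  call: FunctionCallItem,
  result: unknown,
) {
  return {
    model,
    previous_response_id: previousResponseId,
    input: [
      {
        type: "function_call_output",
        call_id: call.call_id,
        output: JSON.stringify(result),
      },
    ],
  };
}
```

You would run the tool yourself (here, creating the task), then POST the built request to /v1/responses to let the model continue.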
Example: non‑streaming curl
```bash
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -d '{
    "model": "gpt-4.1-mini",
    "input": "Say hello from the Sudo Responses API."
  }'
```
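The same request can be made from TypeScript with the built-in fetch (Node 18+). This is a minimal sketch; `buildResponsesRequest` and `createResponse` are illustrative helpers, not part of Sudo’s SDK:

```typescript
// Sketch: calling Sudo's Responses endpoint with plain fetch.

const SUDO_BASE_URL = "https://sudoapp.dev/api";

// Build the URL, headers, and body for a Responses API call.
function buildResponsesRequest(apiKey: string, body: object) {
  return {
    url: `${SUDO_BASE_URL}/v1/responses`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the parsed response object.
async function createResponse(apiKey: string, body: object) {
  const { url, init } = buildResponsesRequest(apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Responses API error: HTTP ${res.status}`);
  return res.json();
}

// Usage (requires a valid key):
// const r = await createResponse(process.env.SUDO_API_KEY!, {
//   model: "gpt-4.1-mini",
//   input: "Say hello from the Sudo Responses API.",
// });
```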
Responses
Sudo returns the same schema as the OpenAI Responses API, including the response object and output items.
Response Schema (simplified)
```typescript
export type Response = {
  id: string;
  object: 'response';
  created_at: number;
  model: string;
  /** Overall status of the response */
  status:
    | 'in_progress'
    | 'completed'
    | 'failed'
    | 'cancelled'
    | 'incomplete';
  /** Generated output items (messages, tool calls, reasoning, etc.) */
  output?: Item[];
  /** Conversation information (if stored) */
  conversation?: { id: string };
  /** Token usage */
  usage?: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  /** Optional configuration echoes and metadata */
  metadata?: Record<string, unknown>;
  background?: boolean;
  instructions?: string | unknown[];
  max_output_tokens?: number;
  max_tool_calls?: number;
  parallel_tool_calls?: boolean;
  previous_response_id?: string;
  prompt?: unknown;
  prompt_cache_key?: string;
  reasoning?: unknown;
  safety_identifier?: string;
  service_tier?: string;
  temperature?: number;
  text?: unknown;
  tool_choice?: unknown;
  tools?: unknown[];
  truncation?: string;
  top_p?: number;
  top_logprobs?: number;
};

export type Item =
  | OutputMessageItem
  | ToolCallItem
  | ToolCallOutputItem
  | ReasoningItem
  | FileSearchItem
  | CodeInterpreterItem
  | MCPItem
  | CustomToolItem;

export type OutputMessageItem = {
  id: string;
  type: 'output_message';
  role: 'assistant';
  status: string;
  content: Array<
    | {
        type: 'output_text';
        text: string;
        annotations: unknown[];
        logprobs?: unknown[];
      }
    | { type: 'refusal'; refusal: string }
  >;
};
```
Example response
```json
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1710000000,
  "model": "gpt-4.1-mini",
  "status": "completed",
  "output": [
    {
      "id": "msg_1",
      "type": "output_message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello from the Sudo Responses API!",
          "annotations": [],
          "logprobs": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}
```
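Given a response like the one above, a common task is collecting the assistant’s text from the output items, similar to the `output_text` convenience in OpenAI’s SDKs. A minimal sketch (`getOutputText` is an illustrative helper, not part of Sudo’s API):

```typescript
// Sketch: concatenate all output_text parts from a Responses payload.

type OutputContentPart =
  | { type: "output_text"; text: string }
  | { type: "refusal"; refusal: string };

type OutputItem = {
  type: string;
  content?: OutputContentPart[];
};

// Walk every output item and join the text parts, skipping tool calls,
// reasoning items, and refusals.
function getOutputText(response: { output?: OutputItem[] }): string {
  return (response.output ?? [])
    .flatMap((item) => item.content ?? [])
    .filter(
      (part): part is { type: "output_text"; text: string } =>
        part.type === "output_text",
    )
    .map((part) => part.text)
    .join("");
}
```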
Errors
Error envelopes are identical to the rest of the Sudo AI router (including /v1/chat/completions), for example:
```json
{
  "error": {
    "message": "Invalid API key",
    "type": "api_error"
  }
}
```
You can expect OpenAI-originated errors (for example invalid model, bad tools schema) to be surfaced through the same structure with appropriate HTTP status codes.
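A client can normalize failures by parsing that envelope, falling back to the HTTP status when the body is missing or malformed. A small sketch, assuming the { error: { message, type } } shape shown above (`parseErrorEnvelope` is an illustrative helper):

```typescript
// Sketch: normalize a failed call into the router's error envelope shape.

type SudoError = { message: string; type: string };

// Extract the error envelope if present; otherwise synthesize one
// from the HTTP status code.
function parseErrorEnvelope(status: number, body: unknown): SudoError {
  const err = (body as { error?: Partial<SudoError> } | null)?.error;
  return {
    message: err?.message ?? `HTTP ${status}`,
    type: err?.type ?? "api_error",
  };
}
```

Callers might branch on the HTTP status as well, for example treating 401 as an invalid key and 429 as rate limiting.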
Streaming
When you set "stream": true in the request body, Sudo will stream Server‑Sent Events (SSE) that mirror the OpenAI Responses streaming format.
Unlike /v1/chat/completions streaming, which sends delta chunks for choices[].delta, the Responses API streams typed events such as:
- `response.created`: the response object has been created.
- `response.output_text.delta`: incremental text tokens being generated.
- `response.output_text.done`: the final text segment for an output item.
- `response.completed`: the entire response has finished.
- `error`: an error occurred while generating the response.
Each SSE event is a JSON object with a top-level type field. Note that, unlike /v1/chat/completions, response.output_text.delta carries the incremental text directly in a string delta field. A complete stream for a short answer might look like:

```
data: {"type":"response.created","response":{"id":"resp_abc123","status":"in_progress", ...}}

data: {"type":"response.output_text.delta","item_id":"msg_1","output_index":0,"content_index":0,"delta":"Hello from the "}

data: {"type":"response.output_text.done","item_id":"msg_1", ...}

data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed", ...}}

data: [DONE]
```
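A consumer of this stream filters for the delta events and concatenates their text. The sketch below parses already-collected `data:` lines for simplicity; a real client would read them incrementally from the HTTP response body. It assumes `delta` carries the incremental text as a string, per OpenAI’s streaming format, and `accumulateText` is an illustrative helper:

```typescript
// Sketch: accumulate streamed text from Responses API SSE lines.

type StreamEvent = { type: string; delta?: string };

function accumulateText(sseLines: string[]): string {
  let text = "";
  for (const line of sseLines) {
    if (!line.startsWith("data: ")) continue; // skip blank/comment lines
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const event = JSON.parse(payload) as StreamEvent;
    if (event.type === "response.output_text.delta" && typeof event.delta === "string") {
      text += event.delta; // incremental text token
    }
    // Other event types (response.created, response.completed, error, ...)
    // would be dispatched here in a full client.
  }
  return text;
}
```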
Example: streaming curl
```bash
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -N \
  -d '{
    "model": "gpt-4.1-mini",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "Write a short haiku about the Sudo Responses API."
          }
        ]
      }
    ],
    "stream": true
  }'
```
The SSE stream can be consumed with any SSE client library or your own HTTP streaming logic. Note that the browser’s EventSource API only issues GET requests, so this POST endpoint requires a fetch-based (or equivalent) stream reader.
Rate limiting & behavior
- Rate limits: The Responses API is subject to the same Sudo-side rate limiting as /v1/chat/completions (per‑app, per‑developer, and provider‑level throttling).
- Error handling: Errors are normalized and returned using the same ErrorResponse shape as the rest of the AI router.
Because Sudo simply forwards to OpenAI’s Responses API under the hood, you automatically benefit from the latest OpenAI agent and orchestration features, while keeping a single Sudo integration (API key, billing, and monitoring) across all your AI usage.