Sudo exposes the OpenAI Responses API at the /v1/responses endpoint.
If you’re already using OpenAI’s Responses API you can usually swap the base URL, provide a Sudo API key, and you’re done.
Base URL: https://sudoapp.dev/api
Responses endpoint: /v1/responses
This interface supports only OpenAI models (for example gpt-4.1, gpt-4.1-mini, and o3-mini).
Sudo forwards requests directly to OpenAI’s Responses API with one-to-one parity, which makes it ideal for agent-style workflows, tools-heavy orchestration, and modern OpenAI-first features that go beyond classic chat completions.
Rate limiting and error handling are managed by Sudo’s AI router and behave exactly as they do for /v1/chat/completions.
Requests
You POST to /v1/responses with the same shape OpenAI expects.
Below is a condensed TypeScript definition (non-exhaustive) that highlights the most commonly used fields:
```typescript
export type ResponsesRequest = {
  /** Required: OpenAI model identifier, e.g. `gpt-4.1-mini` */
  model: string;
  /**
   * Core input to the model.
   * Can be a simple string, or an array of structured items (messages, tool calls, etc.).
   */
  input?: ResponsesInput;
  /** High-level instructions / system prompt for the model. */
  instructions?: string | ResponseInstructionItem[];
  /** Enable or disable automatic tool calling. */
  tool_choice?: 'none' | 'auto' | 'required' | ToolChoiceObject;
  /** Tools (functions, file search, code interpreter, etc.) available to the model. */
  tools?: Tool[];
  /** Common generation params */
  temperature?: number;
  top_p?: number;
  max_output_tokens?: number;
  /** Agent / orchestration features */
  reasoning?: unknown;
  service_tier?: 'auto' | 'default' | string;
  metadata?: Record<string, unknown>;
  store?: boolean;
  conversation?: string | ConversationConfig;
  previous_response_id?: string;
  /** Streaming */
  stream?: boolean;
  stream_options?: unknown;
  /** Other advanced options from OpenAI are also supported 1:1 */
  truncation?: string;
  max_tool_calls?: number;
  top_logprobs?: number;
  prompt_cache_key?: string;
};

export type ResponsesInput =
  | string
  | InputItem[];

export type InputItem =
  | {
      role: 'user' | 'assistant' | 'system';
      content: MessageContent;
    }
  | StructuredItem
  | ItemReference;

export type MessageContent =
  | string
  | Array<
      | { type: 'input_text'; text: string }
      | {
          type: 'input_image';
          image_url?: string;
          file_id?: string;
          detail?: 'low' | 'high' | 'auto';
        }
      | {
          type: 'input_file';
          file_id?: string;
          file_url?: string;
          filename?: string;
        }
      | {
          type: 'input_audio';
          input_audio: { data: string; format: string };
        }
    >;
```
For a fully detailed schema (including the richer Item types for tools, MCP, computer use, etc.), see the official OpenAI Responses API reference.
Only two headers are required:
| Header | Value |
|---|---|
| `Authorization` | `Bearer <SUDO_API_KEY>` |
| `Content-Type` | `application/json` |
Minimal JSON example
The smallest useful request can be just a model and a string input:
```json
{
  "model": "gpt-4.1-mini",
  "input": "Say hello from the Sudo Responses API."
}
```
For richer agent-style prompts you typically send an array of input items:
```json
{
  "model": "gpt-4.1-mini",
  "input": [
    {
      "role": "system",
      "content": [
        {
          "type": "input_text",
          "text": "You are an assistant that answers in JSON."
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Return a todo list for moving apartments."
        }
      ]
    }
  ],
  "tool_choice": "auto",
  "tools": [
    {
      "type": "function",
      "name": "create_task",
      "description": "Create a task in the user's task manager",
      "parameters": {
        "type": "object",
        "properties": {
          "title": { "type": "string" },
          "due_date": { "type": "string", "format": "date-time" }
        },
        "required": ["title"]
      }
    }
  ]
}
```
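When the model decides to call a function tool such as create_task, its output contains a function call item, and you send the result back as a follow-up request. The sketch below builds that follow-up; the field names (`function_call`, `call_id`, `function_call_output`, `previous_response_id`) follow OpenAI’s Responses API conventions, and `buildToolResultRequest` is an illustrative helper, not part of Sudo’s SDK:

```typescript
// Sketch: answering a function call from the model's output. Field names
// follow OpenAI's Responses API; verify against the official reference.

type FunctionCallItem = {
  type: "function_call";
  call_id: string;   // correlates the tool result with the call
  name: string;      // e.g. "create_task"
  arguments: string; // JSON-encoded arguments from the model
};

// Build the follow-up request that feeds the tool result back to the model.
function buildToolResultRequest(
  model: string,
  previousResponseId: string,
  call: FunctionCallItem,
  result: unknown,
) {
  return {
    model,
    previous_response_id: previousResponseId,
    input: [
      {
        type: "function_call_output",
        call_id: call.call_id,
        output: JSON.stringify(result),
      },
    ],
  };
}
```

You would run the tool yourself (here, creating the task), then POST the built request to /v1/responses to let the model continue.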
Example: non‑streaming curl
```bash
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -d '{
    "model": "gpt-4.1-mini",
    "input": "Say hello from the Sudo Responses API."
  }'
```
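The same request can be made from TypeScript with the built-in fetch (Node 18+). This is a minimal sketch; `buildResponsesRequest` and `createResponse` are illustrative helpers, not part of Sudo’s SDK:

```typescript
// Sketch: calling Sudo's Responses endpoint with plain fetch.

const SUDO_BASE_URL = "https://sudoapp.dev/api";

// Build the URL, headers, and body for a Responses API call.
function buildResponsesRequest(apiKey: string, body: object) {
  return {
    url: `${SUDO_BASE_URL}/v1/responses`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the parsed response object.
async function createResponse(apiKey: string, body: object) {
  const { url, init } = buildResponsesRequest(apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Responses API error: HTTP ${res.status}`);
  return res.json();
}

// Usage (requires a valid key):
// const r = await createResponse(process.env.SUDO_API_KEY!, {
//   model: "gpt-4.1-mini",
//   input: "Say hello from the Sudo Responses API.",
// });
```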
Responses
Sudo returns the same schema as the OpenAI Responses API, including the response object and output items.
Response Schema (simplified)
```typescript
export type Response = {
  id: string;
  object: 'response';
  created_at: number;
  model: string;
  /** Overall status of the response */
  status:
    | 'in_progress'
    | 'completed'
    | 'failed'
    | 'cancelled'
    | 'incomplete';
  /** Generated output items (messages, tool calls, reasoning, etc.) */
  output?: Item[];
  /** Conversation information (if stored) */
  conversation?: { id: string };
  /** Token usage */
  usage?: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  /** Optional configuration echoes and metadata */
  metadata?: Record<string, unknown>;
  background?: boolean;
  instructions?: string | unknown[];
  max_output_tokens?: number;
  max_tool_calls?: number;
  parallel_tool_calls?: boolean;
  previous_response_id?: string;
  prompt?: unknown;
  prompt_cache_key?: string;
  reasoning?: unknown;
  safety_identifier?: string;
  service_tier?: string;
  temperature?: number;
  text?: unknown;
  tool_choice?: unknown;
  tools?: unknown[];
  truncation?: string;
  top_p?: number;
  top_logprobs?: number;
};

export type Item =
  | OutputMessageItem
  | ToolCallItem
  | ToolCallOutputItem
  | ReasoningItem
  | FileSearchItem
  | CodeInterpreterItem
  | MCPItem
  | CustomToolItem;

export type OutputMessageItem = {
  id: string;
  type: 'output_message';
  role: 'assistant';
  status: string;
  content: Array<
    | {
        type: 'output_text';
        text: string;
        annotations: unknown[];
        logprobs?: unknown[];
      }
    | { type: 'refusal'; refusal: string }
  >;
};
```
Example response
```json
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1710000000,
  "model": "gpt-4.1-mini",
  "status": "completed",
  "output": [
    {
      "id": "msg_1",
      "type": "output_message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello from the Sudo Responses API!",
          "annotations": [],
          "logprobs": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 8,
    "total_tokens": 20
  }
}
```
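Given a response like the one above, a common task is collecting the assistant’s text from the output items, similar to the `output_text` convenience in OpenAI’s SDKs. A minimal sketch (`getOutputText` is an illustrative helper, not part of Sudo’s API):

```typescript
// Sketch: concatenate all output_text parts from a Responses payload.

type OutputContentPart =
  | { type: "output_text"; text: string }
  | { type: "refusal"; refusal: string };

type OutputItem = {
  type: string;
  content?: OutputContentPart[];
};

// Walk every output item and join the text parts, skipping tool calls,
// reasoning items, and refusals.
function getOutputText(response: { output?: OutputItem[] }): string {
  return (response.output ?? [])
    .flatMap((item) => item.content ?? [])
    .filter(
      (part): part is { type: "output_text"; text: string } =>
        part.type === "output_text",
    )
    .map((part) => part.text)
    .join("");
}
```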
Errors
Error envelopes are identical to the rest of the Sudo AI router (including /v1/chat/completions), for example:
```json
{
  "error": {
    "message": "Invalid API key",
    "type": "api_error"
  }
}
```
You can expect OpenAI-originated errors (for example invalid model, bad tools schema) to be surfaced through the same structure with appropriate HTTP status codes.
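A client can normalize failures by parsing that envelope, falling back to the HTTP status when the body is missing or malformed. A small sketch, assuming the { error: { message, type } } shape shown above (`parseErrorEnvelope` is an illustrative helper):

```typescript
// Sketch: normalize a failed call into the router's error envelope shape.

type SudoError = { message: string; type: string };

// Extract the error envelope if present; otherwise synthesize one
// from the HTTP status code.
function parseErrorEnvelope(status: number, body: unknown): SudoError {
  const err = (body as { error?: Partial<SudoError> } | null)?.error;
  return {
    message: err?.message ?? `HTTP ${status}`,
    type: err?.type ?? "api_error",
  };
}
```

Callers might branch on the HTTP status as well, for example treating 401 as an invalid key and 429 as rate limiting.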
Streaming
When you set "stream": true in the request body, Sudo will stream Server‑Sent Events (SSE) that mirror the OpenAI Responses streaming format.
Unlike /v1/chat/completions streaming, which sends delta chunks for choices[].delta, the Responses API streams typed events such as:
- `response.created`: the response object has been created.
- `response.output_text.delta`: incremental text tokens being generated.
- `response.output_text.done`: the final text segment for an output item.
- `response.completed`: the entire response has finished.
- `error`: an error occurred while generating the response.
Each SSE event is a JSON object with a top-level type field. Note that, unlike /v1/chat/completions, response.output_text.delta carries the incremental text directly in a string delta field. A complete stream for a short answer might look like:

```
data: {"type":"response.created","response":{"id":"resp_abc123","status":"in_progress", ...}}

data: {"type":"response.output_text.delta","item_id":"msg_1","output_index":0,"content_index":0,"delta":"Hello from the "}

data: {"type":"response.output_text.done","item_id":"msg_1", ...}

data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed", ...}}

data: [DONE]
```
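A consumer of this stream filters for the delta events and concatenates their text. The sketch below parses already-collected `data:` lines for simplicity; a real client would read them incrementally from the HTTP response body. It assumes `delta` carries the incremental text as a string, per OpenAI’s streaming format, and `accumulateText` is an illustrative helper:

```typescript
// Sketch: accumulate streamed text from Responses API SSE lines.

type StreamEvent = { type: string; delta?: string };

function accumulateText(sseLines: string[]): string {
  let text = "";
  for (const line of sseLines) {
    if (!line.startsWith("data: ")) continue; // skip blank/comment lines
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const event = JSON.parse(payload) as StreamEvent;
    if (event.type === "response.output_text.delta" && typeof event.delta === "string") {
      text += event.delta; // incremental text token
    }
    // Other event types (response.created, response.completed, error, ...)
    // would be dispatched here in a full client.
  }
  return text;
}
```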
Example: streaming curl
```bash
curl https://sudoapp.dev/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SUDO_API_KEY" \
  -N \
  -d '{
    "model": "gpt-4.1-mini",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "Write a short haiku about the Sudo Responses API."
          }
        ]
      }
    ],
    "stream": true
  }'
```
The SSE stream can be consumed with any SSE client library or your own HTTP streaming logic. Note that the browser’s EventSource API only issues GET requests, so this POST endpoint requires a fetch-based (or equivalent) stream reader.
Rate limiting & behavior
- Rate limits: The Responses API is subject to the same Sudo-side rate limiting as /v1/chat/completions (per‑app, per‑developer, and provider‑level throttling).
- Error handling: Errors are normalized and returned using the same ErrorResponse shape as the rest of the AI router.
Because Sudo simply forwards to OpenAI’s Responses API under the hood, you automatically benefit from the latest OpenAI agent and orchestration features, while keeping a single Sudo integration (API key, billing, and monitoring) across all your AI usage.