All Sudo models support server-sent event (SSE) streaming.
Set "stream": true in your request body and read the event stream incrementally as it arrives.

Example – Python

import json

import requests

url = "https://sudoapp.dev/api/v1/chat/completions"
headers = {
    "Authorization": "Bearer <SUDO_API_KEY>",
    "Content-Type": "application/json",
}

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": True,
}

buffer = ""
done = False
with requests.post(url, headers=headers, json=payload, stream=True) as r:
    r.raise_for_status()
    # Without an explicit encoding, iter_content may yield bytes instead of str
    r.encoding = "utf-8"
    for chunk in r.iter_content(chunk_size=1024, decode_unicode=True):
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            line = line.strip()
            if not line.startswith("data: "):
                # Blank lines and comments (begin with :) can be ignored per the SSE spec
                continue
            data = line[6:]
            if data == "[DONE]":
                done = True
                break
            message = json.loads(data)
            content_delta = message["choices"][0]["delta"].get("content")
            if content_delta:
                print(content_delta, end="", flush=True)
        if done:
            # The inner break only leaves the line loop; stop reading chunks too
            break

Example – TypeScript / browser fetch

const response = await fetch('https://sudoapp.dev/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${SUDO_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Tell me a joke' }],
    stream: true,
  }),
});

if (!response.ok) throw new Error(`Request failed with status ${response.status}`);

const reader = response.body?.getReader();
if (!reader) throw new Error('Streaming unsupported');

const decoder = new TextDecoder();
let buffer = '';

let finished = false;

while (!finished) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  while (true) {
    const nl = buffer.indexOf('\n');
    if (nl === -1) break;
    const line = buffer.slice(0, nl).trim();
    buffer = buffer.slice(nl + 1);
    if (!line.startsWith('data: ')) continue; // Ignore comments and blank lines
    const data = line.slice(6);
    if (data === '[DONE]') {
      finished = true; // Also stop the outer read loop
      break;
    }
    const parsed = JSON.parse(data);
    const delta = parsed.choices[0].delta.content;
    if (delta) {
      console.log(delta);
    }
  }
}

About the SSE payload

  1. Each line that begins with data: contains a JSON chunk.
  2. Lines beginning with : are comments and can be safely ignored.
  3. A final line data: [DONE] signals the end of the stream.
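
The three rules above can be sketched as a small generator that works on lines that have already been decoded and split. This is a minimal illustration, not the official client: the function name iter_content_deltas is hypothetical, and the sample lines (including the ": keep-alive" comment text) are made-up payloads trimmed to the fields the examples on this page read.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from already-decoded SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank separator line or SSE comment
        if not line.startswith("data: "):
            continue  # any other SSE field is not used here
        data = line[len("data: "):]
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta


# Made-up sample stream; real chunks carry more fields than shown here.
sample = [
    ": keep-alive",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
print("".join(iter_content_deltas(sample)))  # prints "Hello"
```

Keeping the parser separate from the HTTP call makes it easy to test against recorded lines without a network connection.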

Cancelling streams

To cancel a stream, abort or close the underlying HTTP connection (e.g. via AbortController in the browser). The request stops, but you may still be billed for the tokens of the full stream.
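
In Python, the same idea can be sketched by closing the response once a stop condition is met. The helper below is an assumption-laden illustration: collect_until and should_stop are hypothetical names, and the response argument is expected to behave like a requests response opened with stream=True (it only needs iter_lines() yielding bytes and close()).

```python
from typing import Callable, List


def collect_until(response, should_stop: Callable[[List[str]], bool]) -> List[str]:
    """Read SSE lines until should_stop returns True, then close the connection."""
    lines: List[str] = []
    try:
        for raw in response.iter_lines():
            if not raw:
                continue  # skip blank separator lines
            lines.append(raw.decode("utf-8"))
            if should_stop(lines):
                break  # stop reading early
    finally:
        # Closing the response drops the underlying HTTP connection,
        # which is what cancels the stream.
        response.close()
    return lines
```

With requests, this might be called as collect_until(requests.post(url, headers=headers, json=payload, stream=True), lambda seen: len(seen) >= 5) to stop after five lines. Closing the connection only stops delivery; the billing caveat above still applies.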