Streaming
You can request streaming responses so that tokens are sent as they are generated, instead of waiting for the full reply.
Enabling streaming
Set stream: true in the request body for chat completions.
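
As a minimal sketch of such a request body: only the stream field comes from this page. The model and messages fields, the ChatMessage shape, and the "example-model" name are assumptions about a typical chat completions API and may differ for the service you are calling.

```typescript
// Hypothetical message shape; adjust it to the API you are calling.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Serialize a chat completion request body with streaming enabled.
// `model` and `messages` are assumed field names; `stream: true` is the
// setting described in this section.
function buildStreamingBody(model: string, messages: ChatMessage[]): string {
  return JSON.stringify({ model, messages, stream: true });
}

// "example-model" is a placeholder, not a real model name.
const body = buildStreamingBody("example-model", [
  { role: "user", content: "Hello" },
]);
```

The resulting string would be sent as the body of the POST request, typically with a Content-Type: application/json header.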
Response format
The response is a stream of server-sent events (SSE). Each event is a line starting with data: followed by JSON.
- Content events: include partial content in the delta (e.g. choices[0].delta.content).
- End event: sent as data: [DONE] when the stream is finished.
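
The two event kinds above could be told apart with a small line parser. This is a sketch under assumptions: StreamEvent and parseEventLine are hypothetical names, and the payload is assumed to carry nothing beyond the choices[0].delta.content path mentioned above.

```typescript
// Result of parsing one line of the stream.
type StreamEvent =
  | { kind: "content"; text: string } // a content event carrying partial text
  | { kind: "done" }                  // the data: [DONE] end event
  | { kind: "ignored" };              // blank lines, or events without content

function parseEventLine(line: string): StreamEvent {
  const trimmed = line.trim();
  // Anything that is not a data line (e.g. a blank separator) is skipped.
  if (!trimmed.startsWith("data:")) return { kind: "ignored" };

  const payload = trimmed.slice("data:".length).trim();
  if (payload === "[DONE]") return { kind: "done" };

  // Content events are JSON; the partial text sits in choices[0].delta.content.
  const parsed = JSON.parse(payload);
  const text = parsed?.choices?.[0]?.delta?.content;
  return typeof text === "string"
    ? { kind: "content", text }
    : { kind: "ignored" };
}
```

A malformed JSON payload makes JSON.parse throw here; whether to skip such lines or abort the stream is a design choice left to the caller.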
Consuming the stream
- Use an HTTP client that supports streaming (e.g. fetch with response.body, or a library that handles SSE).
- Parse each line: strip the data: prefix and parse the JSON (or handle [DONE]).
- Append delta.content to build the full reply.
- When you see [DONE], close the stream.
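
The steps above can be sketched with the standard ReadableStream reader that fetch exposes as response.body. The function names contentDeltas and collectReply are hypothetical, and the sketch assumes newline-separated events. It buffers partial lines because a network chunk can end in the middle of an event.

```typescript
// Yield each delta.content string from a streamed response body.
async function* contentDeltas(
  body: ReadableStream<Uint8Array>,
): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      // Keep the last, possibly incomplete, line in the buffer.
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? "";

      for (const line of lines) {
        const trimmed = line.trim();
        if (!trimmed.startsWith("data:")) continue;
        const payload = trimmed.slice("data:".length).trim();
        if (payload === "[DONE]") return; // end event: stop reading
        const text = JSON.parse(payload)?.choices?.[0]?.delta?.content;
        if (typeof text === "string") yield text;
      }
    }
  } finally {
    // Runs on [DONE], at end of stream, and on errors: close the stream.
    await reader.cancel().catch(() => {});
  }
}

// Append every delta to build the full reply.
async function collectReply(body: ReadableStream<Uint8Array>): Promise<string> {
  let reply = "";
  for await (const piece of contentDeltas(body)) reply += piece;
  return reply;
}
```

With fetch, the stream passed in would be response.body from the streaming request, after checking that it is not null. Iterating contentDeltas directly, instead of calling collectReply, lets an application display each piece as it arrives.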