
SSE Explained: The Tech Behind AI Typewriter Streaming, and How It Differs from WebSocket

Original Author: bhnw | Released on 2026-04-09 09:48

How Does the AI Typewriter Effect Work?

Anyone who has used ChatGPT or similar AI assistants has seen it: the response doesn't appear all at once — it streams out character by character, as if someone is typing in real time.

The technology behind this effect is called SSE (Server-Sent Events). Understanding SSE not only explains how the AI typewriter effect works, but opens the door to building real-time push notifications, live dashboards, stock tickers, and more in your own projects.


What Is SSE?

SSE is a one-way, real-time communication protocol built on HTTP. It allows a server to push data to a client proactively, without the client repeatedly polling for updates.

The mechanics are straightforward: the client makes a normal HTTP request, and instead of closing the connection after responding, the server keeps it open and continuously writes new data — pushing one message each time new content is available, until the stream ends.

Client                        Server
  |                              |
  |--- GET /stream ------------->|
  |                              |
  |<-- data: first message ------|
  |<-- data: second message -----|
  |<-- data: third message ------|
  |           ...                |
  |<-- data: [DONE] -------------|  connection closes

The SSE Data Format

SSE uses a simple text format. Each message consists of one or more fields separated by newlines; messages are separated by blank lines:

data: This is the first message

data: This is the second message

event: update
data: {"type":"progress","value":80}
id: 42
retry: 3000
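
In code, parsing this format comes down to splitting on blank lines, then on newlines. A minimal Python sketch (a hand-rolled parser for illustration that assumes a "field: value" layout with a space after the colon, not a full spec-compliant implementation):

```python
def parse_sse(raw):
    """Parse an SSE stream body into a list of {field: value} events."""
    events = []
    for block in raw.strip().split("\n\n"):  # a blank line ends each event
        event = {}
        for line in block.split("\n"):
            field, _, value = line.partition(": ")
            # Per the format, repeated data fields are joined with newlines
            if field == "data" and "data" in event:
                event["data"] += "\n" + value
            else:
                event[field] = value
        events.append(event)
    return events

raw = 'data: first\n\nevent: update\ndata: {"value":80}\nid: 42\n\n'
for event in parse_sse(raw):
    print(event)
```

Real browser parsers also tolerate `field:value` without a space and ignore comment lines starting with `:`; this sketch skips those cases to keep the logic visible.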

Field reference:

Field   Description
data    Message content; required; may span multiple data lines
event   Custom event type; defaults to "message"
id      Message ID; on reconnect the browser resends it as the Last-Event-ID header so the server knows where to resume
retry   Reconnection delay in milliseconds
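
The id field is what makes resumption work: on reconnect the browser sends the last id it saw in a Last-Event-ID request header, and the server can skip everything the client already received. A minimal sketch of that server-side bookkeeping (the event list and function name are hypothetical, for illustration only):

```python
def events_since(events, last_event_id):
    """Return only the events the client has not yet seen.

    `events` is a list of (id, data) tuples held by the server;
    `last_event_id` is the Last-Event-ID header value, or None
    on a fresh connection.
    """
    if last_event_id is None:
        return events
    ids = [eid for eid, _ in events]
    if last_event_id not in ids:
        return events  # unknown id: replay everything
    start = ids.index(last_event_id) + 1
    return events[start:]

events = [("1", "a"), ("2", "b"), ("3", "c")]
print(events_since(events, "1"))  # only the events after id "1"
```

A real server would typically back this with a ring buffer or log rather than an in-memory list, but the resume logic is the same.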

The Content-Type must be text/event-stream — otherwise the browser won't parse the response as SSE.


What Does AI Streaming Output Look Like in SSE?

The streaming APIs from OpenAI, Qwen, and similar models use exactly this SSE format. Each generated token is pushed immediately as it's produced:

data: {"id":"1","choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}

data: {"id":"2","choices":[{"delta":{"content":" there"},"finish_reason":null}]}

data: {"id":"3","choices":[{"delta":{"content":"!"},"finish_reason":null}]}

data: [DONE]

The frontend receives each data event, parses the JSON, extracts choices[0].delta.content, and appends it to the UI — producing the typewriter effect.
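
That accumulation step is easy to mirror in Python. A small sketch, assuming each data payload has the chunk shape shown above:

```python
import json

def accumulate(sse_lines):
    """Rebuild the full reply from streamed delta chunks."""
    text = ""
    for line in sse_lines:
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Each chunk carries only the newly generated piece of text
        text += chunk["choices"][0]["delta"].get("content", "")
    return text

lines = [
    'data: {"choices":[{"delta":{"content":"Hi"}}]}',
    'data: {"choices":[{"delta":{"content":" there"}}]}',
    'data: {"choices":[{"delta":{"content":"!"}}]}',
    "data: [DONE]",
]
print(accumulate(lines))  # Hi there!
```

The `.get("content", "")` matters: the final chunk before [DONE] often carries a finish_reason and an empty delta, so the content key may be absent.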

When debugging these AI streaming APIs, if you want to watch each SSE event in real time and automatically extract specific JSON fields, the SSE Online Debug & Parser tool lets you enter the endpoint URL and request headers (such as an Authorization token) and observe the stream directly in your browser.


Receiving SSE in the Browser

The browser provides the native EventSource API specifically for consuming SSE streams:

const source = new EventSource('/api/stream');

// Receive default message events
source.onmessage = (event) => {
  console.log('Received:', event.data);
};

// Receive custom event types
source.addEventListener('update', (event) => {
  const data = JSON.parse(event.data);
  console.log('Progress:', data.value);
});

// Error handling. Note: EventSource auto-reconnects on its own;
// calling close() here stops those retries, so only close when
// you actually want to give up on the stream.
source.onerror = (error) => {
  console.error('Connection error', error);
  source.close();
};

Limitation: EventSource only supports GET requests and cannot set custom headers. When you need POST or an Authorization header (as with most AI APIs), use fetch with ReadableStream instead:

const response = await fetch('/api/stream', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-token'
  },
  body: JSON.stringify({ prompt: 'Hello' })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

outer: while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // { stream: true } keeps multi-byte characters intact across chunks
  buffer += decoder.decode(value, { stream: true });
  // A network chunk can end mid-line; keep the partial tail for the next read
  const lines = buffer.split('\n');
  buffer = lines.pop();
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') break outer;
      console.log(JSON.parse(data));
    }
  }
}

Implementing SSE on the Backend

Python (FastAPI)

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import asyncio

app = FastAPI()

async def event_generator():
    for i in range(5):
        yield f"data: Message {i+1}\n\n"
        await asyncio.sleep(1)
    yield "data: [DONE]\n\n"

@app.get("/stream")
async def stream():
    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream"
    )

Node.js (Express)

app.get('/stream', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  let count = 0;
  const interval = setInterval(() => {
    res.write(`data: Message ${++count}\n\n`);
    if (count >= 5) {
      res.write('data: [DONE]\n\n');
      res.end();
      clearInterval(interval);
    }
  }, 1000);

  req.on('close', () => clearInterval(interval));
});

SSE vs WebSocket: Which One to Use?

This is the most common question. Both enable real-time communication, but they serve different needs:

Comparison                   SSE                                          WebSocket
Direction                    One-way (server → client)                    Bidirectional
Protocol                     Built on HTTP                                Separate protocol (ws://)
Auto-reconnect               Built into the browser                       Must implement manually
Implementation complexity    Simple                                       More involved
Browser support              All modern browsers                          All modern browsers
Best for                     Push notifications, AI streaming, live logs  Games, chat rooms, collaborative editing
Proxy / CDN compatibility    Easy (standard HTTP)                         Requires ws-aware proxy

The decision rule: if you only need the server to push data to the client, use SSE — it's simpler, works over standard HTTP, and requires no special server or proxy configuration. Only reach for WebSocket when clients also need to send messages back to the server in real time.

For most AI streaming output use cases, SSE is more than sufficient.


Common Pitfalls

1. Nginx proxy buffering kills real-time delivery

When SSE passes through Nginx with proxy buffering enabled, data accumulates until a buffer fills before being sent — destroying the real-time nature of the stream. Disable buffering in your config:

location /stream {
    proxy_pass http://backend;
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
    chunked_transfer_encoding on;
}

2. HTTP/2 behavior differences

HTTP/2 natively supports multiplexing, so SSE over HTTP/2 performs better and doesn't consume an extra TCP connection. That said, some older reverse proxies handle HTTP/2 + SSE poorly — if you hit issues, try forcing HTTP/1.1.

3. Browser connection limits

Browsers cap HTTP/1.1 connections per domain (typically 6). Multiple open SSE connections on the same page can exhaust this limit and block other requests. HTTP/2 multiplexes all streams over a single connection (with a default limit of around 100 concurrent streams), which removes this bottleneck in practice.
