How Does the AI Typewriter Effect Work?
Anyone who has used ChatGPT or similar AI assistants has seen it: the response doesn't appear all at once — it streams out character by character, as if someone is typing in real time.
The technology behind this effect is called SSE (Server-Sent Events). Understanding SSE not only explains how the AI typewriter effect works, but opens the door to building real-time push notifications, live dashboards, stock tickers, and more in your own projects.
What Is SSE?
SSE is a one-way, real-time communication protocol built on HTTP. It allows a server to push data to a client proactively, without the client repeatedly polling for updates.
The mechanics are straightforward: the client makes a normal HTTP request, and instead of closing the connection after responding, the server keeps it open and continuously writes new data — pushing one message each time new content is available, until the stream ends.
Client Server
| |
|--- GET /stream ------------->|
| |
|<-- data: first message ------|
|<-- data: second message -----|
|<-- data: third message ------|
| ... |
|<-- data: [DONE] -------------| connection closes
The SSE Data Format
SSE uses a simple text format. Each message consists of one or more fields separated by newlines; messages are separated by blank lines:
data: This is the first message
data: This is the second message
event: update
data: {"type":"progress","value":80}
id: 42
retry: 3000
Field reference:

| Field | Description |
|---|---|
| data | Message content — required; can span multiple lines |
| event | Custom event type; defaults to message |
| id | Message ID — tells the server where to resume after a reconnect |
| retry | Reconnection delay in milliseconds |
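To make the format concrete, here is a minimal sketch of a parser for this text format. It is not a spec-compliant implementation (it skips comment lines and edge cases), but it shows how blank lines delimit messages and how the four fields are collected:

```javascript
// Minimal SSE text parser: splits a raw stream into messages
// (separated by blank lines) and collects the standard fields.
// Simplification: a spec-compliant parser also handles comment
// lines (": ...") and exact leading-space rules.
function parseSSE(raw) {
  return raw
    .split('\n\n')                      // blank line = end of message
    .filter((block) => block.trim())
    .map((block) => {
      const message = { event: 'message', data: [] };
      for (const line of block.split('\n')) {
        if (line.startsWith('data:')) message.data.push(line.slice(5).trim());
        else if (line.startsWith('event:')) message.event = line.slice(6).trim();
        else if (line.startsWith('id:')) message.id = line.slice(3).trim();
        else if (line.startsWith('retry:')) message.retry = Number(line.slice(6).trim());
      }
      message.data = message.data.join('\n'); // multi-line data joins with \n
      return message;
    });
}

const messages = parseSSE(
  'data: This is the first message\n\n' +
  'event: update\ndata: {"type":"progress","value":80}\nid: 42\nretry: 3000\n\n'
);
console.log(messages[1].event); // "update"
```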
The Content-Type must be text/event-stream — otherwise the browser won't parse the response as SSE.
What Does AI Streaming Output Look Like in SSE?
The streaming APIs from OpenAI, Qwen, and similar models use exactly this SSE format. Each generated token is pushed immediately as it's produced:
data: {"id":"1","choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}
data: {"id":"2","choices":[{"delta":{"content":" there"},"finish_reason":null}]}
data: {"id":"3","choices":[{"delta":{"content":"!"},"finish_reason":null}]}
data: [DONE]
The frontend receives each data event, parses the JSON, extracts choices[0].delta.content, and appends it to the UI — producing the typewriter effect.
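That receive-parse-append loop can be sketched as follows, assuming each incoming line is one data: event in the shape shown above:

```javascript
// Accumulate streamed tokens into the full reply, mirroring
// what a chat UI does on each incoming "data:" line.
function accumulate(lines) {
  let text = '';
  for (const line of lines) {
    const payload = line.slice('data: '.length);
    if (payload === '[DONE]') break;            // end-of-stream sentinel
    const chunk = JSON.parse(payload);
    const delta = chunk.choices[0].delta.content;
    if (delta) text += delta;                   // append token to the UI text
  }
  return text;
}

const lines = [
  'data: {"id":"1","choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}',
  'data: {"id":"2","choices":[{"delta":{"content":" there"},"finish_reason":null}]}',
  'data: {"id":"3","choices":[{"delta":{"content":"!"},"finish_reason":null}]}',
  'data: [DONE]'
];
console.log(accumulate(lines)); // "Hi there!"
```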
When debugging these AI streaming APIs, the SSE Online Debug & Parser tool lets you watch each SSE event in real time and automatically extract specific JSON fields: enter the endpoint URL and request headers (such as an Authorization token) and observe the stream directly in your browser.
Receiving SSE in the Browser
The browser provides the native EventSource API specifically for consuming SSE streams:
const source = new EventSource('/api/stream');

// Receive default "message" events
source.onmessage = (event) => {
  console.log('Received:', event.data);
};

// Receive custom event types
source.addEventListener('update', (event) => {
  const data = JSON.parse(event.data);
  console.log('Progress:', data.value);
});

// Error handling: note that EventSource reconnects automatically,
// so only call close() if you truly want to stop the stream
source.onerror = (error) => {
  console.error('Connection error', error);
  source.close();
};
Limitation: EventSource only supports GET requests and cannot set custom headers. When you need POST or an Authorization header (as with most AI APIs), use fetch with ReadableStream instead:
const response = await fetch('/api/stream', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-token'
  },
  body: JSON.stringify({ prompt: 'Hello' })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunks
  buffer += decoder.decode(value, { stream: true });
  // A chunk may end mid-line, so only process complete lines
  // and carry the unfinished remainder into the next iteration
  const lines = buffer.split('\n');
  buffer = lines.pop();
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') return;
      console.log(JSON.parse(data));
    }
  }
}
Implementing SSE on the Backend
Python (FastAPI)
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import asyncio

app = FastAPI()

async def event_generator():
    for i in range(5):
        yield f"data: Message {i+1}\n\n"
        await asyncio.sleep(1)
    yield "data: [DONE]\n\n"

@app.get("/stream")
async def stream():
    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream"
    )
Node.js (Express)
app.get('/stream', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  let count = 0;
  const interval = setInterval(() => {
    res.write(`data: Message ${++count}\n\n`);
    if (count >= 5) {
      res.write('data: [DONE]\n\n');
      res.end();
      clearInterval(interval);
    }
  }, 1000);

  req.on('close', () => clearInterval(interval));
});
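On the server side it's easy to get the frame format subtly wrong: a missing blank line silently merges two messages. One option is a small helper (sketched here with a hypothetical formatSSE name, not part of Express) that serializes the fields described earlier:

```javascript
// Serialize one SSE message; every frame must end with a blank line.
function formatSSE({ data, event, id, retry } = {}) {
  let frame = '';
  if (event) frame += `event: ${event}\n`;
  if (id !== undefined) frame += `id: ${id}\n`;
  if (retry !== undefined) frame += `retry: ${retry}\n`;
  // Split multi-line payloads into one "data:" field per line
  for (const line of String(data).split('\n')) {
    frame += `data: ${line}\n`;
  }
  return frame + '\n'; // trailing blank line terminates the message
}

// In the Express handler above, you would then write:
//   res.write(formatSSE({ event: 'update', id: 42, data: '{"value":80}' }));
console.log(JSON.stringify(formatSSE({ data: 'hello' })));
```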
SSE vs WebSocket: Which One to Use?
This is the most common question. Both enable real-time communication, but they serve different needs:
| Comparison | SSE | WebSocket |
|---|---|---|
| Direction | One-way (server → client) | Bidirectional |
| Protocol | Built on HTTP | Separate protocol (ws://) |
| Auto-reconnect | Built into the browser | Must implement manually |
| Implementation complexity | Simple | More involved |
| Browser support | All modern browsers | All modern browsers |
| Best for | Push notifications, AI streaming, live logs | Games, chat rooms, collaborative editing |
| Proxy / CDN compatibility | Easy (standard HTTP) | Requires ws-aware proxy |
The decision rule: if you only need the server to push data to the client, use SSE — it's simpler, works over standard HTTP, and requires no special server or proxy configuration. Only reach for WebSocket when clients also need to send messages back to the server in real time.
For most AI streaming output use cases, SSE is more than sufficient.
Common Pitfalls
1. Nginx proxy buffering kills real-time delivery
When SSE passes through Nginx with proxy buffering enabled, data accumulates until a buffer fills before being sent — destroying the real-time nature of the stream. Disable buffering in your config:
location /stream {
    proxy_pass http://backend;
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
    chunked_transfer_encoding on;
}
2. HTTP/2 behavior differences
HTTP/2 natively supports multiplexing, so SSE over HTTP/2 performs better and doesn't consume an extra TCP connection. That said, some older reverse proxies handle HTTP/2 + SSE poorly — if you hit issues, try forcing HTTP/1.1.
3. Browser connection limits
Browsers cap HTTP/1.1 connections per domain (typically 6). Multiple open SSE connections on the same page can exhaust this limit and block other requests. HTTP/2 eliminates this constraint entirely.
Article URL: https://toolshu.com/en/article/sse-explained
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.