Realtime (WebSocket & SSE)
Telbox pushes live updates over a WebSocket and streams token-by-token AI output over Server-Sent Events (SSE).
Two channels, two jobs
- WebSocket (/v1/ws): persistent, bidirectional, one per device. Carries every server→client event (new messages, read receipts, presence, call signaling, AI progress).
- SSE: short-lived, unidirectional, one per request. Streams the output of a single AI call (translation, ask, assistant).
WebSocket
Connect
The server accepts the socket first, then authenticates the token. On
failure it closes with code 4401 and the error code as the
close reason. Pass the same access token you use for REST; refresh it before it
expires and reconnect.
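
A client has to tell an authentication close (4401) apart from an ordinary drop, because only the former calls for a fresh token before reconnecting. The following is a minimal sketch of that decision. `plan_reconnect` and `ReconnectPlan` are hypothetical names, and the reason strings used with it are placeholders rather than documented error codes.

```python
from dataclasses import dataclass
from typing import Optional

AUTH_FAILED_CLOSE_CODE = 4401  # documented: sent when token authentication fails


@dataclass(frozen=True)
class ReconnectPlan:
    refresh_token: bool        # obtain a new access token before reconnecting
    error_code: Optional[str]  # server error code taken from the close reason


def plan_reconnect(close_code: int, close_reason: str) -> ReconnectPlan:
    """Decide what to do after the socket closes (hypothetical helper)."""
    if close_code == AUTH_FAILED_CLOSE_CODE:
        # On auth failure the close reason carries the server's error code.
        return ReconnectPlan(refresh_token=True, error_code=close_reason or None)
    # Assumption: any other close is a transport drop, so the current
    # token is still usable and a plain reconnect is enough.
    return ReconnectPlan(refresh_token=False, error_code=None)
```

Whatever WebSocket library is in use, its close callback would pass the close code and reason into a function like this.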
Heartbeat
Send the text frame ping; the server replies pong. Use this to keep
intermediaries from idling the connection and to detect a half-open socket.
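
One way to act on the ping/pong exchange is a small monitor that decides when to send `ping` and when a missing `pong` means the socket is half-open. This is a sketch only: the 25 s interval and 10 s timeout are assumed values, not documented ones, and `HeartbeatMonitor` is a hypothetical name. The clock is injectable so the logic can be exercised without sleeping.

```python
import time
from typing import Callable, Optional


class HeartbeatMonitor:
    """Tracks ping/pong timing to spot a half-open socket (illustrative)."""

    def __init__(self, interval_s: float = 25.0, timeout_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_s = interval_s   # assumed idle time before pinging
        self.timeout_s = timeout_s     # assumed wait for the matching pong
        self._clock = clock
        self._ping_sent_at: Optional[float] = None  # outstanding ping, if any
        self._last_frame_at = clock()

    def should_ping(self) -> bool:
        """True when the socket has been idle for interval_s and no ping is pending."""
        idle = self._clock() - self._last_frame_at
        return self._ping_sent_at is None and idle >= self.interval_s

    def on_ping_sent(self) -> None:
        self._ping_sent_at = self._clock()

    def on_frame(self, text: str) -> bool:
        """Feed every incoming text frame; returns True if it was a pong."""
        self._last_frame_at = self._clock()
        if text == "pong":
            self._ping_sent_at = None
            return True
        return False

    def is_half_open(self) -> bool:
        """True when a ping has gone unanswered for longer than timeout_s."""
        if self._ping_sent_at is None:
            return False
        return self._clock() - self._ping_sent_at > self.timeout_s
```

A send loop would call `should_ping()` periodically, send the text frame `ping` when it returns true, and tear down and reconnect once `is_half_open()` reports true.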
Event envelope
Every server→client message is JSON with this outer shape:
{
"type": "message.new",
"thread_id": "uuid",
"message_id": "uuid (optional)",
"workspace_id": "uuid (optional)",
"payload": { }
}
Switch on type; route on thread_id. UUID fields are bare strings.
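
The envelope maps naturally onto a small typed record. The sketch below parses one text frame into such a record. `Event` and `parse_event` are hypothetical names, and treating `thread_id` as possibly absent is a defensive assumption rather than something the envelope definition states.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Event:
    type: str
    thread_id: Optional[str] = None     # assumed optional for non-thread events
    message_id: Optional[str] = None
    workspace_id: Optional[str] = None
    payload: Dict[str, Any] = field(default_factory=dict)


def parse_event(frame: str) -> Optional[Event]:
    """Parse one text frame; returns None for non-envelope frames such as "pong"."""
    try:
        data = json.loads(frame)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not isinstance(data.get("type"), str):
        return None
    return Event(
        type=data["type"],
        thread_id=data.get("thread_id"),      # UUIDs stay bare strings
        message_id=data.get("message_id"),
        workspace_id=data.get("workspace_id"),
        payload=data.get("payload") or {},
    )
```

Returning `None` instead of raising keeps heartbeat replies and malformed frames out of the event path.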
Event catalog
| type | Direction | Meaning |
|---|---|---|
| message.new | S→C | A new message arrived in a thread (E2E-encrypted payload). |
| message.read | S→sender | Recipient marked your message(s) read. |
| message_reactions | S→C | Reaction set on a message changed. |
| message_deleted | S→C | A message was deleted. |
| message.transcript_edited | S→C | A voice-note transcript was edited. |
| presence | S→C | Typing / recording indicator for a thread. |
| recording_progress | S→C | Live progress while a voice note records/uploads. |
| ai.stage_done | S→C | One AI pipeline stage finished (triage/extract/summarize/replies). |
| ai.processed | S→C | The AI pipeline for a message completed. |
| call.invited | S→C | Incoming call invitation. |
| call.state_changed | S→C | Call transitioned (ringing→active→ended, etc.). |
| call.signal | S→C | WaveLink SDP/ICE handshake signal for a call. |
| call.relay | S→C | Relay coordination for a call. |
| handoff.event | S→sender | Live view / screenshot receipt on a hand-off message. |
Additional state pushes you may receive: subscription.lapsed,
voice_clone.status_changed, task_due, invite_redeemed. Treat any unknown
type as a no-op (forward-compatible) — the apps do.
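
The forward-compatible rule above can be enforced in one place with a handler registry that silently skips types it does not know. A minimal sketch, with `EventRouter` as a hypothetical name:

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], None]


class EventRouter:
    """Dispatches decoded envelopes by "type"; unknown types are a no-op."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type] = handler

    def dispatch(self, event: Dict[str, Any]) -> bool:
        """Run the handler for the event's type; returns False when ignored."""
        event_type = event.get("type")
        if not isinstance(event_type, str):
            return False
        handler = self._handlers.get(event_type)
        if handler is None:
            return False  # unknown type: ignore rather than raise
        handler(event)
        return True
```

Returning a boolean rather than raising lets a client log ignored types for diagnostics without breaking when the server adds new events.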
Payloads are encrypted
message.new carries the E2E-encrypted message (encapsulated_key_b64,
iv, ciphertext, aad, sender_eph_public). Decrypt client-side with
your device keys. The server never sees plaintext on the transport path.
Reconnect & replay
The WebSocket is best-effort while connected; durable delivery is guaranteed by an outbox with at-least-once semantics and idempotency keys. After any disconnect:
- Reconnect the socket.
- Catch up on anything missed via REST: GET /v1/messages?after=<last_seen_ts>.
Because events are idempotent by design (e.g. message.new is keyed by
message_id), replaying a window you partially saw is safe — dedupe on the id.
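
Deduplication on the id can be sketched as a filter that live events and replayed REST results both pass through. `MessageDeduper` and `fresh_events` are hypothetical names, and the in-memory set is an assumption for brevity; a real client would likely bound it or persist it alongside its message store.

```python
from typing import Any, Dict, Iterable, List, Set


class MessageDeduper:
    """Lets each message_id through at most once."""

    def __init__(self) -> None:
        self._seen_ids: Set[str] = set()

    def accept(self, event: Dict[str, Any]) -> bool:
        message_id = event.get("message_id")
        if message_id is None:
            return True   # nothing to key on; leave the decision to the caller
        if message_id in self._seen_ids:
            return False  # already applied, e.g. seen live before the disconnect
        self._seen_ids.add(message_id)
        return True


def fresh_events(deduper: MessageDeduper,
                 events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only the events that have not been applied yet, in order."""
    return [event for event in events if deduper.accept(event)]
```

Feeding both the socket stream and the catch-up response through the same deduper makes an overlapping replay window harmless.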
Server-Sent Events (SSE)
SSE responses set Content-Type: text/event-stream, Cache-Control: no-cache,
and X-Accel-Buffering: no. Each event is a data: line; the stream terminates
with a literal data: [DONE].
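
The framing described above can be consumed with a small line-oriented parser. The sketch below assumes each `data:` line holds one complete JSON object, which matches the examples on this page but is not stated as a guarantee. `iter_sse_events` is a hypothetical helper that works on any iterable of decoded text lines, however the HTTP client produces them.

```python
import json
from typing import Any, Dict, Iterable, Iterator

DONE_SENTINEL = "[DONE]"


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON object from each data: line until data: [DONE]."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        data = line[len("data:"):]
        if data.startswith(" "):
            data = data[1:]  # SSE strips one optional space after the colon
        if data == DONE_SENTINEL:
            return  # literal terminator: stop reading
        yield json.loads(data)
```

Anything after the `[DONE]` line is never read, so the caller can close the response as soon as the generator finishes.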
Streaming translation: POST /v1/messages/{message_id}/translate-stream
Request: target_lang is an optional BCP-47 tag; it defaults to your
preferred_language. You must be a member of the message's thread.
Stream:
- {"text": "…"}: one translated fragment per token.
- {"error": "upstream_failed"}: terminal upstream failure (still 200, followed by [DONE]).
- Client errors (403, 404, 429) arrive as a normal HTTP error with no SSE body.
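
Because an upstream failure arrives inside a 200 response, the consumer has to inspect each decoded event and cannot rely on the HTTP status alone. A minimal sketch of that handling follows. `collect_translation` is a hypothetical helper that takes already-decoded event dicts, and joining fragments without a separator is an assumption.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_translation(
    events: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[str]]:
    """Return (translated_text, error_code); error_code is None on success."""
    parts: List[str] = []
    for event in events:
        if "error" in event:
            # Terminal in-stream failure: keep the partial text, report the code.
            return "".join(parts), event["error"]
        parts.append(event.get("text", ""))
    return "".join(parts), None
```

A UI that renders incrementally would append each fragment as it arrives instead of joining at the end; the error check stays the same.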
Streaming ask: POST /v1/ask/stream
Streams the answer to an Ask query token-by-token.
It may also emit a pending_action event carrying an HMAC-signed confirm card;
redeem it at POST /v1/agent/confirm-action. Terminates with [DONE].
Streaming assistant: POST /v1/threads/{thread_id}/assistant-invocations/stream
Drives the personal assistant with live tool-call chrome:
data: {"step": "tool_call_start", "name": "search_messages", "call_id": "c1"}
data: {"step": "tool_call_done", "name": "search_messages", "ok": true, "summary": "2 hit(s)"}
data: {"event": "result", "visibility": "private", "action": { }, "run": { }}
data: [DONE]
- {"step": …}: tool_call_start, tool_call_done, done.
- {"action": {…}}: a write-tool confirm card (same token path as Ask).
- {"event": "result", …}: terminal result with the finalized action/run.
- {"event": "error", "error": "<code>"}: mid-stream failure (terminal).
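
The four shapes overlap: a terminal result also carries an `action` key, so the order of checks matters when classifying events. The sketch below shows one possible ordering. The function names and the returned labels are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict


def classify_assistant_event(event: Dict[str, Any]) -> str:
    """Label one decoded assistant-stream event (labels are illustrative)."""
    kind = event.get("event")
    # Check the "event" discriminator first: a result also contains "action".
    if kind == "error":
        return "error"
    if kind == "result":
        return "result"
    if "step" in event:
        return "step"          # tool_call_start / tool_call_done / done
    if "action" in event:
        return "confirm_card"  # write-tool confirmation
    return "unknown"           # treat as a no-op, as with WebSocket events


def is_terminal(event: Dict[str, Any]) -> bool:
    """True for the two shapes documented as terminal."""
    return classify_assistant_event(event) in ("result", "error")
```

Unknown shapes are labelled rather than rejected, so a client built this way keeps working if new event kinds are added to the stream.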
transcribe-stream is removed
There is no streaming-transcription endpoint. Voice-note transcripts are
produced one-shot by the AI pipeline and delivered when ready — watch for
the ai.stage_done / ai.processed WebSocket events, or poll
GET /v1/messages/{id}.