
Rate Limits & Quotas

Telbox enforces three independent throttles. They protect different things and return different signals:

  1. Per-IP rate limits on sensitive unauthenticated/abuse-prone routes.
  2. Per-actor token buckets (e.g. OTP per phone number, media bytes per day).
  3. Per-feature monthly AI quotas, plus a dollar backstop, on the free tier.

All three signal throttling with 429, except the AI dollar backstop, which returns 402. Honour the Retry-After header where present; otherwise back off before retrying. See Errors.

Per-IP rate limits

Endpoint(s)                       Limit
POST /v1/auth/phone/start         10 / hour
POST /v1/auth/phone/verify        30 / hour
POST /v1/auth/sign-up             5 / hour
POST /v1/contacts/match           30 / hour
invite tracking                   200 / day
POST /v1/moderation/*             20 / hour
POST /v1/forensics/verify         30 / hour
GET /i/{ref} (invite landing)     60 / minute
anonymous voice drop              5 / hour

Per-actor token buckets

Beyond per-IP limits, abuse-prone actions are metered per actor:

  • OTP per phone number — over-requesting a code or verification returns 429 with the code otp_rate_limited and a Retry-After header.
  • Media upload — capped at 1 GiB/day per user.
  • Call signaling and segment uploads have their own buckets.
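A client that receives otp_rate_limited needs to turn the Retry-After header into a wait time. The sketch below assumes the header follows the two standard HTTP forms (a number of seconds or an HTTP date); this page does not say which form Telbox sends, and the function name retry_after_seconds is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None if the header is absent or unparseable.

    Assumption: the value is either delta-seconds ("30") or an HTTP date
    ("Wed, 21 Oct 2015 07:28:00 GMT"), as in the generic HTTP semantics.
    """
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry immediately.
    return max(0.0, (when - current).total_seconds())
```

Returning None rather than a default lets the caller decide on its own fallback backoff when the header is missing.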

AI quotas (free tier)

The transport is always free and unmetered. AI processing on the free tier is metered by per-feature monthly quotas (a token bucket with burst capacity), plus a silent $0.10/month dollar backstop. Paid tiers are uncapped on both.

Feature                                   Burst   ~Monthly   429 error code
ai.voice_note (transcribe + understand)   20      ~100       ai_quota_exceeded_voice_note
ai.ask (RAG over your history)            2       ~5         ai_quota_exceeded_ask
ai.thread_assistant (per turn)            3       ~10        ai_quota_exceeded_thread_assistant
ai.insights_rerun (re-run insights)       1       ~3         ai_quota_exceeded_insights_rerun
ai.brief (reserved)                       1       ~7         ai_quota_exceeded_brief
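To make the relationship between Burst and ~Monthly concrete, here is a minimal token-bucket sketch. It is an illustration only: it assumes the bucket refills continuously at monthly allowance / 30 days and caps at the burst value, which this page does not confirm. The class and method names are hypothetical, not part of the Telbox API.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


@dataclass
class TokenBucket:
    capacity: float           # burst size, e.g. 2 for ai.ask
    monthly_allowance: float  # approximate refill per month, e.g. 5 for ai.ask
    tokens: float             # tokens currently available
    updated_at: float         # timestamp of the last refill, in seconds

    def refill(self, now: float) -> None:
        """Add tokens for the elapsed time, never exceeding capacity."""
        elapsed = max(0.0, now - self.updated_at)
        gained = elapsed * self.monthly_allowance / SECONDS_PER_MONTH
        self.tokens = min(self.capacity, self.tokens + gained)
        self.updated_at = now

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; False corresponds to a 429."""
        self.refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def seconds_until(self, cost: float = 1.0) -> float:
        """Wait until `cost` tokens exist, measured from the last refill."""
        missing = max(0.0, cost - self.tokens)
        return missing * SECONDS_PER_MONTH / self.monthly_allowance


# Under these assumptions, ai.ask allows 2 calls back to back, then roughly
# one more every 6 days (30 days / 5).
ask = TokenBucket(capacity=2, monthly_allowance=5, tokens=2, updated_at=0.0)
```

Under this model the burst value bounds how many calls can be made in quick succession, while the monthly figure only sets the refill pace, which is why the monthly column is approximate.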

When a feature bucket empties, the call returns 429 with the feature-specific code and a Retry-After header. When the dollar backstop trips first, the call returns 402 ai_budget_exceeded. Either way, the user-facing options are the same: upgrade to a paid tier (uncapped), or wait for the bucket to refill.

Reading remaining quota

For free-tier users, GET /v1/me includes an ai_feature_quotas object:

{
  "ai_feature_quotas": {
    "ask":        { "remaining": 1, "capacity": 2, "monthly_allowance": 5 },
    "voice_note": { "remaining": 18, "capacity": 20, "monthly_allowance": 100 }
  }
}

It is null for paid tiers (uncapped). Surface a meter from remaining; when it drops to ≤ 20 %, prompt the upgrade path.
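A meter can be derived from the GET /v1/me payload roughly as follows. This sketch assumes the 20 % threshold is measured as remaining / capacity; the text does not name the denominator, so adjust if it is meant against monthly_allowance. The function name quota_meters and the output keys are hypothetical.

```python
from typing import Optional

LOW_WATERMARK = 0.20  # the 20 % threshold; the denominator is an assumption


def quota_meters(me: dict) -> Optional[dict]:
    """Build per-feature meters from a parsed GET /v1/me response body.

    Returns None when ai_feature_quotas is null or missing, which the
    caller should treat as "uncapped, show no meter".
    """
    quotas = me.get("ai_feature_quotas")
    if quotas is None:
        return None
    meters = {}
    for feature, quota in quotas.items():
        capacity = quota["capacity"]
        remaining = quota["remaining"]
        # Guard against a zero capacity to avoid dividing by zero.
        fraction = remaining / capacity if capacity > 0 else 0.0
        meters[feature] = {
            "remaining": remaining,
            "fraction": fraction,
            "prompt_upgrade": fraction <= LOW_WATERMARK,
        }
    return meters
```

With the sample payload above, ask sits at 0.5 and voice_note at 0.9, so neither would trigger the upgrade prompt yet.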

Client guidance

  • Respect Retry-After. It is authoritative; prefer it to your own backoff.
  • Distinguish the AI codes. 429 ai_quota_exceeded_* and 402 ai_budget_exceeded are not transient infrastructure throttles — retrying immediately won't help. Route the user to upgrade or wait for refill.
  • Pre-check before expensive UI. Read ai_feature_quotas to disable or warn before the user spends their last ask.
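The guidance above can be sketched as a small decision helper. It assumes the caller has already extracted the HTTP status, the error code string, and the Retry-After value in seconds; how the code is carried in the response body is not specified on this page. Action, classify_throttle and next_delay are hypothetical names, and the backoff base and cap are illustrative defaults, not Telbox recommendations.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NOT_THROTTLED = "not_throttled"
    RETRY_LATER = "retry_later"          # transient throttle: wait, then retry
    UPGRADE_OR_WAIT = "upgrade_or_wait"  # AI quota or budget: do not auto-retry


def classify_throttle(status: int, code: Optional[str]) -> Action:
    """Map a response to the client action described in the guidance."""
    if status == 402 and code == "ai_budget_exceeded":
        return Action.UPGRADE_OR_WAIT
    if status == 429:
        if code is not None and code.startswith("ai_quota_exceeded_"):
            return Action.UPGRADE_OR_WAIT
        return Action.RETRY_LATER
    return Action.NOT_THROTTLED


def next_delay(retry_after: Optional[float], attempt: int,
               base: float = 1.0, cap: float = 60.0) -> float:
    """Prefer the server's Retry-After; otherwise use capped exponential backoff."""
    if retry_after is not None:
        return retry_after
    return min(cap, base * (2 ** attempt))
```

Keeping the AI codes in their own branch means a generic retry loop never hammers an endpoint that will keep refusing until the bucket refills or the user upgrades.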