Rate Limits & Quotas¶
Telbox enforces three independent throttles. They protect different things and return different signals:
- Per-IP rate limits on sensitive unauthenticated/abuse-prone routes.
- Per-actor token buckets (e.g. OTP per phone number, media bytes per day).
- AI per-feature monthly quotas + a dollar backstop on the free tier.
All three signal throttling with `429`, except the AI dollar backstop, which returns `402`. Honour the
`Retry-After` header where present, and back off. See Errors.
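As a minimal sketch of honouring `Retry-After`, the helper below turns the header value into a wait time in seconds. This section does not say which form Telbox sends, so the sketch accepts both forms that HTTP allows (delta-seconds and an HTTP-date). The function name `retry_after_seconds` is hypothetical and not part of any Telbox SDK.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None if the header is absent or unparseable."""
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "Retry-After: 120".
        return float(value)
    try:
        # HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry right away.
    return max(0.0, (when - current).total_seconds())
```

When the helper returns `None`, the caller falls back to its own backoff policy.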
Per-IP rate limits¶
| Endpoint(s) | Limit |
|---|---|
| `POST /v1/auth/phone/start` | 10 / hour |
| `POST /v1/auth/phone/verify` | 30 / hour |
| `POST /v1/auth/sign-up` | 5 / hour |
| `POST /v1/contacts/match` | 30 / hour |
| invite tracking | 200 / day |
| `POST /v1/moderation/*` | 20 / hour |
| `POST /v1/forensics/verify` | 30 / hour |
| `GET /i/{ref}` (invite landing) | 60 / minute |
| anonymous voice drop | 5 / hour |
Per-actor token buckets¶
Beyond per-IP limits, abuse-prone actions are metered per actor:
- OTP per phone number: over-requesting a code or verification returns `429` with the code `otp_rate_limited` and a `Retry-After` header.
- Media upload: capped at 1 GiB/day per user.
- Call signaling and segment uploads have their own buckets.
AI quotas (free tier)¶
The transport is always free and unmetered. AI processing on the free tier is metered by per-feature monthly quotas (a token bucket with burst capacity), plus a silent $0.10/month dollar backstop. Paid tiers are uncapped on both.
| Feature | Burst | ~Monthly | 429 error code |
|---|---|---|---|
| `ai.voice_note` (transcribe + understand) | 20 | ~100 | `ai_quota_exceeded_voice_note` |
| `ai.ask` (RAG over your history) | 2 | ~5 | `ai_quota_exceeded_ask` |
| `ai.thread_assistant` (per turn) | 3 | ~10 | `ai_quota_exceeded_thread_assistant` |
| `ai.insights_rerun` (re-run insights) | 1 | ~3 | `ai_quota_exceeded_insights_rerun` |
| `ai.brief` (reserved) | 1 | ~7 | `ai_quota_exceeded_brief` |
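To make "a token bucket with burst capacity" concrete, here is an illustrative model of one feature bucket. It is a sketch under stated assumptions, not Telbox's server logic: the source gives only the burst and approximate monthly figures, so the continuous linear refill and the 30-day month are assumptions, and the class name `FeatureBucket` is hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


@dataclass
class FeatureBucket:
    capacity: float           # "Burst" column
    monthly_allowance: float  # "~Monthly" column
    tokens: float             # calls currently available
    updated_at: float = 0.0   # timestamp (seconds) of the last refill

    def _rate(self) -> float:
        # Assumption: tokens trickle back evenly across the month.
        return self.monthly_allowance / SECONDS_PER_MONTH

    def refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated_at)
        # Refill never exceeds the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self._rate())
        self.updated_at = now

    def try_consume(self, now: float) -> bool:
        """Spend one call if available; False models a 429 for this feature."""
        self.refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_next(self, now: float) -> float:
        """Rough analogue of a Retry-After value under this model."""
        self.refill(now)
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self._rate()
```

Under these assumptions, an `ai.ask` bucket (burst 2, ~5 per month) allows two calls back to back, then roughly one more every six days.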
When a feature bucket empties, the call returns `429` with the feature-specific
code and a `Retry-After` header. When the dollar backstop trips first, the call returns
`402 ai_budget_exceeded`. Either way, the user-facing options are to upgrade to a paid
tier (uncapped) or to wait for the bucket to refill.
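A client can separate these cases with a small classifier like the sketch below. It takes the HTTP status and the error code as already-extracted values, because the response body shape is documented elsewhere (see Errors). The function names and the returned labels are hypothetical.

```python
from typing import Optional

AI_QUOTA_PREFIX = "ai_quota_exceeded_"


def classify_throttle(status: int, code: Optional[str]) -> str:
    """Map a throttled response to a coarse client-side category."""
    if status == 402 and code == "ai_budget_exceeded":
        return "ai_budget"      # dollar backstop tripped
    if status == 429 and code is not None and code.startswith(AI_QUOTA_PREFIX):
        return "ai_quota"       # a per-feature bucket is empty
    if status == 429:
        return "rate_limited"   # per-IP or per-actor throttle
    return "other"


def ai_feature_from_code(code: str) -> Optional[str]:
    """Extract the feature suffix, e.g. 'ask' from 'ai_quota_exceeded_ask'."""
    if code.startswith(AI_QUOTA_PREFIX):
        return code[len(AI_QUOTA_PREFIX):]
    return None
```

A response classified as `"ai_quota"` or `"ai_budget"` would lead to an upgrade prompt or a wait for refill, while `"rate_limited"` would lead to a delayed retry.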
Reading remaining quota¶
For free-tier users, GET /v1/me includes an ai_feature_quotas object:
{
  "ai_feature_quotas": {
    "ask": { "remaining": 1, "capacity": 2, "monthly_allowance": 5 },
    "voice_note": { "remaining": 18, "capacity": 20, "monthly_allowance": 100 }
  }
}
The object is `null` for paid tiers (uncapped). Surface a meter from `remaining`; when
it drops to 20% or less, prompt the upgrade path.
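One way to derive such a meter is sketched below. It assumes the meter is `remaining` divided by `capacity`, which the source does not state; `monthly_allowance` would be another possible denominator. The function names are hypothetical, and `me` stands for the parsed JSON body of `GET /v1/me`.

```python
from typing import Optional


def quota_meter(me: dict, feature: str) -> Optional[float]:
    """Return remaining/capacity for a feature, or None when no quota applies."""
    quotas = me.get("ai_feature_quotas")
    if quotas is None:
        return None  # paid tier: uncapped, nothing to show
    entry = quotas.get(feature)
    if entry is None or entry["capacity"] <= 0:
        return None  # feature not reported in this payload
    # Assumption: the meter is measured against burst capacity.
    return entry["remaining"] / entry["capacity"]


def should_prompt_upgrade(me: dict, feature: str, threshold: float = 0.20) -> bool:
    """True once the meter is at or below the threshold (20% by default)."""
    fraction = quota_meter(me, feature)
    return fraction is not None and fraction <= threshold
```

With the sample payload above, the `ask` meter is 1 of 2 and the `voice_note` meter is 18 of 20, so neither triggers the prompt yet.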
Client guidance¶
- Respect `Retry-After`. It is authoritative; prefer it to your own backoff.
- Distinguish the AI codes. `429 ai_quota_exceeded_*` and `402 ai_budget_exceeded` are not transient infrastructure throttles; retrying immediately won't help. Route the user to upgrade or wait for refill.
- Pre-check before expensive UI. Read `ai_feature_quotas` to disable or warn before the user spends their last `ask`.
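The pre-check in the last point could look like the following sketch. The function name `precheck_feature` and the three returned states are hypothetical, and letting the call through when a feature is missing from the payload is an assumption, not documented behaviour.

```python
def precheck_feature(me: dict, feature: str) -> str:
    """Decide how the UI should treat an AI action before the user triggers it."""
    quotas = me.get("ai_feature_quotas")
    if quotas is None:
        return "ok"  # paid tier: uncapped
    entry = quotas.get(feature)
    if entry is None:
        # Assumption: an unreported feature is left for the server to decide.
        return "ok"
    remaining = entry["remaining"]
    if remaining < 1:
        return "disabled"   # the next call would hit the quota
    if remaining < 2:
        return "warn_last"  # the user is about to spend the last call
    return "ok"
```

A composer might grey out the action on `"disabled"` and show a confirmation on `"warn_last"`. The server remains the source of truth, so the client should still handle a `429` or `402` if the cached `GET /v1/me` data is stale.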