Advanced — HTTP API — AI Autocomplete Docs

Patterns the SDKs handle for you — and how to reproduce them when you're calling the endpoint directly.

Token refresh strategy

Tokens expire. The expected pattern, regardless of language or platform:

Mint a token from your backend by calling /api/auth/token with the secret key. Cache the access_token alongside its expires_at.
Reuse the cached token until you're within 30 seconds of expires_at, then mint a fresh one.
If multiple requests need a token at the same time, deduplicate so only one mint is in flight — pending requests wait for that single result.
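
The three steps above can be sketched as a small cache with single-flight minting. This is a minimal illustration, not the SDKs' implementation: the `TokenCache` class, the injected `mint` function (standing in for your backend's call to /api/auth/token), and the assumption that expires_at is an epoch timestamp in milliseconds are all hypothetical.

```typescript
// Hypothetical token shape: only access_token and expires_at come from the text above.
interface Token {
  access_token: string;
  expires_at: number; // assumed to be epoch milliseconds
}

class TokenCache {
  private cached: Token | null = null;
  private inFlight: Promise<Token> | null = null;
  private readonly mint: () => Promise<Token>;
  private readonly now: () => number;
  private readonly refreshMarginMs: number;

  constructor(
    mint: () => Promise<Token>, // hypothetical: wraps the call to /api/auth/token
    now: () => number = () => Date.now(),
    refreshMarginMs: number = 30000, // refresh when within 30 seconds of expiry
  ) {
    this.mint = mint;
    this.now = now;
    this.refreshMarginMs = refreshMarginMs;
  }

  async get(forceRefresh: boolean = false): Promise<string> {
    const cached = this.cached;
    if (
      !forceRefresh &&
      cached !== null &&
      this.now() < cached.expires_at - this.refreshMarginMs
    ) {
      return cached.access_token; // still comfortably valid: reuse
    }
    // Single-flight: start a mint only if none is pending; otherwise join it.
    if (this.inFlight === null) {
      this.inFlight = this.mint();
    }
    const pending = this.inFlight;
    try {
      const token = await pending;
      this.cached = token;
      return token.access_token;
    } finally {
      // Clear the slot once, whether the mint succeeded or failed.
      if (this.inFlight === pending) this.inFlight = null;
    }
  }
}
```

Injecting the clock and the mint function keeps the cache independent of any particular HTTP client and makes the 30-second margin easy to test.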

401 retry

If a request fails with a 401 because its token was rejected mid-flight (clock skew, server-side revocation), force-refresh the token once and retry the original request. This mirrors what the SDKs do internally.
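
A minimal sketch of the retry, assuming a rejected token surfaces as HTTP status 401. The helper name `withAuthRetry` and its two callbacks are hypothetical: `getToken(forceRefresh)` stands for whatever returns your cached or freshly minted token, and `send` for the function that performs the actual request.

```typescript
// Minimal response shape for illustration; a fetch Response also has a numeric status.
interface HttpResponse {
  status: number;
}

async function withAuthRetry<R extends HttpResponse>(
  getToken: (forceRefresh: boolean) => Promise<string>,
  send: (accessToken: string) => Promise<R>,
): Promise<R> {
  const first = await send(await getToken(false));
  if (first.status !== 401) return first;
  // Token rejected: force exactly one refresh and retry the same request.
  // A second 401 is returned to the caller rather than retried again.
  return send(await getToken(true));
}
```

Capping the retry at one attempt avoids a refresh loop when the rejection has a cause that a new token cannot fix.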

Managing requests during fast typing

Fast keystrokes can fire requests faster than the API can respond. The SDKs handle three concerns in concert; HTTP-direct callers should too.

Debounce keystrokes: Wait briefly after the last keystroke before sending, long enough to collapse rapid bursts but short enough that the user doesn't perceive the delay. Tune to your UX; 100–200 ms works well for most autocomplete interfaces. Lean toward 100 ms for a snappier feel and toward 200 ms to trim request volume during fast typing.
Cancel in-flight: When a newer request is about to fire, abort any in-flight request first. Use whichever request-cancellation mechanism your platform provides, or a request-tag scheme that lets you ignore late completions.
Discard stale responses: Track the request_id (or request_at) you sent on the most recent call. When a response arrives, compare its meta.request_id against that value; if it doesn't match the latest, drop the response without applying it. This protects against network reordering even when cancellation works correctly.
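
The three concerns can be combined in one small controller. The sketch below is an illustration under stated assumptions, not the SDKs' code: `SuggestController`, the `send` callback, the `suggestions` field, and the locally generated `req-N` identifiers are all hypothetical. Only meta.request_id is taken from the text above, and `AbortController` stands in for whichever cancellation mechanism your platform provides.

```typescript
// Hypothetical response envelope: only meta.request_id comes from the text above.
interface SuggestResponse {
  meta: { request_id: string };
  suggestions: string[]; // assumed field name
}

type Send = (
  text: string,
  requestId: string,
  signal: AbortSignal,
) => Promise<SuggestResponse>;

class SuggestController {
  private timer: ReturnType<typeof setTimeout> | null = null;
  private inFlight: AbortController | null = null;
  private latestRequestId = "";
  private counter = 0;
  private readonly send: Send;
  private readonly onSuggestions: (suggestions: string[]) => void;
  private readonly debounceMs: number;
  private readonly onError: (err: unknown) => void;

  constructor(
    send: Send,
    onSuggestions: (suggestions: string[]) => void,
    debounceMs: number = 150,
    onError: (err: unknown) => void = () => {},
  ) {
    this.send = send;
    this.onSuggestions = onSuggestions;
    this.debounceMs = debounceMs;
    this.onError = onError;
  }

  // Debounce: restart the timer on every keystroke; only the last one fires.
  onKeystroke(text: string): void {
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = null;
      void this.fire(text);
    }, this.debounceMs);
  }

  private async fire(text: string): Promise<void> {
    // Cancel in-flight: abort the previous request before sending a new one.
    if (this.inFlight !== null) this.inFlight.abort();
    const controller = new AbortController();
    this.inFlight = controller;
    this.counter += 1;
    const requestId = `req-${this.counter}`;
    this.latestRequestId = requestId;
    try {
      const response = await this.send(text, requestId, controller.signal);
      // Discard stale: apply only the response that matches the latest request.
      if (response.meta.request_id !== this.latestRequestId) return;
      this.onSuggestions(response.suggestions);
    } catch (err) {
      // Errors from requests we aborted ourselves are expected; report the rest.
      if (!controller.signal.aborted) this.onError(err);
    } finally {
      if (this.inFlight === controller) this.inFlight = null;
    }
  }
}
```

In practice `send` would wrap your HTTP client and pass `signal` through to it. The request_id comparison is kept even though requests are aborted, because a response can still arrive from a transport that ignores or loses the cancellation.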

PII masking

By default, each completed_params[].text carries the literal value the user picked (e.g. "alex@example.com"). To keep that out of server logs — for compliance or privacy — omit the text field on entries you want to mask. The request still validates with the placeholder and type intact, but the server no longer sees the specific value, which can reduce suggestion quality. Use selectively where the privacy benefit outweighs the ranking cost.
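
As an illustration, masking can be applied as a small transform over completed_params before the request is sent. The `CompletedParam` shape below assumes the fields are literally named placeholder, type, and text, and selecting entries by `type` (with example values such as "email") is a hypothetical policy; choose whatever rule fits your data.

```typescript
// Assumed entry shape: field names follow the prose above; values are illustrative.
interface CompletedParam {
  placeholder: string;
  type: string;
  text?: string;
}

function maskCompletedParams(
  params: CompletedParam[],
  sensitiveTypes: ReadonlySet<string>,
): CompletedParam[] {
  return params.map((param) => {
    if (!sensitiveTypes.has(param.type)) return param;
    // Copy the entry and omit text entirely; placeholder and type stay intact.
    const masked: CompletedParam = { ...param };
    delete masked.text;
    return masked;
  });
}

// Usage: mask only the entries whose type is considered sensitive.
const maskedParams = maskCompletedParams(
  [
    { placeholder: "{email}", type: "email", text: "alex@example.com" },
    { placeholder: "{city}", type: "city", text: "Paris" },
  ],
  new Set(["email"]),
);
```

Here the email entry is sent without its text field while the city entry keeps its value, which limits the ranking cost to the fields that actually need protection.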

Session lifecycle

A session_id groups one user's autocomplete activity for analytics and event recording. Reuse the same value across keystrokes within one autocomplete interaction; mint a new one when:

The user commits — presses Enter, taps a Submit button, or any equivalent finalize action in your UI.
The user navigates away, closes the surface, or otherwise abandons the current input.
A long idle period elapses — there's no server-enforced TTL, so pick a window that matches your UX (e.g. 10–15 minutes since the last keystroke).
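
The rotation rules above can be sketched as a small tracker. The `SessionTracker` class, its method names, and the 10-minute default are hypothetical; `newId` is any generator of unique identifiers (a UUID function is a common choice), injected here along with the clock so the idle window can be tested.

```typescript
class SessionTracker {
  private sessionId: string;
  private lastActivity: number;
  private readonly newId: () => string;
  private readonly now: () => number;
  private readonly idleMs: number;

  constructor(
    newId: () => string,
    now: () => number = () => Date.now(),
    idleMs: number = 10 * 60 * 1000, // assumed idle window: 10 minutes
  ) {
    this.newId = newId;
    this.now = now;
    this.idleMs = idleMs;
    this.sessionId = newId();
    this.lastActivity = now();
  }

  // Call on every keystroke; returns the session_id to send with the request.
  touch(): string {
    const current = this.now();
    if (current - this.lastActivity > this.idleMs) {
      this.sessionId = this.newId(); // long idle period: start a new session
    }
    this.lastActivity = current;
    return this.sessionId;
  }

  // Call when the user commits, navigates away, or abandons the input.
  reset(): void {
    this.sessionId = this.newId();
    this.lastActivity = this.now();
  }
}
```

Because there is no server-enforced TTL, the idle check lives entirely on the caller's side, and the window is a parameter rather than a fixed constant.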

Choosing an auth mode

Pick the credential that matches where the call originates:

Public key — drop straight into a client application. Scoped and rate-limited; safe to ship in client bundles.
Secret key + access token — your server holds the secret, mints short-lived tokens, and the client only ever sees the token. Use for production: token leaks have a small blast radius.
Server-to-server — call /api/suggest directly with the secret key when there's no end-user client in the loop (back-office, batch jobs).