Time-to-first-token (TTFT)
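/teyem too ferst 'toh.kuhn tee.tee.ef.tee/ (noun)
The latency between sending a request and the arrival of the first output token in a streaming response. TTFT heavily influences perceived responsiveness, because it is the wait a user experiences before any output appears.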
“TTFT worsened after we changed model providers.”
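As a rough illustration of the definition, TTFT can be measured by starting a timer just before the request is issued and stopping it when the first token is yielded. The sketch below is a minimal example under stated assumptions, not any provider's API: `measure_ttft` and `fake_stream` are hypothetical names, and `fake_stream` stands in for a real streaming client by sleeping before it yields tokens.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    start_stream: Callable[[], Iterable[str]],
) -> Tuple[Optional[float], float, int]:
    """Time one streaming call.

    Returns (ttft_seconds, total_seconds, token_count).
    ttft_seconds is None if the stream produced no tokens.
    """
    start = time.perf_counter()  # monotonic clock, suited to measuring intervals
    ttft: Optional[float] = None
    count = 0
    for _token in start_stream():
        if ttft is None:
            # First token has arrived: this elapsed time is the TTFT.
            ttft = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    return ttft, total, count


def fake_stream(
    first_delay: float = 0.05,
    gap: float = 0.01,
    tokens: Tuple[str, ...] = ("Hello", ",", " world"),
) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(first_delay)  # simulated queueing and prompt processing
    for i, tok in enumerate(tokens):
        if i:
            time.sleep(gap)  # simulated delay between later tokens
        yield tok


if __name__ == "__main__":
    ttft, total, count = measure_ttft(lambda: fake_stream())
    print(f"TTFT: {ttft * 1000:.1f} ms, total: {total * 1000:.1f} ms, tokens: {count}")
```

Passing a callable rather than an already-open stream keeps request setup inside the timed window, so the measurement reflects what a client would actually wait for. In practice, individual TTFT samples like this are usually aggregated into percentiles such as P50, P95, and P99 rather than read one at a time.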
Related Observability terms
- Active observability
- AI observability
- Alert / threshold
- Dashboard
- Data flywheel
- Deep search
- Drift
- Error rate
- Feedback loop
- Logs
- Model drift
- Online evaluation (production scoring)
- P50 / P95 / P99 (Percentiles)
- Sampling rate
- Service Level Indicator (SLI)
- Service Level Objective (SLO)
- Token usage / cost tracking