What is Latency?

Latency is the delay between a user’s action and the system returning a meaningful response. In web applications, it is affected by network travel time, DNS lookup, TLS handshake, server processing, database queries, and browser rendering.
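As a rough illustration of how these factors add up, the sketch below sums a set of hypothetical phase timings and reports each phase's share of the total. The numbers and the `latency_breakdown` helper are assumptions made up for this example, and the phases are treated as sequential and non-overlapping, which real page loads often are not.

```python
# Hypothetical timings in milliseconds for a single page load.
# These values are illustrative only, not measurements.
phases_ms = {
    "network_travel": 80,
    "dns_lookup": 20,
    "tls_handshake": 60,
    "server_processing": 40,
    "database_queries": 30,
    "browser_rendering": 120,
}


def latency_breakdown(phases):
    """Return total latency and each phase's share of it as a percentage."""
    total = sum(phases.values())
    shares = {name: round(100 * ms / total, 1) for name, ms in phases.items()}
    return total, shares


total_ms, share_pct = latency_breakdown(phases_ms)

# Backend time here means server processing plus database queries.
backend_ms = phases_ms["server_processing"] + phases_ms["database_queries"]
backend_share = round(100 * backend_ms / total_ms, 1)

print(f"total: {total_ms} ms, backend: {backend_ms} ms ({backend_share}%)")
for name, pct in share_pct.items():
    print(f"{name}: {pct}%")
```

With these made-up numbers, the backend accounts for 70 ms of a 350 ms total, so a server-side timer alone would see only a fifth of what the user waits for.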

Latency is usually more informative when tracked as percentiles such as p50, p95, and p99 than as an average alone, because an average can hide a slow tail. Even a small percentage of slow requests can hurt checkout, search, or admin workflows. Low latency is not only about a fast server; user geography, cache strategy, third-party services, and payload size all contribute to the final experience.
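The difference between an average and percentiles can be shown with a small sketch. It uses the nearest-rank definition of a percentile (other definitions interpolate between samples), and the sample data and the `percentile` helper are assumptions for illustration.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical request durations in milliseconds: 97 fast requests and 3 slow ones.
latencies_ms = [100] * 97 + [2000] * 3

summary = {
    "mean": mean(latencies_ms),
    "p50": percentile(latencies_ms, 50),
    "p95": percentile(latencies_ms, 95),
    "p99": percentile(latencies_ms, 99),
}
print(summary)
```

For this sample the mean is 157 ms and both p50 and p95 are 100 ms, while p99 is 2000 ms. Only the p99 figure reveals that some users waited two seconds.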

How It Is Reduced

Serving static assets from a CDN location close to the user
Reducing the number of network requests, removing unnecessary JavaScript, and shrinking large JSON responses
Improving database queries, indexes, and connection pool settings
Adding timeouts, retries, and caching around external API calls
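The last item in the list above can be sketched as a small wrapper around an external call. This is a minimal example under stated assumptions: `TTLCache`, `cached_call`, and `flaky_fetch` are hypothetical names, the `fetch` callable is assumed to accept a timeout in seconds and to raise `TimeoutError` or `ConnectionError` on failure, and the cache is a simple in-memory dictionary rather than a shared store.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self.clock():
            return None  # missing or expired
        return entry[1]

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl_seconds, value)


def cached_call(key, fetch, cache, timeout_s=2.0, retries=2,
                backoff_s=0.1, sleep=time.sleep):
    """Return a fresh cached value, else call fetch(timeout_s) with bounded retries.

    A value of None is treated as a cache miss in this sketch.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    cached = cache.get(key)
    if cached is not None:
        return cached
    last_error = None
    for attempt in range(retries + 1):
        try:
            value = fetch(timeout_s)
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < retries:
                sleep(backoff_s * 2 ** attempt)  # exponential backoff between attempts
        else:
            cache.set(key, value)
            return value
    raise last_error


# Simulated upstream that times out once, then succeeds.
calls = []


def flaky_fetch(timeout_s):
    calls.append(timeout_s)
    if len(calls) == 1:
        raise TimeoutError("simulated slow upstream")
    return {"rate": 1.08}


cache = TTLCache(ttl_seconds=30)
first = cached_call("fx:EURUSD", flaky_fetch, cache, sleep=lambda s: None)
second = cached_call("fx:EURUSD", flaky_fetch, cache, sleep=lambda s: None)
print(first, second, len(calls))
```

The first call retries past the simulated timeout and stores the result. The second call is answered from the cache without touching the upstream at all. Bounding both the timeout and the retry count matters, because unbounded retries against a slow dependency can add more latency than they save.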

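Latency should be measured end to end; backend duration alone does not represent the real user experience.

Related Terms

A CDN (content delivery network) serves content from locations close to users.
Edge computing moves work closer to users.
TTFB (time to first byte) measures how long the first byte of the server's response takes to arrive.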
