It is reasonable to write non-functional requirements like this. But your partners are right that this isn't trivial to measure.
It is important to keep in mind how networks work. To make a REST API call, you may need to perform a DNS lookup, perform a TLS handshake, transmit the request, wait until the request is processed, and receive the response. All of those steps take time. Aside from the server's processing time, these delays depend largely on the network latency (colloquially, the “ping”) and to some degree on the available bandwidth.
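To make these phases concrete, here is a rough sketch that times each step of a single HTTPS request separately, using only the Python standard library. The phase boundaries are simplified, and a real measurement would use a dedicated HTTP client or benchmarking tool:

```python
import socket
import ssl
import time

def timed_request(host: str, path: str = "/") -> dict:
    """Time each phase of a simple HTTPS GET request separately."""
    timings = {}

    t0 = time.perf_counter()
    addr = socket.getaddrinfo(host, 443)[0][4][0]    # DNS lookup
    timings["dns"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    raw = socket.create_connection((addr, 443), timeout=10)  # TCP handshake
    timings["tcp_connect"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    ctx = ssl.create_default_context()
    conn = ctx.wrap_socket(raw, server_hostname=host)        # TLS handshake
    timings["tls_handshake"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    conn.sendall(request.encode())
    conn.recv(1)                           # block until the first response byte
    timings["time_to_first_byte"] = time.perf_counter() - t0

    conn.close()
    return timings

print(timed_request("example.com"))
```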
For the purpose of a requirement, we can now define a test scenario where those variables are fixed. E.g. we can assume:

- a pretty atrocious round-trip ping of 400ms,
- transmitted data of negligible size, and
- a connection that has already been established.

We can then phrase a requirement such as “On a network connection with a round-trip time of no more than 400ms, the time to first byte for any request is less than 450ms”. That leaves 50ms for the server software to do its work.
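Expressed as code, the budget in that requirement is a simple sum. This is a minimal sketch with made-up constant names, just to show how the 450ms decomposes:

```python
# Hypothetical constants taken from the assumptions above.
RTT_BUDGET_MS = 400.0      # assumed worst-case round-trip time
SERVER_BUDGET_MS = 50.0    # time left for the server to do its work

def meets_requirement(time_to_first_byte_ms: float) -> bool:
    """Time to first byte must stay within the network plus server budget."""
    return time_to_first_byte_ms < RTT_BUDGET_MS + SERVER_BUDGET_MS
```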
Alternatively, we can assume that the timings are taken directly in front of the target server, so that network effects are negligible. The response times could then be collected continuously and displayed on a dashboard. We could phrase this requirement as “Ignoring any network effects, the server responds to any request within 50ms”.
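As a sketch of what such server-side collection could look like, here is a hypothetical Python decorator that records the processing time of each request. The handler and the in-memory list are placeholders; a real service would typically use middleware and a metrics library instead:

```python
import time
from functools import wraps

response_times_ms: list[float] = []  # fed into a dashboard in a real setup

def timed(handler):
    """Record the server-side processing time of a request handler."""
    @wraps(handler)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            response_times_ms.append(elapsed_ms)
    return wrapper

@timed
def handle_request(payload: dict) -> dict:
    ...  # actual request processing would go here
    return {"status": "ok"}
```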
Sometimes, exceptional circumstances occur, for example packet loss. So it is not reasonable to demand that this requirement is met for every connection, just for the vast majority of connections. An average response time is not sufficient because many requests may still experience very long response times that get averaged away. Instead, it is common to use a percentile, typically the 95th or 99th. Since this is a statistical metric, you need a sampling window, for example 1 minute. We could now clarify the requirement:
“Over any time window of 60 seconds, the 95th percentile response time of the server must be below 50ms. This measurement ignores any network effects such as latency.”
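A minimal sketch of how such a check could be implemented, assuming response times are recorded server-side. The nearest-rank percentile used here is only one of several common definitions:

```python
import time
from collections import deque

WINDOW_SECONDS = 60        # sampling window from the requirement
PERCENTILE = 0.95          # 95th percentile
THRESHOLD_MS = 50.0        # required response time budget

samples: deque = deque()   # (timestamp, response_time_ms) pairs

def record(response_time_ms: float) -> None:
    """Store one response time and evict samples older than the window."""
    now = time.monotonic()
    samples.append((now, response_time_ms))
    while samples and samples[0][0] < now - WINDOW_SECONDS:
        samples.popleft()

def requirement_met() -> bool:
    """True if the 95th percentile over the current window is below 50ms."""
    times = sorted(t for _, t in samples)
    if not times:
        return True  # no traffic, nothing to violate
    # Nearest-rank percentile: the value at the 95% position of the sorted list.
    idx = min(int(PERCENTILE * len(times)), len(times) - 1)
    return times[idx] < THRESHOLD_MS
```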
Note that the number of requests over that time window must be high enough for the chosen percentile to be meaningful. E.g. if the window only sees 5 requests, then the 95th percentile response time is more or less the same as the worst response time, and that metric is very sensitive to outliers and thus unsuitable. You want more than a handful of events above the chosen percentile for this metric to be meaningful. To get this you can either increase the sampling window to collect more events or decrease the chosen percentile, but both actions reduce the sensitivity of the metric.
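A quick illustration of the small-window problem, with hypothetical numbers: with only 5 samples, the nearest-rank 95th percentile is simply the worst sample, so a single outlier dominates the metric.

```python
def percentile(data: list[float], p: float) -> float:
    """Nearest-rank percentile of a sample."""
    values = sorted(data)
    return values[min(int(p * len(values)), len(values) - 1)]

# Five hypothetical response times (ms) in one window; one request was slow.
window = [12.0, 14.0, 11.0, 13.0, 800.0]
print(percentile(window, 0.95))  # -> 800.0: the outlier alone sets the p95
```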
Specifying measurable response times is important because it makes it possible to determine whether there's degraded service or an outage. E.g. response times of 80 seconds are absolutely unacceptable for most use cases: while the service might technically be working, it does not satisfy your needs. When the service fails to meet the required response time, that is downtime that would count against your agreed-upon uptime / service level agreement.