
During my load testing journey I have mostly been running spike tests on the SUT (System Under Test). We gradually dial up the number of users hitting the server, from 10 to 100 to 1000, and we log the process execution time separately from the response time. These are the kind of results we obtain for a spike of 1000 users at a single point in time:

response time: 0.1s    api execution time: 0.3s
response time: 0.11s   api execution time: 0.4s
response time: 0.15s   api execution time: 0.3s
response time: 0.7s    api execution time: 0.5s
response time: 0.9s    api execution time: 0.3s

... 500th request processed by the server
response time: 7s       api execution time: 0.6s
response time: 7.1s     api execution time: 0.5s
response time: 7.2s     api execution time: 0.4s

... 1000th request processed by the server
response time: 11s        api execution time: 0.6s
response time: 11.1s      api execution time: 0.5s

We speculate that the increasing response time is due to requests queuing up on the server.
As the observations above show, the execution time fluctuates around the same value, and it is this queuing of requests that gives us a seemingly bad response time.
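
For illustration, here is a minimal Python sketch of how such a spike could be generated; the URL is a placeholder (not our real SUT) and the server is assumed to log its own execution time separately:

    import time
    from concurrent.futures import ThreadPoolExecutor

    import requests

    URL = "http://localhost:8080/api"  # placeholder endpoint, not the real SUT

    def timed_request(i):
        # Client-side response time: queue wait + execution + network overhead.
        start = time.monotonic()
        requests.get(URL, timeout=30)
        elapsed = time.monotonic() - start
        print(f"request {i}: response time {elapsed:.2f}s")
        return elapsed

    # Spike: fire all 1000 requests at (nearly) the same instant.
    with ThreadPoolExecutor(max_workers=1000) as pool:
        results = list(pool.map(timed_request, range(1000)))

    print(f"max response time: {max(results):.2f}s")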

So, what I'm confused about is whether a spike test is actually a performance test. To me it seems more like a concurrency test, in which we're trying to see whether the system fails at some rate, and logging response times here seems to me to have no more value than logging them for a SUT under continuous, expected load. These are just my ideas and I'd like more views on this point.

So, if you logged the response times for a SUT during a spike test and during a stress test, how would you interpret those two different sets of data? And which test (or test data) would be better suited to determine and benchmark the response times of a SUT?

juztcode
  • "And which test/test data would be better suitable to speculate and determine and benchmark the response times for a SUT?" The one which most closely matches your expected production workload. We can't tell you that. – Philip Kendall Feb 10 '23 at 08:18
  • @PhilipKendall, I understand it may depend on how we frame the testing, as in some of the scenarios Dmitri mentions below, but I wanted to know in the most generic and usual sense: as close as possible to how the performance of a SUT in itself could be defined, not in a relative manner but as an isolated methodology – juztcode Feb 10 '23 at 10:52

1 Answer


These two are absolutely different test types which serve different purposes.

The main performance testing types I would look at:

  1. If you have some form of NFR or SLA and need a formal "signoff", go for Load Testing, where the system is put under the anticipated load and the actual response time and throughput are compared with the figures from the requirements
  2. Stress Testing is about identifying the bottleneck: you start with 1 user and gradually increase the load until response times start exceeding acceptable levels or errors start occurring, whichever comes first. On an ideal system the number of requests per second (throughput) should increase in line with the increased load, but at some point you will see that the load keeps increasing while the throughput stays flat or even drops due to increased response time. At that point you know how many users/requests per second your application can serve without performance degradation (this is known as the saturation point; see the sketch after this list)
  3. Spike Testing is a rather exotic scenario which checks how the system under test handles sudden spikes and whether it gets back to normal once the spike ends. It is not very representative; it is just another layer of protection, just in case. Similar is Soak Testing, which is pretty much the same as Load Testing but over a prolonged period of time; it allows you to detect issues such as memory leaks
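
To make the stress-testing ramp from point 2 concrete, here is a minimal Python sketch. It is illustrative only: the endpoint URL, the ramp steps, and the acceptable p95 threshold are assumptions you would replace with your own.

    import time
    from concurrent.futures import ThreadPoolExecutor

    import requests

    URL = "http://localhost:8080/api"  # assumed endpoint, replace with yours
    ACCEPTABLE_P95 = 1.0               # assumed SLA threshold, in seconds

    def timed_request(_):
        start = time.monotonic()
        try:
            requests.get(URL, timeout=30)
        except requests.RequestException:
            return None  # treat any failure as an error
        return time.monotonic() - start

    # Ramp the load up step by step until response time degrades or errors appear.
    for users in (1, 10, 50, 100, 250, 500, 1000):
        with ThreadPoolExecutor(max_workers=users) as pool:
            results = list(pool.map(timed_request, range(users)))
        errors = results.count(None)
        times = sorted(t for t in results if t is not None)
        p95 = times[int(len(times) * 0.95) - 1] if times else float("inf")
        print(f"{users:>4} users -> p95 {p95:.2f}s, {errors} errors")
        if errors or p95 > ACCEPTABLE_P95:
            print(f"saturation point reached somewhere around {users} users")
            break

The step sizes and the p95 metric are arbitrary choices here; the point is that throughput should scale with load up to the saturation point, after which response time climbs instead.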

More information: Performance Testing vs. Load Testing vs. Stress Testing

Looking at your numbers, given that

  • with 1 user response time is around 100 ms
  • with 500 users response time is around 7 s

the bottleneck must be somewhere in between. Take another look at your test results and mention how many users were online when the response time started exceeding 100 ms.
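
As a rough back-of-envelope check (assuming a simple FIFO queue and a fixed worker pool, which your numbers don't confirm): response time ≈ queue wait + execution time, so if the 1000th request comes back after about 11 s while the average execution time is roughly 0.4 s, the server is effectively working on about 1000 × 0.4 / 11 ≈ 36 requests at a time. That effective concurrency (thread pool size, connection pool, worker count) would be my first candidate for the bottleneck.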

Dmitri T