When only load testing will help

Short description

Load testing is an important part of IT development: it ensures that online services can handle the expected flow of users and transactions. It involves simulating a high volume of transactions or requests to see how the system performs under pressure. Load testing specialists use a dedicated test bench and access to the infrastructure to analyze the system, write scripts, and configure monitoring. Performance defects may be found along the way and need to be fixed. Load testing is a continuous process for heavily loaded systems: it confirms that services can handle the expected load and helps prevent functional and performance defects from reaching users.

In the world of IT, change is constant, and it increasingly puts the security and reliability of solutions at the foundation.

Such changes lead, in particular, to services that must work with a large number of users: online banking, marketplaces, ticket purchase and reservation services, contactless payment through POS terminals (remember the last time you paid in cash at the register?), paying for the metro with Face ID, requesting and receiving information at State Services, and so on.

Building and developing such services well is difficult. Beyond the obvious concerns about convenience, security, and the protection of personal data, there are less obvious risks associated with growth in the number of users. This is where load testing comes in. We will talk about it below and look at specific real-life examples.


Let’s analyze a specific case: an excellent customer-oriented service

Problem statement:

The online service should work 24/7.

The question immediately arises: will it really be able to work in this mode? What if there are 2-3 times more users? What if the amount of information transmitted increases 5-10 times?

Your expectations of the service are as follows: a person comes to the service through a browser, a mobile client, or a program belonging to one of your API partners, and must receive the requested information quickly enough (an SLA, or service level agreement, is set for response time) and without a large number of server and application errors (an SLA is set for the acceptable percentage of errors).

Within modern functional testing, which consists of unit testing, functional testing (FT), integration testing (IT), and system testing (ST), carried out in-house or by contracted specialists (outstaff), you will make sure that the code corresponds to the terms of reference and has the required functionality. The “non-functional” performance requirement, however, will remain unverified.

Solution:

This gap is covered by load testing.

Load testing (abbreviated NT below) specialists will need a dedicated test bench and access to the infrastructure to analyze the system and prepare for the NT.

Approach:
The NT team joins the project and gets to work:

  1. Clarifies the system architecture.
  2. Collects statistics from operations and creates a load test profile.
  3. Discusses the load testing methodology with you.
  4. Prepares data pools.
  5. Writes scripts.
  6. Configures monitoring.
  7. Conducts tests and registers defects.
  8. Retests after the defects are fixed.
  9. Prepares a report.
  10. … ??
  11. (Your) PROFIT.

As a result of this work, you get an answer to the question: will this system, on this specific hardware, withstand the flow of requests from N users performing a given number of specified operations?
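The scripting and test-run steps can be pictured with a minimal sketch. It assumes a hypothetical `fake_service_call` stub in place of a real request, and the names `run_virtual_user` and `run_load` are illustrative only, not taken from any load testing tool. Real projects normally rely on a dedicated tool; the sketch only shows the idea of N virtual users performing a set of operations while response times are recorded.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_service_call(operation: str) -> int:
    """Hypothetical stand-in for a real request; returns an HTTP-like status."""
    time.sleep(0.001)  # pretend the service needs about 1 ms to answer
    return 200


def run_virtual_user(operations: list[str]) -> list[tuple[str, float, int]]:
    """Run one user's scenario, recording (operation, seconds, status) per call."""
    samples = []
    for operation in operations:
        started = time.perf_counter()
        status = fake_service_call(operation)
        samples.append((operation, time.perf_counter() - started, status))
    return samples


def run_load(users: int, scenario: list[str]) -> list[tuple[str, float, int]]:
    """Run `users` virtual users concurrently and collect all their samples."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = pool.map(run_virtual_user, [scenario] * users)
        return [sample for samples in per_user for sample in samples]


# Five virtual users, each performing two operations.
samples = run_load(users=5, scenario=["view_schedule", "check_availability"])
```

The collected samples are the raw material for the report: response times and statuses per operation at a known number of concurrent users.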

“Under the hood”:
At the profile preparation stage, the team agrees on the set of operations that will be performed in the system to emulate the load.

For example, for an “online sale of concert tickets” service, it could be:

  1. 30,000 operations per hour – view the schedule.
  2. 20,000 operations per hour – view ticket availability.
  3. 10,000 operations per hour – start a purchase, then close the browser without paying.
  4. 2,000 operations per hour – complete a purchase and pay for it.
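Written down as data, the example profile is easy to convert into the per-second rates and the operation mix that a load script needs. The sketch below is only an illustration: the operation names are made-up identifiers for the four list items, and the numbers are the ones from the example.

```python
# Operations per hour, taken from the example profile above.
PROFILE_PER_HOUR = {
    "view_schedule": 30_000,
    "check_availability": 20_000,
    "abandon_purchase": 10_000,
    "complete_purchase": 2_000,
}


def per_second(profile: dict[str, int]) -> dict[str, float]:
    """Convert hourly targets into operations per second."""
    return {name: count / 3600 for name, count in profile.items()}


def mix(profile: dict[str, int]) -> dict[str, float]:
    """Share of each operation in the total flow, as a fraction of 1."""
    total = sum(profile.values())
    return {name: count / total for name, count in profile.items()}


total_per_hour = sum(PROFILE_PER_HOUR.values())
rates = per_second(PROFILE_PER_HOUR)
shares = mix(PROFILE_PER_HOUR)
```

For this profile the total is 62,000 operations per hour, a little over 17 per second, of which paid purchases are only a small fraction.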

Where will the NT team get these numbers?

Two ways:

  1. Studying statistics (including historical data, taking into account seasonality and the associated peaks, or the lack of them).
  2. The business knows or forecasts the number of these operations.
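The first way can be sketched as follows: take the busiest hour seen in the statistics and add a safety margin. This is one possible rule of thumb, not a prescribed method; the 20% margin and the hourly counts are made-up values for illustration.

```python
import math


def peak_with_margin(hourly_counts: list[int], margin_percent: int = 20) -> int:
    """Take the busiest observed hour and add a safety margin on top."""
    peak = max(hourly_counts)
    return math.ceil(peak * (100 + margin_percent) / 100)


# Made-up hourly counts of completed purchases over one day.
observed = [120, 90, 300, 1_450, 1_700, 800]
target = peak_with_margin(observed)
```

With seasonal data, the same function would be fed the counts from the peak season rather than from an average day.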

Before you get a positive answer to the question of whether the system holds the load, the NT team may find performance defects that will need to be fixed.

Load testing is performed on a system that already fully satisfies you functionally (“you already have everything working”), so the schedule must leave enough time for all of the testing.

What can go wrong? Why go through all this expensive pain?

A short list of possible troubles:

  • In systems with a database, locks may occur under load, or the response time may increase because of suboptimal queries, a poorly designed data storage structure, etc.
  • Web application servers such as WebSphere may have suboptimal settings.
  • Kubernetes microservices can run out of resources.
  • The program can use up all the RAM and crash with a nasty OutOfMemoryError.

What the user sees:

  • Denial of service (“the DB went down, threads on the WebSphere server ran out, there were numerous pod restarts in Kubernetes”).
  • The response from the service arrives in 10 seconds instead of 1 (in dry language, “exceeding the SLA for response time” sounds harmless).
  • The server returns 5xx errors more often than acceptable, or the application produces many errors during operation (“exceeding the SLA for the percentage of errors”).
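Both SLA violations can be checked mechanically from test results. The sketch below assumes hypothetical thresholds (a 95th percentile response time of 1 second and at most 1% server errors) and made-up samples; real thresholds come from the actual agreement.

```python
import math

# Hypothetical SLA thresholds; real values come from the agreement.
SLA_P95_SECONDS = 1.0
SLA_MAX_ERROR_RATE = 0.01


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_sla(samples: list[tuple[float, int]]) -> dict[str, object]:
    """samples are (response_seconds, http_status) pairs from a test run."""
    times = [seconds for seconds, _ in samples]
    errors = sum(1 for _, status in samples if status >= 500)
    p95 = percentile(times, 95)
    error_rate = errors / len(samples)
    return {
        "p95_seconds": p95,
        "error_rate": error_rate,
        "passed": p95 <= SLA_P95_SECONDS and error_rate <= SLA_MAX_ERROR_RATE,
    }


# 100 made-up samples: most are fast, a few take 10 s, a few fail with 503.
samples = [(0.3, 200)] * 90 + [(10.0, 200)] * 7 + [(0.5, 503)] * 3
report = check_sla(samples)
```

In this made-up run both limits are exceeded, even though 90% of the requests were answered quickly.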

Moreover, up to a certain load level the system lives and works happily without such defects. The problems begin at, say, 300% of the profile, and we know that a few times a year (say, 3) you have very cool concerts, and sales volume increases 5 times (to 500% of the average).

A completed NT lets you scale your service in advance, and instead of an ugly 5xx with an even less attractive Java stack trace on the home page of your website, the company keeps working and earning money.
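The arithmetic behind “scale in advance” can be sketched with the numbers from the example. The sketch makes a strong simplifying assumption that capacity grows linearly with added resources, which real systems rarely do; a repeat test on the scaled system is what actually confirms the result.

```python
def required_scale_factor(breaking_point_pct: int, expected_peak_pct: int) -> float:
    """How many times more capacity is needed, ASSUMING linear scaling."""
    if expected_peak_pct <= breaking_point_pct:
        return 1.0  # the current setup already covers the expected peak
    return expected_peak_pct / breaking_point_pct


# Problems start at 300% of the profile; the expected peak is 500%.
factor = required_scale_factor(breaking_point_pct=300, expected_peak_pct=500)
```

Under that assumption the service needs roughly 1.7 times its current capacity before the big concerts go on sale.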

The payoff is easy to calculate:

Companies for which downtime is unacceptable (it is very expensive) and which cannot afford serious damage to their image make NT a permanent part of the testing process.

If the business is successful in the market, an outage in the “hot” season effectively means “leaving the game”.

What’s next?

Our hypothetical ticketing site is running successfully in production. If the business develops, the following processes continue:

a) marketing and the business invent new campaigns to attract customers;
b) marketing and the business want to expand the functionality (add a few more buttons, simplify the transition to payment, add a few more steps to the transition to payment, branch the purchase process, etc.);
c) new requirements from state bodies appear (a verification must be implemented, i.e. a request to an external system is added);
d) functional defects are reported by users.

As a result of a), b), c) and d), the developers rework the functionality, and the new version passes all stages of testing, including system testing.
Of course, it is also necessary to make sure that the new version “holds” the current load (or will withstand the planned increase in load).

In other words, NT, like other types of testing, is a continuous process that heavily loaded systems require.

Cases from real life

Task A

The solution regularly passes NT and runs successfully in production, but then a well-reasoned decision is made to migrate to another platform: more modern hardware with a newer version of the OS, which will be supported for another 10 years, unlike the current one.

The business is arranged in such a way that it is impossible to stop operations.

Solution:
We agree on a step-by-step migration scenario.
Using load testing tools, we apply the calculated load to the system.
We observe the system’s behavior on the test bench during the migration.

Result:
Following the instructions developed during NT, the system successfully migrated from OS to OS and from hardware to hardware, while users continued to use it without interruption.

During operation, functional defects are reported from production, and performance defects may appear as well.

Based on such incidents, we update the load testing methodology if necessary.

Task B

An external system implemented new requirements, was updated, and began returning responses of 1 MB instead of 10 KB. As a result, our system now answers clients in 10 seconds instead of 1.

Solution:
We increase the size of the responses on the test bench to reproduce the “new behavior”, run a full load test, find the load level at which the system still meets the response time requirements, add resources, and check that the more powerful system withstands the required load.
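Finding “the load level at which the system meets the requirements” usually means running the test in steps and looking for the last step that still fits the SLA. A minimal sketch, with made-up step results and a hypothetical 1-second threshold:

```python
# Made-up stepwise results: load level (% of profile) -> p95 response time, s.
STEP_RESULTS = {50: 0.4, 100: 0.7, 150: 0.9, 200: 2.5, 250: 9.8}
SLA_P95_SECONDS = 1.0  # hypothetical threshold


def max_sustainable_level(results: dict[int, float], sla_seconds: float) -> int:
    """Highest load level reached before the first step that breaks the SLA."""
    sustained = 0
    for level in sorted(results):
        if results[level] > sla_seconds:
            break
        sustained = level
    return sustained


level = max_sustainable_level(STEP_RESULTS, SLA_P95_SECONDS)
```

After resources are added, the same stepwise run is repeated to confirm that the sustainable level has moved up to where it needs to be.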

Result:
The system works in accordance with the set requirements in the new conditions.

Task C

The marketing department developed a new promotion: “Buy three vacuum cleaners for the price of four and get one free.” Who would refuse? Expensive advertising runs through every possible channel and reaches the entire target audience. People flock to the site to shop.

That’s fine, but how do we know the system will handle the load and not crash?

After all, if it crashes, the advertising budget will have to be written off as a loss, the company can forget about profit, and its image will be spoiled once and for all: the same channels that ran the ads will be happy (and this time for free!) to report that the site went down.

Solution:
We get information from marketing about the expected increase in load for specific operations and conduct a stress test.

Result:
Your expectations are met.

“You couldn’t make it up” (real-life cases)

Task D

Situation:

An employee is tasked with a boring operation: going through a long case of 5-6 screens several thousand times, with no errors in manual data entry.

Gloomily assessing the amount of work (a couple of days, morning to evening), the smart employee remembers seeing “somewhat similar functionality” in another module of the system: do it there, and all this work can be finished in half an hour, breaks included. Resourceful? That is putting it mildly! Imagination paints an instant promotion from operator to programmer, or even solution architect. However, something went wrong. Who has already guessed what happened? Here is what happened: when the operation is performed not from the module prescribed by the instructions but from a completely different one, a suboptimal query is sent to the database, one the design never planned for (see the terms of reference), and it leads to locks. The DBAs tracked the problem down in production and killed this query on the DB. The system returned to normal operation after a few hours of downtime.

Solution:
Operators must know how to work with the system.

Task E

Situation:

Without verification by NT, an archiving procedure is rolled out on the database. Functionally, it works. Why it was not checked under load, given that a load test bench had already been assembled, remained a mystery to us. In production, a lock occurred on the DB, and the system was down for, as I recall, a day.

Solution:
Since the company had a load testing team, the issue was resolved as follows: they reproduced the defect on the NT test bench and developed recommendations to prevent it from happening again. This is also part of load testing work. And it is better to check such things on the NT test bench than to stress production support (colleagues, we know you usually have enough to worry about).

Instead of a conclusion

Development and testing (all stages of functional testing and NT) result in a product with known characteristics and instructions for use.

Remove any component of this complex, and the “organoleptic” properties of the product will change unpredictably.

Load testing is essentially load simulation. For the model to correspond to reality as closely as possible, you need:

  1. The same database as in production (only with the personal data in it depersonalized, either by yourself or by specialists you hire).
  2. The same machine resources as in production.

Whether NT is possible, and (spoiler) how to do it, when points 1 and 2 above are not met is a topic for a separate article. Stay tuned.
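Depersonalizing a copy of the database can be pictured with a small sketch. It assumes hypothetical column names (`name`, `email`, `phone`) and replaces their values with a salted hash, so the data volume and the links between rows stay the same while the real values disappear. It is an illustration of the idea only, not a complete anonymization procedure.

```python
import hashlib

# Hypothetical column names; a real schema will differ.
PERSONAL_FIELDS = {"name", "email", "phone"}


def depersonalize(row: dict[str, str], salt: str) -> dict[str, str]:
    """Replace personal fields with a salted hash; keep everything else as is."""
    masked = {}
    for field, value in row.items():
        if field in PERSONAL_FIELDS:
            digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
            masked[field] = digest[:12]  # short token instead of the real value
        else:
            masked[field] = value
    return masked


# Made-up row for illustration.
row = {"name": "Ivan Petrov", "email": "ivan@example.com", "tickets": "2"}
masked_row = depersonalize(row, salt="test-bench")
```

Because the same input always produces the same token, records that referred to one customer before masking still refer to one customer afterwards.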
