The world of online business is replete with stories of websites that managed to get a ton of traffic and then couldn’t handle the load.
Some sites recover on their own, but by and large a site that goes down requires manual intervention to get it back into service. This can cost a company a lot of business, as site failures often happen without warning and at times of the day and week when qualified technicians aren’t available to restore service.
The solution to this problem is to load test your site. This process is really no different from any other kind of software testing. You simply need to operate the site under circumstances designed to simulate real-world conditions as closely as possible.
Here are some tips to make sure your testing process is as effective as it can be.
1. Simulate At Least One Failure
The rule in any engineering project is that failure should happen as early and as loudly as possible. Silent failure that happens someday is the bane of any project that depends on precision, technology or intricate engineering.
The information you gather from a failure will expose weaknesses in your system. Once you know where the system is prone to failure, you can make repairs intelligently and guard against future issues.
Failure simulation can provide many different kinds of metrics. Each will give you an idea of the limits of your hardware and configuration. The more that can be tested, the more you can harden your system.
To harden a system and run the appropriate number of tests, it’s good to use a tool such as LoadView that can simulate a large number of users.
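To make the idea of pushing a system until it fails concrete, here is a minimal sketch in Python of a ramp-to-failure test. It is an illustration only, not a replacement for a real load testing tool: `fake_site` is a hypothetical stand-in for a function that sends one request to your site, and its capacity of 50 users is an invented number.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented capacity for the stand-in site; on a real site this is what the test discovers.
ASSUMED_CAPACITY = 50

def fake_site(concurrent_users: int) -> bool:
    """Hypothetical stand-in for one request; succeeds only while load is within capacity."""
    return concurrent_users <= ASSUMED_CAPACITY

def run_step(request_fn, users: int) -> float:
    """Fire `users` simultaneous requests and return the fraction that failed."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(lambda _: request_fn(users), range(users)))
    return results.count(False) / users

def ramp_until_failure(request_fn, start=10, step=10, max_users=200, max_error_rate=0.01):
    """Raise the load step by step; return the first user count whose error rate is too high."""
    for users in range(start, max_users + 1, step):
        error_rate = run_step(request_fn, users)
        print(f"{users} users: {error_rate:.0%} errors")
        if error_rate > max_error_rate:
            return users
    return None  # no failure found within max_users

breaking_point = ramp_until_failure(fake_site)
```

In a real test you would swap `fake_site` for a function that makes an actual request and record the breaking point along with response times at each step.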
2. Focus On Your Failover
A failover system is one that is designed to take over when a main system crosses one or more performance thresholds. Failover servers can be aligned in series, so if site A fails, site B can pick up the new traffic. If site B fails, site C comes online and so forth.
In the above example, if the number of requests to a particular domain or software instance reaches a certain proportion of its limit, all new requests can be routed to site B. It is vitally important that you make certain all of these thresholds and their routing setups work. Further, you need to make sure they work even if the site and all its failovers are under maximum load.
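The threshold routing described above can be sketched in a few lines of Python. This is a simplified model rather than how any particular load balancer works: the `Site` class, the limit of 10 requests per site and the 80 percent threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    limit: int        # assumed maximum number of concurrent requests
    active: int = 0   # requests currently being served

    def over_threshold(self, proportion: float) -> bool:
        """True once the site has reached the given proportion of its limit."""
        return self.active >= self.limit * proportion

def route(sites, threshold=0.8):
    """Send a new request to the first site in the chain that is below its threshold."""
    for site in sites:
        if not site.over_threshold(threshold):
            site.active += 1
            return site
    return None  # every site in the chain is saturated

# Site A fails over to B, and B fails over to C.
chain = [Site("A", limit=10), Site("B", limit=10), Site("C", limit=10)]
served = [route(chain) for _ in range(30)]
names = [site.name if site else "rejected" for site in served]
```

Running 30 requests through this chain fills A, then B, then C up to their thresholds and rejects the rest. That last case is the one to test hardest: what your setup does when the site and all of its failovers are at maximum load.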
3. Test In Production Only
If you are developing web servers inside your own network, it can be very easy to assume everything is working fine. On a local network, latency will be nearly non-existent and no outside routers will touch your packets. Things can look like they are running smoothly when, in reality, the introduction of real-world conditions can immediately cause problems.
The quality of your web servers matters too. For example, if you’re using low-quality business hosting to host your site, you may find that your servers have a high degree of latency, according to Web Hosting Buddy.
A common problem that arises when a test platform is moved “outdoors,” so to speak, is that all the failure prevention software triggers at the same time because it interprets the network slowdown as a high load event. Suddenly the entire network is competing for attention and causing the kind of outage those systems were supposed to prevent.
Setting up a test environment in-house is fine, but in order to get the right data under the right conditions, you have to put your aircraft in the air. If it crashes, at least you’ll be aware of the problem well ahead of time.
4. Start On The Metal
Like any software project, a load testing process has to be operational at every integration phase. This means only the basic software can be running during your first test. Once you have guaranteed your base system is working, then you can add in plug-ins, add-ons, filters and so forth. Each one should be isolated and tested on its own.
This matters because integrated software has to degrade gracefully. If one of those plug-ins fails, it can’t bring the entire system down with it. The base system has to be solid so it can continue running even if its auxiliary features are down.
Naturally, your failover mechanisms should trigger on any kind of software outage. The easiest way to set this up is to create a polling mechanism so the base system can regularly ping the auxiliary systems to make sure they are up.
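A polling mechanism like this can be very small. The Python sketch below shows one polling pass; the two check functions and their names are hypothetical placeholders for whatever calls would actually ping your plug-ins, and the second one is written to fail on purpose.

```python
from typing import Callable, Dict, List

def poll_once(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every health check once; a check that raises an error counts as down."""
    status = {}
    for name, check in checks.items():
        try:
            status[name] = bool(check())
        except Exception:
            # A crashing or timed-out auxiliary system must not take the poller down with it.
            status[name] = False
    return status

def failing(status: Dict[str, bool]) -> List[str]:
    """Return the names of the auxiliary systems that are down."""
    return sorted(name for name, up in status.items() if not up)

# Hypothetical checks standing in for real pings to auxiliary systems.
def search_plugin_ok() -> bool:
    return True

def image_filter_ok() -> bool:
    raise TimeoutError("no response")

checks = {"search_plugin": search_plugin_ok, "image_filter": image_filter_ok}
down = failing(poll_once(checks))
```

In practice you would call `poll_once` on a timer and hand anything in `down` to your failover logic. Note that the poller catches errors from each check, so one broken plug-in cannot stop the others from being polled.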
Without exception the most important part of any testing regimen is to document your results. This information not only gives you a vital record of what you’ve accomplished, but it also serves as a starting point for any investigation into why the system may have failed or triggered a failover. Think of your documentation as a map. Without it, you’ll have no idea where to look if an inexplicable problem arises.
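One lightweight way to keep that record is to write a line of structured data for every test run. The sketch below assumes a simple format of one JSON object per line; the field names are invented for this example, so adapt them to whatever your team actually tracks.

```python
import io
import json
from datetime import datetime, timezone

def record_result(log, test_name, users, error_rate, failover_triggered, notes=""):
    """Append one test run to the log as a single line of JSON."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "test": test_name,
        "users": users,
        "error_rate": error_rate,
        "failover_triggered": failover_triggered,
        "notes": notes,
    }
    log.write(json.dumps(entry) + "\n")
    return entry

# An in-memory log keeps the example self-contained; a real log would be a file.
log = io.StringIO()
record_result(log, "base system only", 100, 0.0, False)
record_result(log, "base system plus search plug-in", 100, 0.12, True, "site B took over")

runs = [json.loads(line) for line in log.getvalue().splitlines()]
```

Because every run has the same fields, it is easy to go back later and find the first test where a failover triggered or the error rate climbed.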
Truth be told, software development is a lot like fiction writing. Testing and editing often take longer than writing the work in the first place.
Untested software leading to failures and loss of business, however, is too big a risk to take with your hard-won customers and clients.