All You Need To Know About Testing in Production

Gurpreet Kaur

Technology

Published on 12 Feb, 2021

15 min read

How is greatness achieved? You can’t just become a legend overnight; a lot of effort goes in before that point. There are times of disappointment, times of regret, times when quitting seems like a good option, and certainly a lot of times when you will fail. Carrying on despite these odds is when greatness comes knocking on your door.

And who doesn’t want to be great?

I most definitely do. In my quest for being great, successful and brilliant, I have realised one thing, and that is trial and error can take you a long way. Like I said above, there is a lot of effort that goes into being great and that effort equates to trial and error. You may not be great at the first attempt or even the hundredth.

So, how do you ensure that you are?

If I were to talk about my field, which is web development, where there are hundreds of thousands of great and legendary apps and sites out there, I’d say trial and error is the secret. It means persevering until you succeed, and testing is one fine way to do that.

And that is our topic for discussion today, achieving greatness in the web world through testing. Since testing is a mammoth topic to cover in just one blog, I’ll be focusing on testing in production and how it has basically revolutionised the way we build apps.

What is Testing in Production?

Building apps is by no means easy and it certainly should not be rushed. You can rush it, but then it’d only be a fool’s errand. Testing has become a major part of the process; you can’t build an app or a site without thoroughly testing its efficiency. That wouldn’t be fair to the audience. You want to build their trust and loyalty, and testing the application to make it ideal for your users is the only way to go.

From development, staging and pre-production, testing has now transcended the building process and reached the users and their first interaction with the app. And that is what testing in production actually is.

It is a software development practice in which new code changes are exposed to live user traffic rather than only to a staging environment. Of course, you are not going to send an application that has not been fully tested to your entire audience; that would be foolish. When you test in production, you send the trial version to only a fraction of the live user traffic to be sure of its viability.

Now, you might be thinking, why take the risk? Why roll out an app you are not entirely sure about? The answer is simple. You can never be sure of your project until it finally goes on the floor. Testing in production is like a mock test before the actual one; it prepares you to counter all the bugs that you may have missed in the development and staging tests.

Yes, there is a chance that your users may experience software that is buggy and not up to standard. However, isn’t it better that a small proportion of your user base experiences that instead of all of them? I’m sure it is, and that is why testing in production (TIP) is important. Tech giants like Google, Netflix and Amazon have been known to release new features to a few of their users, measure the impact and then go on with the final release. If TIP works for them, I think it should work for us too.

However, it is not always rainbows and butterflies with ‘testing in production.’ For one, it is not a substitute for pre-production tests; rather, it complements them. Secondly, it isn’t easy or fast; you can’t just snap your fingers and be done with it. It requires a great deal of automation, and you ought to know the ABCs of the best practices and of designing a system from scratch.

Testing in Production has proven to be extremely essential, but only when it is done the right way.

Why should you Test in Production?


Now that you know what Testing in Production is, the next logical question is why you should take it up. The answer is logical enough too: when you test during the production phase, you reap a lot more from the process than you would from any pre-production test.

Let us find out how.

Provides enhanced accuracy in performance evaluation

The paramount benefit of testing in production is the accuracy it provides in its evaluations of performance. You could have perfect software on paper and even in staging; however, real life could have a different take on it. When you decide to test in production, you will not only be able to identify the flaws of your software, but also get to evaluate its performance in real time, making your evaluations far more accurate.

Provides enhanced feedback on beta program

There are always going to be beta programs before the actual one comes along. Implementing TIP at the beta stage gives you a clearer picture of the software’s development with its instant feedback as to how the users are interacting with the program, ultimately leading to faster development.

Provides enhanced effectiveness in evaluations of user experiences

Before building an application, it is important to do proper research, and that research depends on your users. So, if the entire development process is user-centric, users are pivotal for the program. Testing in Production allows you to capitalise on your user base and learn from it. Releasing software to a section of your audience will allow you to get objective information about their experience, which you can then work to enhance as you move forward. There isn’t any other test that would help you get this crucial piece of information.

Provides enhanced problem detection abilities

You could run as many tests as you want, still you would not be able to identify and rectify all the problems with your software. When you run an app on the production floor, you actually get to know its real-time impact on the users. Any problems and issues faced there would not have been identified beforehand. So, at the end of it all, TIP provides your developers the superpower of foreseeing the future and being a step ahead of the nefarious bugs that accompany development every step of the way.

Provides enhanced abilities to deal with unpredictabilities

Can you know what difficulties you might face while climbing Everest until you actually do it? That journey is always going to be unpredictable, and you can only learn to endure it by actually doing it. The same is true for testing in production. There are going to be many unpredictable scenarios when software is released, and those can only be identified once it actually reaches the live audience. With TIP, you get to monitor real-time performance and deal with the issues there and then.

Provides an enhanced potential to deploy developments

Earlier when an app was built and released, the work seemed to be done. However, that just marks the beginning of the developer’s tasks. It doesn’t take months to release a new feature on the previous project. Deployments of developments are easy, fast and pretty efficient when you test in production. This is because detecting problems and bugs in the deployments is easy with TIP. So, changing consumer demands should not deter you in the least.

These benefits also effectively answer the question of why testing in production is important.

Can Testing in Production pose any challenges?

With this many benefits, you are bound to think that TIP is the go-to answer to all your development problems, and it very well could be. However, TIP can also be somewhat challenging to implement. A number of problems and drawbacks accompany it, and continuing without letting you know about them would simply be unfair.

So, let's get some insights on how challenging testing in production can be.

The risk of failure

The foremost challenge in TIP is the obvious one: failing your audience. Analysing this risk and working around it can become problematic for many sites and software products. This is because the software you deploy to your audience is bound to have problems; it isn’t complete yet. Those problems can disappoint your users, and if your software handles financial transactions, you have a much bigger risk to overcome.

The risk of security

There is also the risk to the security of data. Because testing in production means a real audience and real user data, you’ll be way past the dummy-data stage. So, if your software fails to comply with data protection guidelines in any way, even if it is due to a bug, you will have to face some serious consequences. If your software has to abide by the HIPAA guidelines and there is a data breach, the costs you may have to pay would be colossal. All of you must remember the Facebook debacle.

You can limit the data access points to avoid a breach, but it could still happen because your software would be in testing.

The risk of the setup being amateur

For TIP to work properly, you need an elaborate setup. Your deployment has to be mature enough to be tested with a live audience, and you must be competent enough to run automated tests; a manual TIP would never work.

Ask yourself,

Do you have the capability of dealing with the consequences of bad data manually?
Can you counter the side effects of the release by using feature flags or changing the functionality entirely?

So, if you aren’t sure that your setup is ready, chances of something going terribly wrong are pretty high and you would certainly not want that.

How can you Test in Production?


Knowing the good and the bad of a concept is bound to give you a semblance of expertise on it. Since we have done that with Testing in Production, now comes the part of actually doing the grunt work, and that is its implementation: how do you test in production?

Well, there are a lot of ways of doing it and I’ll tell you about all of them.

Feature Flags

You might have noticed that TIP is often associated with feature flags. That is because the former cannot work properly without the latter. Let me tell you why. Feature flags, or toggles, whatever you want to call them, are basically conditions in your code that control whether any given feature should be activated or not. The flags read their state from a configuration that you create. Doing so gives you the power to turn any feature on or off, during TIP or otherwise, without having to write or deploy any new code.
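To make the idea concrete, here is a minimal sketch in Python of what a flag check could look like. The flag names, the `FLAG_CONFIG` structure and the `is_enabled` helper are all made up for illustration; a real setup would more likely read the configuration from a file or a flag management service.

```python
import json

# Hypothetical flag configuration. In practice this would live outside the
# code (a file or a flag service) so it can change without a redeploy.
FLAG_CONFIG = json.loads("""
{
  "order_tracking": {"enabled": true},
  "new_checkout": {"enabled": false}
}
""")


def is_enabled(flag_name: str, config: dict = FLAG_CONFIG) -> bool:
    """Return True only if the flag exists and is switched on."""
    return bool(config.get(flag_name, {}).get("enabled", False))


def render_order_page() -> str:
    # The new code path is already deployed, but it is only reached
    # when the flag is on.
    if is_enabled("order_tracking"):
        return "order page with tracking"
    return "order page"
```

Unknown flags default to off here, which is a deliberate choice: a typo in a flag name should hide a feature rather than expose it.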

A/B testing

The A/B here refers to two versions of the software being deployed simultaneously. This is done so that you learn the preference of your target audience; whichever version they lean towards is the one you go ahead with.

Let’s take the example of a shopping site. Imagine they added a new feature to track your orders, but there were two ways to implement it. In such a scenario, A/B testing becomes a saviour.

A/B testing also does not have to be done with two new versions. You can simply test the newer version against the older one and then gauge the inclination of your audience in real time. A/B testing can only be performed with real users, and its crucial feedback has always benefitted stakeholders and developers.
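One common way to split users between the two versions is to hash the user ID, so that the same user always lands on the same version. The sketch below assumes string user IDs and an experiment name of my own choosing; `assign_variant` is a hypothetical helper, not part of any particular A/B testing tool.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to one variant of an experiment."""
    # Hashing the experiment name together with the user ID keeps the
    # assignment stable per user but independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def count_assignments(user_ids, experiment: str) -> dict:
    """Tally how many users ended up in each variant."""
    counts: dict = {}
    for user_id in user_ids:
        variant = assign_variant(user_id, experiment)
        counts[variant] = counts.get(variant, 0) + 1
    return counts
```

Because the assignment depends only on the IDs, no lookup table is needed and a returning user never flips between the two versions mid-experiment.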

Canary testing

Canary testing is somewhat similar to A/B testing in that it allows you to release a new version of your software to a small proportion of your user base. If it works the way you wished and turns out to be a hit, you can go on and release it to your entire user base. A canary release gives you the confidence you need before the full-scale deployment of a site or app.
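A canary release can be sketched as a percentage gate. The example below is only an illustration under my own assumptions: users are identified by a string ID, and the `bucket` and `in_canary` helpers are hypothetical names.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % 100


def in_canary(user_id: str, percent: int) -> bool:
    """True if the user falls inside the first `percent` buckets."""
    return bucket(user_id) < percent


def pick_version(user_id: str, percent: int) -> str:
    # Canary users get the new version; everyone else stays on the old one.
    return "new" if in_canary(user_id, percent) else "stable"
```

Since each user keeps the same bucket, raising the percentage only adds users to the canary group; nobody who already has the new version is moved back.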

Controlled flight test

This one is a little different from the two previously mentioned tests. When you perform a controlled flight test, you are not going to be testing for system updates, rather it is the user reactions that you would be looking for. The purpose here is to test whether the users respond to the changes the way it was expected. For the same, you would select a group of real users and closely monitor and assess their behaviour. If your expectations and the test results align, you would know you are on the right track.

Automated rollback

Automated rollback means exactly what it says: rolling the site back to its original state automatically. This is done only when a flaw is detected during the release monitoring phase. This strategy has the potential to hide mishaps from the users effectively. However, it can also lead to data loss if you don’t implement it properly.
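As a rough sketch, the rollback decision can be reduced to a threshold check on the error rate of the new release. The 2% limit, the minimum of 100 requests and all the names below are assumptions chosen for the example, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    requests: int  # requests served by the new version so far
    errors: int    # how many of those requests failed


def should_roll_back(stats: ReleaseStats, max_error_rate: float = 0.02,
                     min_requests: int = 100) -> bool:
    """Roll back once enough traffic has been seen and errors exceed the limit."""
    if stats.requests < min_requests:
        return False  # too little data to judge the release
    return stats.errors / stats.requests > max_error_rate


def version_to_serve(stats: ReleaseStats, new_version: str,
                     previous_version: str) -> str:
    """Return the version that should be serving traffic."""
    return previous_version if should_roll_back(stats) else new_version
```

This only covers the decision. Restoring data written by the faulty release is a separate problem, which is where the data loss risk comes from.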

Fault injection testing

You might think that you have created the perfect system to handle any kind of problem, and you want to confirm that. Fault injection testing will help you with that. You create a problem in the live production environment on purpose and see whether your software can handle it or not. This lets you test specifically for the errors and mishaps you are most worried about.
It is important to know that fault injection testing mostly goes hand in hand with recovery testing, which handles the processes that cannot be tested in a live environment.
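Here is a small sketch of the idea: a dependency that fails on purpose at a configurable rate, and a caller that is expected to survive the failure. The recommendation service, the `failure_rate` parameter and the fallback behaviour are all invented for the example.

```python
import random
from typing import List, Optional


class InjectedFault(Exception):
    """Raised on purpose to simulate a failing dependency."""


def fetch_recommendations(user_id: str, failure_rate: float = 0.0,
                          rng: Optional[random.Random] = None) -> List[str]:
    """Pretend dependency that fails for a share of calls when asked to."""
    rng = rng or random.Random()
    if rng.random() < failure_rate:
        raise InjectedFault("simulated outage")
    return [f"item-for-{user_id}"]


def recommendations_with_fallback(user_id: str, failure_rate: float = 0.0,
                                  rng: Optional[random.Random] = None) -> List[str]:
    """The behaviour under test: degrade gracefully instead of crashing."""
    try:
        return fetch_recommendations(user_id, failure_rate, rng)
    except InjectedFault:
        return []  # show the page without recommendations
```

With `failure_rate` left at 0.0 the fault never fires, so the same code path can stay in place and be switched on only for the duration of a test.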

Continuous monitoring

Finally, the production environment needs to be monitored continuously. Once the software is deployed, you will be able to identify a number of issues that could not be identified beforehand, simply because earlier environments had a smaller data set and less traffic. Problems like slow-loading pages become obvious when you continuously monitor the production environment.
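Slow-loading pages, for instance, can be caught by comparing a high percentile of recent response times with a known baseline. The sketch below uses a simple nearest-rank percentile; the 95th percentile and the 1.2 tolerance factor are assumptions picked for the example.

```python
import math


def percentile(samples, pct: int):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_alert(samples_ms, baseline_p95_ms: float,
                  tolerance: float = 1.2) -> bool:
    """True when the current p95 is more than `tolerance` times the baseline."""
    return percentile(samples_ms, 95) > baseline_p95_ms * tolerance
```

A percentile is used instead of the average because a handful of very slow page loads can hide behind a healthy-looking mean.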

Coming to the implementation of TIP

By now you know almost everything about Testing in Production: the meaning, the benefits, the drawbacks and the ways to perform it. Is there anything left to say or ask? Yes, there is, and it is a rather important question: where does it all begin? Where do you start?

Do you just implement all the tests and strategies mentioned above at once, or do you do them one at a time, or do you just select one and forget the rest? And when is the right time? How can you be sure?

You can’t, and that is why I am not going just yet, at least not until I tell you all about the implementation plan, that is, the stages of testing in production. I can’t leave you hanging, can I? So here we go.

The beginning

The beginning marks the first month of the implementation plan. This is the stage that works on the alignment of the project, wherein all the details for testing in production are worked out for the team.

The main agenda items are:

To ensure the automation framework in place is easy to use and implement, without being a roadblock, and that end-to-end tests are easy to write.
To ensure that the feature flagging tool is administered properly by focusing on the setup of SSO, permissions, user creation and user maintenance, so that you are ready for the implementation of the SDKs.
To ensure you have a benchmark in place based on the gathered baseline metrics. These include everything from release time to load time and percentage of bugs. You need these to know where you stand and to make subsequent improvements.
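The baseline from the last point can be as simple as a dictionary of numbers that you compare later measurements against. The metric names and values below are invented for illustration.

```python
# Hypothetical baseline gathered before TIP was introduced.
BASELINE = {
    "release_time_hours": 48.0,
    "page_load_ms": 1800.0,
    "bug_rate_pct": 4.0,
}


def change_vs_baseline(current: dict, baseline: dict = BASELINE) -> dict:
    """Percentage change of each metric relative to its baseline value."""
    return {
        name: round((current[name] - base) / base * 100, 1)
        for name, base in baseline.items()
    }
```

A negative number means the metric went down, which for all three of these example metrics is an improvement.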

The progression

The next stage of the plan, the progression, is taking all the work from the previous step forward.

The agenda for this stage includes:

Mirroring your current environment setup. Every stage of your software development lifecycle should be reflected in it.
Adding teams to the whitelist of each environment through the same section in the feature flag configuration. For instance, the production environment will have the product team on its whitelist, provided that team validates features after they are released to production.
Differentiating test data from real data during production. Setting a boolean will help you with that. You can also create a separate database for real users in the testing phase to avoid any confusion while making business decisions.
Ensuring that the team knows that a feature or a piece of software cannot be considered done while the tests are still running.

The finale

Then comes the finale, wherein the actual fun is done. Hey, that rhymes!

The agenda is quite clear here.

To deploy the first feature in the production environment, but to only a selected number of users, through a feature flag.
To run the automation scripts in production as well as to ensure that the previous features are working efficiently alongside.
To then disable the feature flag. With your team now having access to the released feature, you can resolve any bugs and functionality issues without impacting the end user by any means.

This would give you the confidence to release the feature to 1% of the user base through the canary release we talked about before. You would keep on testing and rectifying any issues you face. This way you would keep gaining confidence and keep increasing the percentage of users the feature is released to, until that percentage covers your entire user base. It usually takes about 90 days of testing.
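The gradual increase can be sketched as a list of steps and a rule for moving between them. Only the 1% start and the 100% end come from the plan above; the intermediate steps and the idea of a single `healthy` signal are my own assumptions for the example.

```python
# Hypothetical ramp: only the first and last steps come from the plan itself.
RAMP_STEPS = [1, 5, 25, 50, 100]


def next_percentage(current: int, healthy: bool, steps=RAMP_STEPS) -> int:
    """Advance one step when the release looks healthy, otherwise hold."""
    if not healthy:
        return current  # stay put while issues are being fixed
    for step in steps:
        if step > current:
            return step
    return steps[-1]  # already at full rollout
```

For example, a healthy release at 1% moves to 5%, while an unhealthy one stays at 1% until the issues are rectified.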


Conclusion

Now, there is truly nothing left for me to say, except one thing, and that is retrospection. If you are a developer, you are going to keep building new features and apps throughout your life. So, it would be wise to know what you did correctly and what went wrong.

Therefore, when you are done with testing in production, don’t just sit back and relax. Instead, retrace all your steps and see what worked, what didn’t and which parts still require improvement. Document everything so that you have a record to look back at when the same feature needs to be upgraded again. A developer’s work never ends, and this just proves that.