Why you should write automated tests

All posts in this series:

This post: Why You Should Write Automated Tests
Anatomy of a Good Test
Tests Granularity
Test Strategy
Tests Organization and Naming

In many projects we’ve seen, writing automated tests is a task that nobody really enjoys. It’s common sense though: to be a good developer you have to write tests. It might even be useful sometime in the future, eventually preventing some bugs. So we should do it. But it tends to be thankless work that can become quite tedious.

A common coping mechanism is to postpone the task to the very end of feature development. “The feature is done, everything works fine. The only thing we have to do is to write some tests.” is a statement we have heard in several projects. It’s no surprise that many project managers see automated tests as low-priority technical tasks that can be done later, when there’s some spare time. From their perspective, development time is precious and is better invested in new features. Even when the automated tests are green, there’s no guarantee of the functional quality of the software. Tests are written by the developers and usually focus on technical details. The functionality has to be tested manually anyway. So why should we waste time writing tests?

Of course, there’s no project out there that fully matches the above description, or do you know one? Automated tests clearly have their benefits, though. These benefits might be bigger or smaller, depending on the quality of the tests. In the worst case, a few tests within a test suite even reduce the overall benefit. But, in the end, no project would want to live without any automated tests. For this post, let’s assume the automated tests are properly designed and written. We will explain what this means, from our point of view, in a follow-up post.

Although it’s quite clear that writing automated tests is a good idea, it’s still important to know their benefits. Maybe those benefits could be increased by changing how or when tests are written. Or, the other way round, some of those benefits may not be necessary in the specific project context. In the end, you don’t want to write tests just because “everybody should do it”, but because they provide value and help you, your team and your company in some way.

Prevent regressions

If automated tests are written as the very last step of the development process, even after the feature has been tested manually, the main benefit they provide is preventing future regressions. This is where automated tests shine. If every relevant piece of functionality in the application is verified by an automated test, you can be confident about any change you make. The moment you break something by accident, at least one test will turn red and inform you about the problem. Seen from this perspective, automated tests are a kind of insurance: something you invest in right now to prevent damage in the future.

If we think about the real benefit of this regression prevention we have to take a few things into account.

How complex is the code under test?
How often will it change in the future?
How critical is it for the functionality of the overall system?

If, for example, a piece of code is very complex, will probably change very often, and every bug in it will lead to an unusable system, then it’s obviously a very good idea to spend some hours or even days writing a comprehensive suite of tests for it. Fortunately, the number of such components in most projects is quite low (if not zero). On the other hand, think about a completely trivial component with low (or even medium) criticality that will rarely change. Maybe it’s fine if we don’t write any dedicated tests for this one at all. If it’s embedded in some critical processes, it might be implicitly tested by some more integrative tests anyway. A plain old Java bean is a good example of such a component.

Unfortunately, there is no golden rule to tell you whether you should write a test for a given piece of functionality. You need to use common sense to decide, based on your concrete project context. You could try to assess the risk for your system. Risk is usually calculated as the probability of an error multiplied by the damage it could cause. One good tool that helps here is a risk matrix. Even a simple 3x3 matrix with “low”, “medium” and “high” values on both the “probability” and “severity” axes can help you estimate. You can also ask yourself a few questions. If my system misbehaves, will it kill somebody (see the case of Therac-25)? Will my company lose millions of dollars (ever heard of Knight Capital Group)? Will the colors on the company web page be correct? Will the users get their answers one second later? The cost of fixing the problem also needs to be taken into consideration. It may be that bugs cannot be fixed after the release at all, or only at great cost because, for example, your software controls Toyota cars.

Verifying correctness

When working on a piece of code, the most common question is: is it working as expected? In theory, correctness can either be formally verified or it can be tested. The formal approach is extremely costly and therefore very rarely used. The testing approach is far more affordable in terms of both cost and time. The piece of code can be executed with various inputs and the expected outputs can be verified. This verification has to be done from different perspectives. For example, are all the new or changed pieces connected correctly, and does the whole component work as expected, including the new behavior? Maybe it’s also necessary to verify that the performance was not negatively affected.

All those tests can be done in various ways. They can be executed manually just by starting the application, stimulating it in a certain way, and verifying that it responds as expected. That can be done by the developers themselves, or by someone else, e.g. a fellow developer or someone in an explicit QA role.

Of course, the preferred way would be to write some automated tests. Writing those tests might take longer than testing manually once, but it will save everyone working on the application from manually testing the same behavior over and over again. Manually testing every change will quickly add up to more effort than implementing these tests once. Moreover, it might be impossible or at least very hard to test all the conditions and edge cases manually. With automated tests, far more combinations of inputs can be verified. The ZOMBIES approach might give hints on what to verify.
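As a minimal sketch of what this can look like, the following test walks through some of the ZOMBIES categories (zero, one, many, boundaries, exceptions) for a small function. The function `average` and its behavior are made up for this example. We use Python’s built-in `unittest` module here, but any test framework would do.

```python
import unittest


def average(values):
    """Hypothetical function under test: arithmetic mean of a list of numbers."""
    if not values:
        raise ValueError("average() of an empty list is undefined")
    return sum(values) / len(values)


class AverageTest(unittest.TestCase):
    def test_zero_elements_is_rejected(self):
        # Zero / Exceptions: the empty case is an explicit error.
        with self.assertRaises(ValueError):
            average([])

    def test_one_many_and_boundary_inputs(self):
        # One, Many and Boundaries, expressed as a table of examples.
        cases = [
            ("one element", [4], 4.0),
            ("many elements", [1, 2, 3, 4], 2.5),
            ("negative values", [-2, 2], 0.0),
            ("large values", [10**18, 10**18], 1e18),
        ]
        for name, values, expected in cases:
            # subTest reports each failing row separately.
            with self.subTest(name):
                self.assertEqual(average(values), expected)
```

Saved to a file, these tests can be run with `python -m unittest`. Adding one more edge case costs one more row in the table, which is hard to match with manual testing.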

Such an automated test suite would also be able to tell if it’s safe to release. If all required functionality is covered with automated tests and all those tests are passing - you’re done. You can push the application to production. You might want to refactor and clean up a few things, but as long as the tests are green, you can release any time. That, of course, requires having good tests for everything relevant in your application, but that’s the goal, anyway.

This aspect of the tests can and should also be used when fixing bugs. The best way of doing that is to first write a test that demonstrates the issue, then fix it, and then re-run all the tests and see them turn green. Such tests help to clarify how the system should behave, in addition to preventing the same bugs from reappearing. Also, while adding such a test you might think of some other corner case that hasn’t been covered so far. Maybe you could fix more than one bug in one go?
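The workflow above could look roughly like this. The function `parse_price` and the bug report are invented for illustration: we assume that prices containing a thousands separator used to raise an error.

```python
import unittest
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Hypothetical function: parse a price such as "1,299.00".

    Assumed bug report: inputs with a thousands separator raised an error.
    The fix is the replace() call below.
    """
    return Decimal(text.strip().replace(",", ""))


class ParsePriceRegressionTest(unittest.TestCase):
    def test_price_with_thousands_separator(self):
        # Written first to reproduce the reported bug; it failed before the fix.
        self.assertEqual(parse_price("1,299.00"), Decimal("1299.00"))

    def test_price_with_surrounding_whitespace(self):
        # A neighbouring corner case noticed while writing the test above.
        self.assertEqual(parse_price(" 15.50 "), Decimal("15.50"))
```

The first test stays in the suite after the fix, so the same bug cannot silently come back.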

Risk-free refactoring

When we think about potential future changes in a component, we should not forget the immediate future. The code you write in the first pass is usually not the code you deploy to production. After developing a feature and looking at the code again, there will most probably be something that can be simplified or made cleaner. Maybe you don’t identify those places yourself. Maybe you don’t even look at your code again before pushing it. But there might be a colleague who does a review and comes up with some improvements. With a comprehensive set of tests, refactoring can be done safely. In many cases, it can even be done ‘on-the-fly’ along with feature development. Without tests, that’s rarely possible and a potential refactoring is often skipped because it would take too much time.
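A tiny sketch of the idea: the checks below only describe behavior, so they can be run against the first version of the code and against the cleaned-up version. All names (`total_before`, `total_after`, `check_order_total`) and the order-line structure are hypothetical.

```python
def total_before(order_lines):
    """Original version: explicit loop with a running total."""
    total = 0
    for line in order_lines:
        total = total + line["quantity"] * line["unit_price"]
    return total


def total_after(order_lines):
    """Refactored version: same behavior, expressed with sum()."""
    return sum(line["quantity"] * line["unit_price"] for line in order_lines)


def check_order_total(order_total):
    """Behavioral checks that do not depend on the implementation."""
    assert order_total([]) == 0
    assert order_total([{"quantity": 2, "unit_price": 5}]) == 10
    assert order_total([
        {"quantity": 1, "unit_price": 3},
        {"quantity": 4, "unit_price": 2},
    ]) == 11


# The same checks pass before and after the refactoring.
check_order_total(total_before)
check_order_total(total_after)
```

In a real codebase there would be only one function, and the test suite would simply be re-run after each refactoring step.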

Strictly speaking, enabling risk-free refactoring is also a kind of regression prevention. But a regression is usually understood as something that happens some time after the feature development, for example by accident when implementing a new feature, so it may be a good idea to think about the two separately.

To blur this boundary even more, a potential refactoring might also be identified only long after the feature development. In this case, it might be a good idea to write some missing tests for exactly this reason: to allow a risk-free refactoring of the component. Once written, the tests will usually stay in the test suite and prevent regressions in the future, too. But the original reason and the main benefit, in this case, was helping with the refactoring.

Documentation

A very important (and often unrecognized) aspect of having automated tests is that they document the behavior of the system and its components. What is good about this documentation is that it cannot get outdated. If there’s any misalignment between the tests and the production code, you’ll notice it and you’ll need to fix one or the other. This documentation serves multiple purposes.

The tests show the expected behavior of the system and its pieces. If we need to know how the application will behave in a certain situation, we can look at the test cases and work it out. If we cannot find a corresponding example, we can write a new test, both to find out how the application behaves and to extend the documentation. It would be a perfect example of documentation that shows how the application can be used. Properly written tests also verify the behavior and ignore implementation details, so such documentation ideally shows only the desired behavior and skips unimportant things.

Acceptance tests are a very good example of documentation. Of course, it’s not their primary function, but they’re doing it anyway - they describe how the application should behave to satisfy the customer. This is, by far, the most important thing we need to verify before releasing the application. It would be best if those tests were automated, but even if they’re not, they’re still part of the documentation of the application.

A related idea is “specification by example”. Sometimes it is easier, for both writer and reader, to specify the behavior of the application using examples rather than abstract descriptions. Besides, examples are far easier to automate: you just need to extract the values from the examples, execute the application or a specific part of it, and verify that the output is as expected. Of course, the examples need to be exhaustive enough to cover all desired behavior, but you’d need some examples anyway before implementing anything new or changing existing code.
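To illustrate how examples can be turned into an executable check, here is a small sketch. The shipping rules, the `EXAMPLES` table and the function names are hypothetical. In practice the rows would come from the people specifying the behavior.

```python
# Each row is one example from the (hypothetical) specification:
# order value in euros, destination, expected shipping cost in euros.
EXAMPLES = [
    (20, "domestic", 5),
    (50, "domestic", 0),  # free domestic shipping from 50 upwards
    (20, "international", 15),
    (50, "international", 15),
]


def shipping_cost(order_value, destination):
    """Hypothetical implementation derived from the examples above."""
    if destination == "international":
        return 15
    return 0 if order_value >= 50 else 5


def failing_examples():
    """Run every example and collect those whose output differs."""
    failures = []
    for order_value, destination, expected in EXAMPLES:
        actual = shipping_cost(order_value, destination)
        if actual != expected:
            failures.append((order_value, destination, expected, actual))
    return failures
```

An empty list returned by `failing_examples()` means the implementation agrees with every example in the table.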

Another aspect of such documentation is showing how to do certain things. This is mostly relevant to developers joining the project. If you don’t know how to create an object - find a test that does exactly that. If you want to know which parameters are required for a method call - there will be tests somewhere that show that. If you’d like to know the conventions used - read through the tests and you’ll know. If you’re wondering what kind of responsibility a class or a function has - the tests will tell you all about it. And this aspect of automated tests cannot be overestimated. Imagine coming back to a codebase you haven’t seen for a few months. You’ll be thanking yourself for writing all those automated tests. You could, of course, argue that all those things can also be found in the production code. In practice, finding them in the tests might be easier, because each test should focus on exactly one thing, like validating the input parameters of a method call. In production code, such an activity would be executed in the context of some bigger operation and it could be difficult to tell which things influence the aspect you’re interested in.

Tests might also refer to things that are not that visible in the production code. You might have used some clever trick or a side effect of a third-party library to achieve your goal, but in the production code it might look like an implementation detail. For example, some external function you’re calling always returns items in a specific order and you use that fact to your advantage. But the authors of the function might decide at some point that the order will change or will no longer be guaranteed. If you did yourself a favor and wrote a test that verifies that your piece of code works as intended, and that test now fails, you’ll know where to look and what to fix. Apart from the documentation aspect, such a test also prevents regressions, so it provides even more value.
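A sketch of such a test is shown below. `fetch_events` is a stand-in we made up for the external function; in a real project it would be imported from the third-party library, and the test would call the real thing.

```python
def fetch_events():
    """Stand-in for a third-party call (hypothetical).

    We assume the real function happens to return events oldest first,
    although its documentation does not promise this.
    """
    return [
        {"id": 1, "timestamp": 100},
        {"id": 2, "timestamp": 250},
        {"id": 3, "timestamp": 400},
    ]


def latest_event(events):
    """Production code that silently relies on the ordering."""
    return events[-1]


def test_events_arrive_oldest_first():
    # Documents the assumption and fails loudly if the library changes it.
    timestamps = [event["timestamp"] for event in fetch_events()]
    assert timestamps == sorted(timestamps)


def test_latest_event_is_the_most_recent_one():
    events = fetch_events()
    newest = max(event["timestamp"] for event in events)
    assert latest_event(events)["timestamp"] == newest
```

The first test makes the hidden assumption explicit, so a reader of the test suite learns about it even if the production code never mentions it.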

All the above examples assume tests written “in code”, using the programming language of your choice. You could go even further and try to write your tests in a human-readable language using tools like Cucumber. That could improve your ability to document your application. As already pointed out, that’s not strictly necessary: even tests written in a programming language can and should serve as documentation. We would even argue that some high-level tests, when well written in code, could be readable by non-developers.

What’s next?

We hope we have convinced you (if that was necessary) that it’s worth writing automated tests, and worth doing it well to gain as much benefit as possible. But we have merely scratched the surface, and we will be posting more about automated testing. The topic is very important, but at the same time not that easy to master. You wouldn’t believe how much we ourselves struggle with it. But having gained some experience already, we want to share with you what we’ve learned so far. Do expect more from us, for example:

what are the qualities of a good test,
how to generate test data,
what is the relationship between tests and design,
common mistakes and how to avoid them,
and more

TL;DR

A well-written test can provide multiple benefits. During the implementation of a feature, the corresponding tests verify its correctness. After the development, they allow for safe refactoring and cleanup. The same is true for changes caused by new requirements. By documenting the current behavior, tests help to understand the components of the system and especially the implications of future changes. Automated tests save time (and, as a consequence, money) throughout the whole software development and maintenance process.

Many thanks to Eberhard Wolff, Joachim Praetorius, and other colleagues for their feedback and suggestions to improve this post.

Header Photo by Mitchell Griest on Unsplash