I absolutely believe in testing, but I don’t believe in absolutes (“Everything in moderation, including moderation itself” is one of my mantras). I was recently asked how a team can know if it is testing the right stuff, so I’ve decided to share my answer here, hopefully as less of a tangential rant than it probably came out off the cuff.
Get Into The Testing Mindset
To test well, you first have to have a mindset focused on quality. You have to believe that writing tests is an important part of the practice of engineering, and that you will gain more from taking the time to write the tests than you will lose from not writing more functionality in the same time. If you don’t believe that, then there is a whole slew of other posts you can read if you want someone else to convince you of it.
You will never hear me demand 100% test coverage. It is a very noble goal, and if you can get to it, in the right way, then wonderful. But I have seen my fair share of useless and counterproductive tests that were written in the name of getting to 100% coverage. And in demanding 100% coverage, a team can drift far off course from the high-quality software that the coverage goal was meant to produce.
Coverage metrics are one of those places where we get bogged down in our numbers-driven world and forget what those numbers are supposed to mean. Way more important than whether all of the lines of code have been executed is whether the test validated that the functionality was as desired. Or that upon fixing a bug, that class of bug will not resurface. Or that you can change the implementation with confidence that you will not introduce a new bug.
Since engineering is a discipline of weighing trade-offs, there will always be another force pushing you away from 100% test coverage. That is OK, as long as you decide what parts of your application absolutely must have good coverage. You will have to weigh the risks of a lack of testing against the risks of a lack of functionality. Sometimes a new feature is more important at that time than a higher level of test coverage. Making those trade-offs is the life of an engineering team.
tl;dr: Show Me Some Code
This is a lot of talk about philosophy, Jeff. Get to some code.
Below are a couple of examples of poorly written code and their poorly written tests to give some context on writing quality software. Most of the issues illustrated would not exist in a statically typed language or with better immutability built in, but I won’t blame bad test writing on the language of choice. As you’ll see, the examples can be vastly improved in the language used. It is the responsibility of the engineer to think about what the test should do and what is important to make sure never breaks. It is that engineer’s responsibility to check that the test itself was working and not just jump from “green light” to “commit.”
Missing Pre- And Post-conditions
A portion of this class of testing could be ignored in a statically typed language, because the compiler would help, but it won’t catch all bad inputs and unexpected outputs. So, you still need to be aware of what is going on. Here is an egregiously bad piece of code and an even worse test for it.
function triple(number) {
  return number + number + number;
}
The function is straightforward in its functionality, but in a language like JavaScript, it has a lot of room for failure. Here is a 100% coverage test for it that is sorely lacking in value.
tap.test('triple', function (test) {
  test.equal(triple(3), 9);
  test.end();
});
Woo-hoo! We hit 100% coverage. But we didn’t check what happens when you call the function with null, or with no argument at all (triple()), or with a string. Part of the issue is that the function is not too well specified, but if the function is meant to behave in certain ways for those failure cases, then we need to exercise them in the tests so that other engineers know what to expect and so that if we want the function to change behavior, we have a baseline off of which to make that decisive change.
Check the data type, check the valid input ranges, check the output.
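As a rough sketch of what exercising those failure cases might look like, the block below first pins down what the original triple actually returns for null, a missing argument, and a string, and then tests a stricter variant. The strictTriple name and the decision to throw a TypeError are assumptions made for illustration, since the function's failure behavior is left unspecified here. Node's built-in assert module stands in for tap only to keep the example self-contained.

```javascript
const { strictEqual, throws } = require('node:assert');

// The function from the post, unchanged.
function triple(number) {
  return number + number + number;
}

// What the original does with inputs the 100% coverage test never tried.
strictEqual(triple(null), 0);               // null coerces to 0
strictEqual(Number.isNaN(triple()), true);  // undefined + undefined is NaN
strictEqual(triple('3'), '333');            // strings concatenate

// Hypothetical stricter variant: rejecting bad input is an assumption,
// because the intended failure behavior is not specified.
function strictTriple(number) {
  if (typeof number !== 'number' || !Number.isFinite(number)) {
    throw new TypeError('strictTriple expects a finite number');
  }
  return number * 3;
}

// Valid inputs, including a negative value.
strictEqual(strictTriple(3), 9);
strictEqual(strictTriple(-2), -6);

// Invalid inputs: wrong type, missing argument, and NaN.
throws(() => strictTriple(null), TypeError);
throws(() => strictTriple(), TypeError);
throws(() => strictTriple('3'), TypeError);
throws(() => strictTriple(NaN), TypeError);
```

Whether throwing is the right behavior is a design decision for the team. The point is that whichever behavior is chosen gets written down as a test.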
Mutable Parameters
This is another class of failure that would be eliminated in a different language (I’m looking at you, Clojure and Haskell) or with very strict coding guidelines (“always use Immutable JS data structures”), but in a language as fluid as JavaScript, you can shoot yourself in the foot without even knowing the gun was loaded. Here is a fairly degenerate case of a function and test, but it is a case I have seen, and when buried in a big test suite, it is hard to discern that the test is actually worthless.
function calculate_total(order) {
  var price = order.price;
  var tax_rate = 0.10;
  var total = price * (1 + tax_rate);
  order.total = total;
  return order;
}

tap.test('calculate_total', function (test) {
  var order = { price: 8, total: 8.64 };
  var result = calculate_total(order);
  test.equal(result.total, order.total);
  test.end();
});
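This function has a bug (or maybe the test does). The tax_rate is set to a value other than what the test expects: the test’s 8.64 implies an 8% tax on a price of 8, while the function applies 10%. But, given how the test has set up its data structures and its assertions, it will never fail. The function mutates the order it receives and returns that same object, so result.total and order.total are the same property, and the assertion compares a value with itself.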
You can look at this and say to yourself, “I would never write code that bad or a test that worthless,” and I hope that is true. But it happens, and you should be on the lookout for it.
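One way to make the mismatch visible, sketched under a few assumptions: keep the expected value in its own variable instead of inside the object handed to the function. The expectedTotal and mismatchDetected names are illustrative, and Node's built-in assert replaces tap so the block runs on its own. The sketch does not decide whether the function or the test is wrong; it only shows that an independent expectation notices the disagreement.

```javascript
const { strictEqual } = require('node:assert');

// The function from the post, unchanged (10% tax, mutates its argument).
function calculate_total(order) {
  var price = order.price;
  var tax_rate = 0.10;
  var total = price * (1 + tax_rate);
  order.total = total;
  return order;
}

const order = { price: 8 };   // no precomputed total inside the input
const expectedTotal = 8.64;   // the value the original test expected
const result = calculate_total(order);

// The returned object is the input object, which is why the original
// assertion could never fail.
strictEqual(result, order);

// Compared against an independent value, the disagreement shows up.
// In a tap test, test.equal(result.total, expectedTotal) would fail here.
const mismatchDetected = Math.abs(result.total - expectedTotal) > 1e-9;
strictEqual(mismatchDetected, true);
```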
How To Test Better
Here are a few ways you can write better tests.
Watch Your Tests Fail
One of the central tenets of test-driven development is to write your tests first and then write code until they pass. That solves for a certain class of test failure, but it still leaves open the possibility of writing the bad tests above. Especially when the test writing seems to have gone too well, I like to change the test to see if I can get it to fail. Perhaps I change an expected value to make sure I get a failure. Maybe I change an input to force a failure. Even these small moves can help build confidence that your test is doing its job. Be skeptical, even of the code you wrote yourself.
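A minimal sketch of that habit, not a prescribed technique: run the same assertions against a deliberately broken implementation and confirm that they fail. The tripleTest and brokenTriple names are hypothetical, and Node's built-in assert is used in place of tap to keep the block self-contained.

```javascript
const { strictEqual, throws, AssertionError } = require('node:assert');

// The test under scrutiny, written so it can run against any implementation.
function tripleTest(impl) {
  strictEqual(impl(3), 9);
  strictEqual(impl(-2), -6);
}

const triple = (n) => n + n + n;
const brokenTriple = (n) => n + n;  // deliberately wrong (hypothetical)

// The real implementation passes.
tripleTest(triple);

// The broken implementation must make the test fail. If this line did not
// throw an AssertionError, the test would not be checking anything useful.
throws(() => tripleTest(brokenTriple), AssertionError);
```

Doing this by hand, once, while writing the test is usually enough. The code version just makes the idea concrete.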
Automation
Tools like QuickCheck from Haskell can help with bounds checking and generating enough random inputs that you won’t have to think about the first example case too much. I’m sure there are similar tools that help in other ecosystems and other tools that can check the validity of your tests. Using them as much as you can will ease the burden of thinking through all possibilities on your own.
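To give a feel for the idea without pulling in a library, here is a hand-rolled sketch that feeds random integers to triple and checks properties that should hold for every one of them. It is far simpler than QuickCheck (no shrinking, no smart generators), and the randomInt helper and the chosen input range are assumptions for illustration. In JavaScript, a library such as fast-check offers the real thing.

```javascript
const { strictEqual } = require('node:assert');

const triple = (n) => n + n + n;

// Hypothetical helper: a random integer between min and max, inclusive.
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Check properties over many random inputs instead of one hand-picked value.
for (let i = 0; i < 1000; i++) {
  const n = randomInt(-1000000, 1000000);
  const result = triple(n);

  strictEqual(typeof result, 'number');  // output type
  strictEqual(result, 3 * n);            // agrees with multiplication
  strictEqual(result - n - n, n);        // removing two copies leaves one
}
```

Random integers will not stumble onto null or a string, so this complements the explicit bad-input tests and does not replace them.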
Future-Proof
My favorite tests to write are not so easy to give examples for. They are the ones that signal when future implementation changes have added an unexpected condition. Sometimes that is a bug, but sometimes that is the beginning of a talk about desired functionality. These tests are hard to write, because you have to think about how the code could change and write a test that will guard against it. You want someone to change code that breaks the test and then, hopefully, instead of just changing the test to pass, that person starts a conversation about the functionality. “I just broke a test in an unexpected way. Should we really be adding this new option?” That is value with a capital “V.”
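One modest example of such a guard, offered as a sketch and not as the only way to do it: pin down the exact set of fields a function returns, so that a later change which quietly adds a new one breaks the test and prompts the conversation. It reuses calculate_total from earlier, and Node's built-in assert stands in for tap.

```javascript
const { deepStrictEqual } = require('node:assert');

// The function from the post, unchanged.
function calculate_total(order) {
  var price = order.price;
  var tax_rate = 0.10;
  var total = price * (1 + tax_rate);
  order.total = total;
  return order;
}

const result = calculate_total({ price: 8 });

// Guard the shape of the result, not just one value. If a future change adds
// a field (a discount, say), this fails and someone has to ask whether the
// new field is intended.
deepStrictEqual(Object.keys(result).sort(), ['price', 'total']);
```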
Discipline
This is my current favorite answer, because it will always be useful, even when your toolset changes. Engineering is a discipline. Quality must be imbued through the practice. “Quality is job 1.” It is not just a feature to tack on at the end of your work when you think the functionality is complete. And of course, I don’t mean to imply that quality is only derived from good testing. Doing this job well requires doing it thoughtfully and being disciplined. Writing good tests is one of the tasks of the job that takes extra special attention.