The Golden Rule of Unit Testing

by Ville Laurikari on Tuesday, May 12, 2009

From the Wikipedia article on unit testing:

“In computer programming, unit testing is a software verification and validation method where the programmer gains confidence that individual units of source code are fit for use. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class.”

If you take this definition of unit testing literally, you are going to be testing exactly the wrong things.

As an example, let’s assume we have a piece of code which is responsible for parsing some XML strings into data structures. This isn’t arbitrary XML; the strings are all of this form:

<user>
  <id>evalua</id>
  <name>Eva Lu Ator</name>
  <message>Anything in parentheses can be left out.</message>
</user>

The result from parsing should be a record in memory with the id, name, and message all parsed out nicely. Let’s further assume that this code is implemented, foolishly, using regular expressions, because the person who implemented it thought that using an actual XML parser would be overkill for such a simple task. Sigh. At least the code is all in its own module with a clear signature:

structure Parser :> sig
   type result = {id : string, name : string, message : string}
   val parse : string -> result
end = struct
   (* imagine a barrel full of hairy regex matching code here *)
end

(That’s Standard ML, you insensitive clod.)

The final assumption is that the person who wrote this code, let’s call him Harry, had heard that unit testing is a good thing, and had decided to write some tests. Our hapless Harry sought out the Wikipedia article to double check what unit tests actually are and proceeded to write tests for each and every function, plus lots of stubs so that he could successfully test each function in perfect isolation. The amount of test code Harry has written is about three times the amount of his hairy regex matching code.

Harry’s code works fine, until the first XML string with a multiline message comes up.  It wasn’t in the original requirements, so Harry didn’t originally add support for it because he had also read You Ain’t Gonna Need It.  Now he has to make a few adjustments to the code; maybe he splits one function into two and replaces another function with new code.  He makes little whimpering noises as he deletes the now defunct tests for the old code.  He curses under his breath when he writes the new test drivers and stubs for the new code.  But he manages to get it all done, and the code goes into production.

The holiday season arrives, and Harry flies off to a long-awaited vacation in Thailand. While Harry is sipping piña coladas at the pool bar, his colleague, Ethan, gets an urgent bug to fix. It turns out someone tried to put the ‘<’ and ‘>’ characters in the message, resulting in &lt; and &gt; in the XML, and Harry’s XML parser doesn’t handle them at all. Ethan stares at Harry’s code for a while, deletes everything except the module signature, and writes some code which calls a real XML parser and picks the needed parts straight from the DOM tree.

He writes a test program which parses example XML snippets and checks the results against known correct results.  He writes another test where the XML parser is a stub which fails in random ways (returns bad DOM trees, throws exceptions, etc.) to ensure his code can handle those situations.  Confetti starts streaming from the air conditioning vents, the birds outside burst into song, and a team of furiously cheerful tap-dancers dance along the corridors, joined by the office staff, all celebrating Ethan’s genius.
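Ethan's two tests might look roughly like the sketch below. It is in Python rather than Standard ML so that it runs with nothing but the standard library, and it rests on several assumptions that are not in the story: the names Result, ParseError, and the xml_parser argument are made up for illustration, and the stubs fail in fixed ways instead of random ones so that the test is repeatable.

```python
import xml.etree.ElementTree as ET
from typing import Callable, NamedTuple


class Result(NamedTuple):
    """Python stand-in for the `result` record in the module signature."""
    id: str
    name: str
    message: str


class ParseError(Exception):
    """Hypothetical error type: the one failure callers have to handle."""


def parse(xml_text: str,
          xml_parser: Callable[[str], ET.Element] = ET.fromstring) -> Result:
    """Parse one <user> document; `xml_parser` can be swapped out in tests."""
    try:
        root = xml_parser(xml_text)
        if root.tag != "user":
            raise ValueError("root element is not <user>")
        fields = {tag: root.findtext(tag) for tag in Result._fields}
    except Exception as exc:
        # Deliberately broad: whatever the parser does wrong, callers
        # only ever see ParseError.
        raise ParseError(str(exc)) from exc
    if None in fields.values():
        raise ParseError("missing id, name, or message")
    return Result(**fields)


# Test 1: example snippets checked against known correct results.
EXAMPLES = [
    ("<user>\n  <id>evalua</id>\n  <name>Eva Lu Ator</name>\n"
     "  <message>Anything in parentheses can be left out.</message>\n</user>",
     Result("evalua", "Eva Lu Ator",
            "Anything in parentheses can be left out.")),
    ("<user><id>a</id><name>A</name><message>line 1\nline 2</message></user>",
     Result("a", "A", "line 1\nline 2")),
    ("<user><id>b</id><name>B</name><message>1 &lt; 2 &gt; 0</message></user>",
     Result("b", "B", "1 < 2 > 0")),
]


def test_known_answers():
    for xml_text, expected in EXAMPLES:
        assert parse(xml_text) == expected


# Test 2: stubbed XML parsers which misbehave.
def raising_stub(_text):
    raise RuntimeError("simulated parser crash")


def bad_tree_stub(_text):
    return ET.Element("garbage")  # a valid tree of the wrong shape


def none_stub(_text):
    return None  # not a tree at all


def test_failing_parser_is_contained():
    for stub in (raising_stub, bad_tree_stub, none_stub):
        try:
            parse("<user/>", xml_parser=stub)
        except ParseError:
            continue
        raise AssertionError(f"{stub.__name__} did not raise ParseError")


test_known_answers()
test_failing_parser_is_contained()
```

Both tests talk to the module only through parse, so they would keep working if the XML library underneath were replaced again.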

What did we learn from all this?  For one thing, unit tests won’t protect you against incompetent design. Second, blindly following rules will probably just get you in trouble. Most importantly, unit tests make the code rigid. It makes little sense to write heaps of testing code against interfaces which will be changed soon.  Harry spent a lot of time rewriting test code when he had to add features to his parser.  Even though I just said that you shouldn’t be just following rules, I will nevertheless introduce the golden rule of unit testing:

Test against interfaces which are not likely to change.

When the eventual refactoring is done, chances are that you’ll want to preserve some interface. In Ethan’s case, he wasn’t so much refactoring as rewriting, but the module interface stayed the same. Tests using only that module interface will help Ethan to test his new code. Glass box tests designed for a particular implementation of the interface are not going to be helpful for Ethan.
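To make the point concrete, here is a small Python sketch in which one set of interface-level cases is run against two interchangeable implementations. Everything in it is an assumption made for illustration: parse_with_regex is only a guess at the general shape of Harry's approach, parse_with_dom at Ethan's, and the case names are invented.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, List, NamedTuple


class Result(NamedTuple):
    id: str
    name: str
    message: str


def parse_with_regex(xml_text: str) -> Result:
    """Hypothetical regex implementation: one pattern per field."""
    def field(tag: str) -> str:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", xml_text, re.DOTALL)
        if match is None:
            raise ValueError(f"missing <{tag}>")
        return match.group(1)
    return Result(field("id"), field("name"), field("message"))


def parse_with_dom(xml_text: str) -> Result:
    """Hypothetical rewrite: let a real XML parser do the work."""
    root = ET.fromstring(xml_text)
    values = [root.findtext(tag) for tag in Result._fields]
    if None in values:
        raise ValueError("missing field")
    return Result(*values)


# The cases know about the interface (string in, record out) and nothing else.
CONTRACT_CASES = [
    ("plain",
     "<user><id>evalua</id><name>Eva Lu Ator</name>"
     "<message>hi</message></user>",
     Result("evalua", "Eva Lu Ator", "hi")),
    ("multiline",
     "<user><id>a</id><name>A</name><message>x\ny</message></user>",
     Result("a", "A", "x\ny")),
    ("entities",
     "<user><id>b</id><name>B</name><message>1 &lt; 2</message></user>",
     Result("b", "B", "1 < 2")),
]


def failed_cases(parse: Callable[[str], Result]) -> List[str]:
    """Run every contract case through `parse`; return the names that fail."""
    failures = []
    for name, xml_text, expected in CONTRACT_CASES:
        try:
            ok = parse(xml_text) == expected
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


print(failed_cases(parse_with_regex))  # ['entities']
print(failed_cases(parse_with_dom))    # []
```

The same cases expose the entity bug in the regex version and pass on the rewrite, and none of them had to be touched when the implementation changed. Tests written against the inner field helper would have been deleted along with it.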

If you’re writing software to go in a robotic spacecraft, you’re not exactly going to maintain and enhance that software after the launch because, to put it bluntly, you’d probably screw up and cause the $500m tin can to fly into the sun. In that kind of code everything can be unit tested, because nothing is going to be changed anyway, and it’s going to be worth it to gain a 0.1% reduction in the likelihood of catastrophic failure. But in your regular software product, which evolves all the time along with the requirements, you’ll be wise to follow the golden rule and write tests only against those interfaces which hopefully don’t change all that often.

Oh, and please don’t ever parse XML using a heap of hairy regex code.


{ 8 comments }

gregK May 12, 2009 at 18:01

> Most importantly, unit tests make the code rigid. It makes little sense to write heaps of testing code against interfaces which will be changed soon.

He brings up an important point. If you are afraid to fix your code because you don’t want to rewrite your unit tests, then something went terribly wrong. What he is describing is making your new code legacy before it is even released, because there are many dependencies from test cases. I say don’t be afraid to scrap obsolete test cases. I also agree: don’t overdo it if you are still figuring out what you want. Get your ideas straight first, then focus on robustness via massive tests.

This comment was originally posted on Reddit

vlaurika May 12, 2009 at 18:13

> What he is describing is making your new code legacy before it is even released because there are many dependencies from test cases.

The point is that you should aim to get your test coverage by testing on a level which is the least likely to lead to a lot of obsolete test code after a potential future refactoring. That kind of test code is *helpful* when refactoring, instead of merely obsolete. Testing each function in perfect isolation, as per the Wikipedia definition of unit testing, does not help you when you refactor the code.

This comment was originally posted on Reddit

Justice May 12, 2009 at 20:15

It’s a long way of saying: write unit tests for public interfaces and contractual behaviors; don’t write unit tests for private implementation details or for simple getters and setters.

Harry should have written a few unit tests for the module’s public interface (the single function ‘parse’), testing various scenarios ranging from simple to complex. He should not have written unit tests for the particular way he implemented the interface. When he added in behavior to handle newlines, he should have written some unit tests to check that the newlines bug does not reappear later as a regression. When Ethan ripped out the regexes and replaced them with a parser, he should have written some tests specifying that the module can handle character entities too, thus protecting against a future reappearance of the character-entities bug as a regression.

Of course, while developing a module, one may want to include some test-driver code to check basic parts of the implementation. But these test-drivers are not part of the unit-test suite. They should not be run automatically. They are there only to ease development, and may be deleted at will (because you are using source control, right?).

Unit tests are a way of answering, in as precise a way as you are willing to write code for, the question: “What is this public interface supposed to be doing, and does it actually do that?” Unit tests should not answer the question: “How does this implementation work under the hood?”

Ville Laurikari May 12, 2009 at 20:30

Justice, I completely agree. What prompted me to write this post was the common misunderstanding that a “unit test” is supposed to test the smallest possible unit (a function or method) in isolation. If it’s testing more than that, it’s not a “unit test” but something else.

In my world, a “unit test” is almost always testing the public interface. This appears to also be the viewpoint of the software industry in general, but every now and then I run into trouble with this terminology.

So, before one goes very deep into an argument about “unit testing” with someone, it may be useful to check if you’re even talking about the same things…

klaar May 12, 2009 at 23:31

Does he mean that I should optimize my test coverage to only cover the most important components at an early stage? Very clever.

This comment was originally posted on Reddit

Brian Lavender May 13, 2009 at 11:15

What about preconditions, postconditions, and invariants like in Eiffel? It seems that design by contract can build a lot of confidence into the code. I have only briefly played with it, but assuming your code neither depends on nor produces side effects, you can prove it correct. I don’t have a lot of experience with it, but I thought I would throw that out there.

brian

Ville Laurikari May 13, 2009 at 20:56

I certainly use asserts and other similar tools to check invariants at run time, but I’ve never followed the DbC dogma to the point of writing the contracts before writing the code; this may be just because I haven’t used Eiffel. But I have written unit tests before writing the implementation(s), which is kind of the same thing but with a different twist.

As for proving correctness… I’m not a believer, at least not for widespread use in “everyday programming” by the masses. Then again, I also believe that the masses should not be writing computer programs in the first place.

Brian Lavender May 14, 2009 at 12:38

I have to apologize. I was being sort of esoteric. When you mentioned SML, it made me think of my programming principles class at Sac State, where we used Bertrand Meyer’s book “Introduction to the Theory of Programming Languages”, which in turn made me think of his language Eiffel and Design by Contract. Honestly, I haven’t used DbC much in practice, so I can’t speak from real experience. I know that DbC is now in Java, .NET, and a number of other areas.

Certainly, you make a good point with unit testing.
