Techniques for stubbing out dependencies when unit testing Java code (part 1/2)

This is the first of a two-part series talking about how to stub out dependencies on other services when writing unit tests for Java code.

Setup. Suppose you’re a developer working on a large, old piece of software. Like most real-world software that has been around for a while, your application is buggy; it’s had lots of developers working on it; the code is not structured well or consistently; and the code is not covered adequately by unit tests, or perhaps it has no unit tests at all. Despite the code’s problems, there is not a good business case for spending several years and millions of dollars to do a big bang overhaul/rewrite. (In my experience, management almost never approves rewrites. And, that’s probably a good thing. See Joel Spolsky’s famous article on rewriting software.)

Suppose you need to add new functionality to the system. Your new code will rely on several services provided by the legacy system. For illustrative purposes, suppose your code will make use of an existing service called ThingyManager. This service contains methods for creating, modifying, deleting, and querying Thingy objects. Because the existing code is poorly written, both ThingyManager and Thingy have deep tentacles stretching through the rest of the code.

Being a conscientious developer, you want to write exhaustive unit tests for your new code. But how?

Ideal Situation. In an ideal world, ThingyManager and Thingy would be interfaces such as the following:

import java.util.List;

public interface ThingyManager {
    Thingy getThingyByName(String name) throws Exception;
    void modifyThingy(Thingy thingy) throws Exception;
}

public interface Thingy {
    String getName();
    List<Thingy> getDependentThingies();
}

The actual services would be implemented by classes ThingyManagerImpl and ThingyImpl, which of course implement ThingyManager and Thingy, respectively.

Your new code should be written against the interfaces rather than the concrete classes. You’ll also need to use some sort of dependency injection/inversion of control to provide your code with concrete instantiations of the interfaces. (See this article for information on dependency injection.) For example,

import java.util.ArrayList;
import java.util.List;

public class YourNewFunctionality {
    private ThingyManager thingyManager;

    // The ThingyManager implementation is injected, so tests can pass in a stub.
    public YourNewFunctionality(ThingyManager thingyManager) {
        this.thingyManager = thingyManager;
    }

    public List<String> getDependentThingyNames(String thingyName) throws Exception {
        List<String> result = new ArrayList<String>();
        Thingy thingy = thingyManager.getThingyByName(thingyName);
        if (thingy != null) {
            for (Thingy depThingy : thingy.getDependentThingies()) {
                result.add(depThingy.getName());
            }
        }
        return result;
    }
}

Of course, your new code should be written against an interface as well, but we will skip that here for brevity.

Your code now has no idea how ThingyManager and Thingy are actually implemented. So, you can simply write DummyThingyManager and DummyThingy classes that implement the interfaces, and you can use these dummy classes in your tests. That is, you can test your new code in isolation from the rest of the system. This isolation provides two benefits:

  1. Improved test performance. Since ThingyManagerImpl has deep tentacles throughout the code, using it in your tests will likely require starting the entire system. That’s very time consuming, especially since you probably want to stop the system, reset it to an empty state, and restart it for each test. If you don’t, you’ll build up a web of dependencies among your tests where one test depends on the results of another; that’s a recipe for disaster. DummyThingyManager, by contrast, has no tentacles into the code at all. By using it, you don’t have to start the system; all your tests have to create are YourNewFunctionality and DummyThingyManager. This could literally mean the difference between tests that run in a fraction of a second and tests that run in a minute or more. Multiply that by the 20 tests you plan to write for your new code, and you’ve got a huge time savings.
  2. Simplification of test setup. Starting up an entire system is difficult. It involves database configuration, setting up configuration files, setting up users, and so on. The setup is likely to be so complex that you’ll need to create Ant files or script files to perform it for you, which means you can no longer launch your tests from within your IDE. For example, Eclipse allows you to right-click on a single test method within a class and launch it in the debugger; you lose that if you have to launch your tests from a script. Starting up just your new functionality and the DummyThingyManager, on the other hand, is easy and lets you take advantage of all of your IDE’s capabilities.
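To make the idea concrete, here is a minimal sketch of what the dummy classes and a test-style driver might look like. The interfaces and YourNewFunctionality are repeated from above so the sketch compiles on its own. The register() helper, the in-memory map, and the DummyThingyDemo driver are my own assumptions for illustration; a real project would use its test framework instead of a main method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver standing in for a unit test.
class DummyThingyDemo {
    public static void main(String[] args) throws Exception {
        DummyThingy child = new DummyThingy("child");
        DummyThingy parent = new DummyThingy("parent", child);
        DummyThingyManager manager = new DummyThingyManager();
        manager.register(parent);

        YourNewFunctionality code = new YourNewFunctionality(manager);
        System.out.println(code.getDependentThingyNames("parent"));   // prints [child]
        System.out.println(code.getDependentThingyNames("missing"));  // prints []
    }
}

// Interfaces and class under test, repeated from the article.
interface ThingyManager {
    Thingy getThingyByName(String name) throws Exception;
    void modifyThingy(Thingy thingy) throws Exception;
}

interface Thingy {
    String getName();
    List<Thingy> getDependentThingies();
}

class YourNewFunctionality {
    private final ThingyManager thingyManager;

    YourNewFunctionality(ThingyManager thingyManager) {
        this.thingyManager = thingyManager;
    }

    List<String> getDependentThingyNames(String thingyName) throws Exception {
        List<String> result = new ArrayList<String>();
        Thingy thingy = thingyManager.getThingyByName(thingyName);
        if (thingy != null) {
            for (Thingy dep : thingy.getDependentThingies()) {
                result.add(dep.getName());
            }
        }
        return result;
    }
}

// In-memory stub: no database, no configuration files, no system startup.
class DummyThingy implements Thingy {
    private final String name;
    private final List<Thingy> dependents;

    DummyThingy(String name, Thingy... dependents) {
        this.name = name;
        this.dependents = Arrays.asList(dependents);
    }

    public String getName() { return name; }
    public List<Thingy> getDependentThingies() { return dependents; }
}

class DummyThingyManager implements ThingyManager {
    private final Map<String, Thingy> thingies = new HashMap<String, Thingy>();

    // Test-only helper (an assumption; not part of the ThingyManager interface).
    void register(Thingy thingy) { thingies.put(thingy.getName(), thingy); }

    public Thingy getThingyByName(String name) { return thingies.get(name); }
    public void modifyThingy(Thingy thingy) { thingies.put(thingy.getName(), thingy); }
}
```

Nothing here touches the legacy system, so a test built this way can be launched straight from the IDE.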

Despite these advantages, there is one drawback to coding against interfaces and stubbing out those interfaces for testing:

Programming against interfaces is tedious. Every time you add a new method to ThingyManagerImpl, you must also add the method to ThingyManager and DummyThingyManager. In fact, as more and more test suites are built up, you may find yourself with lots of stubbed out implementations of ThingyManager. All of those must be updated with each new method. Furthermore, suppose ThingyManager contains 100 methods (remember, we said the base system was poorly written), but your new code may only need to use 5 of the methods. You still have to stub out all 100 methods in your DummyThingyManager. I’ll discuss methods for dealing with these problems in part 2 of this series.

What if ThingyManager and Thingy are concrete classes? Remembering that our legacy code is poorly written, what if ThingyManager and Thingy are concrete classes rather than interfaces? That is, what if there are no existing interfaces to write your code against? Are you forced to bring up the whole system to test your new functionality?

Luckily, there’s a simple refactoring you can perform to get things on track. First, rename ThingyManager and Thingy to ThingyManagerImpl and ThingyImpl. Good IDEs make renaming classes easy. The manager class now looks like this:

public class ThingyManagerImpl {
    public ThingyImpl getThingyByName(String name) throws Exception {
        // ... existing legacy implementation ...
    }

    public void modifyThingy(ThingyImpl thingy) throws Exception {
        // ... existing legacy implementation ...
    }
}

Next, build a Thingy interface that contains all of ThingyImpl’s public methods. Wherever a method takes a ThingyImpl as a parameter or returns a ThingyImpl, change it to take or return a Thingy.

Then, build a ThingyManager interface that contains all of ThingyManagerImpl’s public methods. Again, change every ThingyImpl in the interface to Thingy.

Note that ThingyManagerImpl and ThingyImpl cannot yet implement the interfaces. For example, the method ThingyManagerImpl.modifyThingy(ThingyImpl) is not compatible with the interface method ThingyManager.modifyThingy(Thingy). It’s tempting to modify ThingyManagerImpl as follows to make it match the interface:

public class ThingyManagerImpl implements ThingyManager {
    public Thingy getThingyByName(String name) throws Exception {
        // ... existing legacy implementation ...
    }

    public void modifyThingy(Thingy thingy) throws Exception {
        // ... existing legacy implementation ...
    }
}

While that’s technically correct, it will cause you big headaches with your legacy code. The problem is with the return value of getThingyByName(). All of your legacy code (which is too risky to fix right now) uses ThingyManagerImpl directly rather than the interface. And, all of this code expects getThingyByName() to return a ThingyImpl, not a Thingy. That is, if you make the change above, you may have a million lines of legacy code that won’t compile anymore. Here’s a better approach:

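public class ThingyManagerImpl implements ThingyManager {
    // Return type stays concrete so legacy callers still compile.
    public ThingyImpl getThingyByName(String name) throws Exception {
        // ... existing legacy implementation ...
    }

    // Parameter type is widened to the interface.
    public void modifyThingy(Thingy thingy) throws Exception {
        // ... existing legacy implementation ...
    }
}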

Now, the concrete getThingyByName() returns a ThingyImpl while the interface version returns a Thingy. Java (version 5 and later) allows this narrower, covariant return type since a ThingyImpl is a Thingy. All of your legacy code will still compile even though it thinks getThingyByName() returns a ThingyImpl. And, when your legacy code passes a ThingyImpl into modifyThingy(), that works too since a ThingyImpl is a Thingy.

So, the rule here for concrete classes is 1) let the methods keep returning concrete types (even though the interface has them returning interfaces), and 2) modify method parameters to use interfaces rather than concrete types.
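Here is a small, self-contained sketch of the two rules in action. The class bodies are placeholders I made up (the real legacy bodies would stay as they are), and legacyOnlyMethod() is a hypothetical method standing in for whatever the legacy code calls on the concrete class.

```java
// Hypothetical driver showing that both legacy-style and new-style callers compile.
class CovariantReturnDemo {
    public static void main(String[] args) throws Exception {
        ThingyManagerImpl impl = new ThingyManagerImpl();

        // Legacy-style caller: still works with the concrete types directly.
        ThingyImpl concrete = impl.getThingyByName("widget");
        impl.modifyThingy(concrete);  // a ThingyImpl is accepted where a Thingy is expected
        System.out.println(concrete.legacyOnlyMethod());  // prints legacy:widget

        // New-style caller: sees only the interfaces.
        ThingyManager manager = impl;
        Thingy viaInterface = manager.getThingyByName("widget");
        System.out.println(viaInterface.getName());       // prints widget
    }
}

interface Thingy {
    String getName();
}

interface ThingyManager {
    Thingy getThingyByName(String name) throws Exception;
    void modifyThingy(Thingy thingy) throws Exception;
}

class ThingyImpl implements Thingy {
    private final String name;

    ThingyImpl(String name) { this.name = name; }

    public String getName() { return name; }

    // Hypothetical method that exists only on the concrete class.
    String legacyOnlyMethod() { return "legacy:" + name; }
}

class ThingyManagerImpl implements ThingyManager {
    // Rule 1: keep returning the concrete type (a covariant return).
    public ThingyImpl getThingyByName(String name) { return new ThingyImpl(name); }

    // Rule 2: accept the interface type as the parameter.
    public void modifyThingy(Thingy thingy) { /* placeholder body */ }
}
```

One thing to watch for: if the legacy body of modifyThingy() calls methods that exist only on ThingyImpl, it will need a cast (or those methods will need to move into the Thingy interface).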

Once you’ve done this refactoring, which is pretty simple and should take no more than an hour or two, you can easily write your new code strictly against the interfaces, and then you can use the testing techniques described in the previous section.

What if static methods are used? A good rule of thumb in programming is to avoid static methods at all costs. Let something like Spring hold all the statics and make all of your methods normal instance methods. But, recognizing that our legacy code isn’t the greatest code in the world, suppose it manages Thingy objects like this:

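import java.util.List;

public class Thingy {
    public String getName() {
        // ... existing legacy implementation ...
    }

    public List<Thingy> getDependentThingies() {
        // ... existing legacy implementation ...
    }

    public static Thingy getThingyByName(String name) throws Exception {
        // ... existing legacy implementation ...
    }

    public static void modifyThingy(Thingy thingy) throws Exception {
        // ... existing legacy implementation ...
    }
}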

It’s now time for you to start cussing. An interface can’t declare static methods for implementing classes to supply, and you can’t override static methods in subclasses. You’re pretty much stuck: either you do a really expensive, risky refactoring, or you accept that when your code calls Thingy.getThingyByName(), it forces you to start up the entire system in each of your tests.
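If you want to convince yourself that subclassing doesn’t help, here is a tiny sketch. All of the names are made up for illustration, and the methods return a String rather than a Thingy just to keep it short. A static method redeclared in a subclass only hides the original; any caller that names the legacy class still gets the legacy behavior.

```java
// Hypothetical driver.
class StaticHidingDemo {
    public static void main(String[] args) {
        System.out.println(NewCode.describe("widget"));            // prints real:widget
        System.out.println(StubThingy.getThingyByName("widget"));  // prints stub:widget
    }
}

// Stand-in for the legacy class with a static lookup method.
class LegacyThingy {
    static String getThingyByName(String name) { return "real:" + name; }
}

// An attempted "stub": this static method hides the one above, it does not override it.
class StubThingy extends LegacyThingy {
    static String getThingyByName(String name) { return "stub:" + name; }
}

// New code that calls the legacy static method by class name.
class NewCode {
    static String describe(String name) {
        // Bound to LegacyThingy at compile time; a test has no way to swap in StubThingy.
        return LegacyThingy.getThingyByName(name);
    }
}
```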

Part 2 of this series will discuss a radical approach to solving this problem.

What if you’re forced to subclass legacy code? Suppose your new code needs to plug in to the legacy system, which requires you to write an implementation of the legacy interface Processor. Suppose there is a legacy abstract class ProcessorBase that implements all of the methods in Processor and then adds some new abstract methods. The idea is that you plug in to the system by subclassing ProcessorBase and then implementing the extra abstract methods. ProcessorBase adds so much functionality that simply writing a new class that just implements Processor is not practical. You really, really need to subclass ProcessorBase.

Of course, this is a poor architecture for the system and an abuse of inheritance (see this article on the evils of inheritance). But, it’s what you’re stuck with.

From a testing standpoint, the problem is that if you make YourNewFunctionalityProcessor extend ProcessorBase, there’s nothing you can do to stub out ProcessorBase in your tests. Your tests will need to start the entire system.

What you really want is to use composition rather than inheritance. That is, you would ideally like ProcessorBase to be a member variable of YourNewFunctionalityProcessor. Then, you could use dependency injection to specify whether the member variable is implemented by ProcessorBase (for production) or your own DummyProcessorBase (for testing). Unfortunately, because ProcessorBase is abstract, you can’t instantiate it and make it a member variable.

Furthermore, overhauling the whole Processor/ProcessorBase plug-in mechanism to work well with composition and modifying all the existing plug-ins to use the new mechanism is expensive, time consuming, and high risk. You could leave all the existing code alone (and deprecated) and build a new composition-based plug-in mechanism. But, then you’d have two sets of plug-in mechanisms to maintain (the inheritance-based mechanism and the composition-based mechanism).

Luckily, there’s a trick you can do. First, write a class YourNewFunctionalityProcessorAdapter that subclasses ProcessorBase. This class should hold a reference to an instance of YourNewFunctionalityProcessor. It should implement the abstract methods from ProcessorBase and nothing else. Each of these methods should simply delegate to an equivalent method on YourNewFunctionalityProcessor. Here’s what this adapter class might look like:

public class YourNewFunctionalityProcessorAdapter extends ProcessorBase {
    private YourNewFunctionalityProcessor processor;

    public YourNewFunctionalityProcessorAdapter(YourNewFunctionalityProcessor processor) {
        this.processor = processor;
    }

    public void someMethodRequiredByProcessorBase() {
        processor.someMethodRequiredByProcessorBase();
    }
}

Then, class YourNewFunctionalityProcessor can just implement the Processor interface. It does not need to extend ProcessorBase. Instead, it contains a member variable of type Processor whose value is specified with dependency injection. In production, the member variable is an instance of YourNewFunctionalityProcessorAdapter. For unit tests, the member variable is a dummy stub. So, YourNewFunctionalityProcessor looks like this:

public class YourNewFunctionalityProcessor implements Processor {
    private Processor processorAdapter;

    public YourNewFunctionalityProcessor(Processor processorAdapter) {
        this.processorAdapter = processorAdapter;
    }

    public void someMethodRequiredByProcessor() {
        // Must delegate all interface methods to the contained processor.
        processorAdapter.someMethodRequiredByProcessor();
    }

    public void someMethodRequiredByProcessorBase() {
        // Called indirectly by processorAdapter when method
        // someMethodRequiredByProcessor() is called.
        //
        // All your real work that needs to be tested goes here.
    }
}

This trick is a little ugly, a little confusing, and a little tedious. But it allows you to stub out the functionality of ProcessorBase for easy testing.
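Here is a rough, self-contained sketch of how the pieces might be wired together in production and in a test. Note that as written above, the adapter and the processor each take the other in their constructor, so one of them has to be set after construction. In this sketch I assume a hypothetical setProcessorAdapter() setter for that. The ProcessorBase shown is a trivial stand-in for the legacy class, and DummyProcessor, the log field, and the driver are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver showing production-style and test-style wiring.
class AdapterWiringDemo {
    public static void main(String[] args) {
        // Production-style wiring: create the processor, then hand it the real adapter.
        YourNewFunctionalityProcessor production = new YourNewFunctionalityProcessor();
        production.setProcessorAdapter(new YourNewFunctionalityProcessorAdapter(production));
        production.someMethodRequiredByProcessor();
        System.out.println(production.log);  // prints [real work]

        // Test-style wiring: ProcessorBase is never instantiated.
        YourNewFunctionalityProcessor underTest = new YourNewFunctionalityProcessor();
        underTest.setProcessorAdapter(new DummyProcessor());
        underTest.someMethodRequiredByProcessorBase();
        System.out.println(underTest.log);   // prints [real work]
    }
}

interface Processor {
    void someMethodRequiredByProcessor();
}

// Stand-in for the legacy base class; the real one would drag in the rest of the system.
abstract class ProcessorBase implements Processor {
    public void someMethodRequiredByProcessor() {
        // Assumed behavior: the base class calls back into the subclass hook.
        someMethodRequiredByProcessorBase();
    }

    public abstract void someMethodRequiredByProcessorBase();
}

class YourNewFunctionalityProcessorAdapter extends ProcessorBase {
    private final YourNewFunctionalityProcessor processor;

    YourNewFunctionalityProcessorAdapter(YourNewFunctionalityProcessor processor) {
        this.processor = processor;
    }

    public void someMethodRequiredByProcessorBase() {
        processor.someMethodRequiredByProcessorBase();
    }
}

class YourNewFunctionalityProcessor implements Processor {
    final List<String> log = new ArrayList<String>();
    private Processor processorAdapter;

    // Hypothetical setter that breaks the constructor cycle between processor and adapter.
    void setProcessorAdapter(Processor processorAdapter) {
        this.processorAdapter = processorAdapter;
    }

    public void someMethodRequiredByProcessor() {
        processorAdapter.someMethodRequiredByProcessor();
    }

    public void someMethodRequiredByProcessorBase() {
        log.add("real work");  // the logic you actually want to test
    }
}

// Test stub: records calls instead of doing any legacy work.
class DummyProcessor implements Processor {
    int calls = 0;

    public void someMethodRequiredByProcessor() { calls++; }
}
```

With the dummy in place, a test can exercise someMethodRequiredByProcessorBase() directly and check that someMethodRequiredByProcessor() delegates, all without touching the legacy base class.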

Conclusion. In this article, I’ve shown some of the benefits of stubbing out dependencies when unit testing. I’ve also shown some refactoring tricks that allow you to stub out the dependencies. Unfortunately, some of the refactorings are kind of ugly; others are expensive and risky. And, you typically end up with some tedious maintenance since all additional future methods will need to be added to your services, their interfaces, and all the stubbed out implementations of those interfaces, even if the code using those stubbed out implementations doesn’t need the new methods.

In the second article in this series, I’ll discuss an alternate, slightly radical approach that allows you to stub out all dependencies in new code without interfaces and without refactoring any legacy code, no matter how poorly it is written. That is, this alternate approach provides all the benefits discussed in this article while avoiding all of the problems.
