Archive for September, 2009

Clean Code: an example …

Sunday, September 27th, 2009

A friend asked me to review some of the code in his codebase and discuss what if anything I would change to make the code cleaner.

For me the term “Clean” code means all those things outlined in the book “Clean Code” by Robert C. Martin, especially principles like the Single Responsibility Principle, the East / Tell Don’t Ask Principle and the practice of keeping things DRY (Don’t Repeat Yourself) and others. One thing I think is important but not made that clear in various texts is the benefit of keeping the “what” of Business intent clear and uncluttered from the “how” of the implementation. What follows is an example and how it was refactored to be more clean.

My friend has a class that looks through a database of Users to notify them when their Book subscriptions are about to expire. Not all the Users want to be notified when this happens. The typical implementation that I saw was a single class that gets the users, iterates through them and dispatches an email to each User that wants to be notified.

A1 public void notifyExpiredUsers() {
A2   Date date = new Date();
A3   List<User> users = getUsers();
A4   for (User user : users) {
A5     if (date > user.getExpiryDate() && user.getEmailNotification()) {
A6       sendEmailTo(user.getName(), user.getEmailAddress());
A7     }
A8   }
A9 }

We then discussed what I would like to change and why …

The first thing I would like to do is clean this method up to express the intent more clearly, with each method having a single responsibility.

B1 public void notifyUsersWhoseBookSubscriptionsHaveExpired() {
B2   List<User> bookSubscribers = getBookSubscribers();
B3   for (User bookSubscriber : bookSubscribers) {
B4     notifyUserWhoseBookSubscriptionHasExpired(bookSubscriber)
B5   }
B6 }

C1 public void notifyUserWhoseBookSubscriptionHasExpired(User bookSubscriber) {
C5   if (hasBookSubscriptionExpired(bookSubscriber)) {
C6     sendSubscriptionExpiryNotificationTo(bookSubscriberNotificationDetails(bookSubscriber));
C7   }
C9 }

* note that the method in #C6 bookSubscriberNotificationDetails() returns a new object BookSubscriberNotificationDetails.

D1 public boolean hasBookSubscriptionExpired(User bookSubscriber) {
D2   Date today = new Date();
D2   return (date > bookSubscriber.getSubscriptionExpiryDate() && bookSubscriber.getEmailNotification());
D3 }

To me the code is looking a lot better already as we have:

1. Used a name that represents the business WHAT, a “bookSubscriber” rather than the HOW, a “User”.
2. Removed several responsibilities of the first “notifyExpiredUsers()” method into separate methods with single responsibilities.
3. Replaced a combination of conditions in the if() statement (A5) with a call to a method that describes the Business intent (C5) rather than the mechanism used to find out how.
4. Introduced a new class BookSubscriberNotificationDetails in C6 to capture the binding between the notification process and where in the model the information comes from.
5. Replaced the call to send an Email in C6 with a call to a method that describes the intent rather than the how.

There is still a problem with C5 which it has carried on from A5 and that is the combination of conditions that make the code less clear. The combination of a book subscription expiring and the subscriber wanting to be notified should be exposed through the naming of the method. So lets change that.

E1 public void notifyUserWhoseBookSubscriptionHasExpired(User bookSubscriber) {
E5   if (shouldNotifyThatBookSubscriptionHasExpired(bookSubscriber)) {
E6     sendSubscriptionExpiryNotificationTo(bookSubscriberNotificationDetails(bookSubscriber));
E7   }
E9 }

F1 public boolean shouldNotifyThatBookSubscriptionHasExpired(User bookSubscriber) {
F2   return (hasBookSubscriptionExpired(bookSubscriber) && bookSubscriberWantsNotification(bookSubscriber));
F3 }
F4
F5 public boolean bookSubscriberWantsNotification(User bookSubscriber) {
F6   return bookSubscriber.getEmailNotification();
F7 }

While this is an improvement because the business intent of the code is more clear, there are still some changes we can make to the code to further to minimize change impact by hiding data and moving behavior close to the owner of the associated data.  This is in line with East Principles. The first change to go East will be with D2 where the bookSubscribers subscription expiry date is exposed to the caller (travels West).  Also exposed is the email notification value, so lets change them.

G1 public boolean hasBookSubscriptionExpired(User bookSubscriber) {
G2   Date today = new Date();
G3   return bookSubscriber.isSubscriptionExpiryDateBefore(today);
G4 }
G5
G6 public boolean bookSubscriberWantsNotification(User bookSubscriber) {
G7   return bookSubscriber.isEmailNotificationRequested();
G8 }

The code in G3 and G7 changed slightly to not expose values from the bookSubscriber and the remifications of this are subtle but far reaching. They are:

1. We have captured in one place the Business fact that we care about subscription expiry dates that are before other dates. At this stage we don’t have an isSubscriptionExpiryDateAfter(date), so we can tell the Business does not yet do things after an expiry. Looking through the codebase for usages of “<” or “>” would much harder.
2. We have turned a simple get of a boolean value into a descriptive Business reason for its existance.

A lot of the above modification may just appear superficial with splitting methods up or using different names and to some degree that is true but what should be more clear is the Business WHAT and not the implementation HOW. Keeping the Business WHAT clear and uncluttered from the implementation HOW is part of what Clean code is to me. We should be able to modify the implementation HOW quite significantly without effecting the Business WHAT.

Personally I would go further and make a BookSubscribers class that represents the behviour that is related to a collection of BookSubscribers. I would do it because the Business talks about Book Subscribers so this concept should make it into the domain model, and I would do it because the operations on groups of Subscribers belongs in one place and not spread around between helper classes or the methods of other unrelated classes. However, this is another post waiting to happen soon.

So is the result cleaner and more clear, please let me know what you think.

switch() blade …

Saturday, September 26th, 2009

switchblade

 As some of you may be aware I am currently writing a Compiler for Smalltalk that will output JVM bytecode. The compiler is in Java and as part of that compiler I have a large switch() statement which is very long (600+ lines) and in my opinion ugly and not OO.  When I mentioned to a friend how the switch statement offended my sensibilities they replied “but it’s part of the language, so why not use it.” and it got me thinking about when to use it, when not to use it and more importantly how to write code that achieves the same result as a switch() but in a clean, readable, DRY way.

Two things I would like to make clear are that 1) the switch statement has a place but 2) it is not OO and in most every case it can and probably should be avoided.

When you need logic executed based on some facet/attribute of an object then the switch and the alternative approaches outlined can be applied.

To work though the pro’s and con’s and to show alternative approaches to a switch() we need to have an example to work through, so imagine we have to code a machine that prepares fruit. This machine takes a hopper of different fruits and depending on what fruit it is, slices, peels, juices it or a combination of these and puts the result in an output hopper. We are also working in Java which is a key point since other languages make it easier to avoid the switch() statement because they don’t have type erasure which happens when putting classes in a collection of a common base type. ie: when you take them out of the collection they have lost their type so you cant call overloaded methods with them.

Simplified, the core function of the machine might typically look like this with a switch statement:

handleFruit(List<Fruit> inputFruitHopper, List<Fruit> outputHopper) {
  this.outputHopper = outputHopper;
  for (Fruit fruit : inputFruitHopper)
    handleFruit(fruit);
}

handleFruit(Fruit fruit) {
  switch (fruit.type()) {
    case APPLE: ...
    case BANANA: ...
    case PEAR: ...
    case GRAPE: ...
    case ORANGE: ..
    default:
      throw whatFruitIsThisException();
    }
}

To me the problems with this approach are:

1) The handler has to have knowledge about some part of the Fruit in order to dispatch to the appropriate handling. This binds the dispatcher to the fruit which may seem like no big deal but:
1.1) it limits the ability to have a design that is more dynamic and that doesn’t have to change when new types of Fruit are added. ie: to support Kiwifruit you have to change the handleFruit(Fruit) method.
1.2) it locks in the mechanism for defining types of fruit (fruit.type()). We could use something else like the Class or a name as the switcher but the net effect is the same. 1.3) it exposes internals of fruit which voids information hiding, a principle of OO.
1.4) the tests for this method will be very large especially if they are to cover each case and the default case.
1.5) the possibility of reuse of this handler is very low, therefore you may have to write and test other dispatchers again and again. ie: if you had to weigh each fruit in a hopper would you want to code the same switch statement again?
1.6) switch() statements have a habbit of growing very large.
1.7) possibly the worst thing about this approach is that the logic for handling each type of Fruit is here within the handler who’s single responsibility (SRP) should be just the dispatching.

To me the benefits of this approach are:

1.8) It is said to be performant but I have not yet tested the performance of this approach over the others that can be used.
1.9) Im stuck on thinking of any other benefit.

There are alternatives to using a switch() which is what this post is all about and time permitting I’ll explore each of them at least briefly.

2) Use a Command Dispatch Pattern

The command pattern approach is where a hashtable of “command” objects is kept keyed on the Class of the expected objects. With this approach the handleFruit(Fruit) method could look like this:

handleFruit(Fruit fruit) {
  FruitCommand fruitCommand = getHandlerFor(fruit.getClass());
  if (fruitCommand == null)
    throw whatFruitIsThisException();
  fruitCommand.executeWith(fruit, outputHopper);
}

To me the benefits of this approach are:

2.1 The handler is short and easy to understand an test.
2.2 The handler doesn’t have to change to handle new types of Fruit.
2.3 The handler has a single responsibility, that of dispatching the Fruit to the appropriate command.
2.4 The logic to be applied to each fruit (slicing, peeling, juicing) is not part of the handler. This means new logic can be added without changing the handler.
2.5 While we use the Class of fruit to determine the command to be applied, we don’t expose the internals of the Fruit. There is a subtle difference between using a mechanism of the language and what we define as state/properties of a model. I see the former as less evil.
2.6 The testing of dispatcher and of each command is separated and therefore simpler.
2.7 This dispatcher could be generified with Java generics to be reused in other parts of the code.

To me the problems with this approach are:

2.8 The type of Fruit is still unknown to the command which will be required to ‘cast’ the Fruit to a subtype. ie: FruitCommand.executeWith(Fruit fruit, List<Fruit> outputHopper)
2.9 The switch statement did the dispatching and the registration of what logic (command) applies to what Fruit. In the command pattern approach this needs to be somewhere else, typically in an initialization mechanism, be it code or an IOC container.
2.10 It appears it would be less performant than a switch but I have not tested this.
2.11 As mentioned in 2.5 we use the Class of the object to lookup the command requiring each different Fruit to be a subtype. Personally I think they should be anyway - this is OO.

3) Use a Double Dispatch

This approach is where the object that is used to determine the logic to be applied is asked to call back on the dispatcher. This is how it might look:

handleFruit(Fruit fruit) {
  fruit.callback(this);
}

To me the benefits of this approach are:

3.1) The handler is very simple.
3.2) The callback can be to a method on the dispatcher or a subclass of it that accepts the specific Fruit subtype. ie:  the handler would have methods like: handle(Apple apple) So the logic gets the expected type without a cast.
3.3) It appears it would be faster than the Command Dispatch Pattern, but I have not tested this.
3.4) Code wise it is ‘neater’ then a large switch().

To me the problems with this approach are:

3.5 This is a very similar approach to the switch statement and would require changing the handler to
cater for new Fruit types.
3.6 The other negatives with the switch apply here as well, except for some of the testing concerns since each handler method can be tested independently of the rest of the dispatch logic.
3.7 There is now a direct binding between the Fruit and the dispatcher so it can make a callback. This can be made more subtle in a wrapper or subclass.

Conclusion

I have presented two alternatives to using a switch() statement and there are probably others. The key point I’m trying to make is that just because a language has a feature doesnt make it right to use it without consideration. What makes an approach “right” is a balance between readability, lowering duplication in code and effort, cost to change and lastly performance.

For my Smalltalk parser I still have a switch statement right now and after I have done some performance testing on the alternatives above and possibly others I will make a choice between the approaches, most likely favoring readability and simplicity. Which of the approaches is readable and simple to me may vary greatly to you which, makes for interesting conversations, but please don’t use a switch as a no thought default.