October 12, 2008

Short Circuit Statements

As promised, it's finally time to move away from theory into a more specific topic about coding. My goal is to make most posts in the future more like this one, now that the most important concepts have been laid out. I'll refer back to them often, so if you haven't read the prior posts, you'll probably want to do that before going further.

What is a short-circuit statement? In this case, I'm not talking about the language feature related to boolean comparisons, but instead I'm talking about statements which cause a method to return as soon as some conclusion has definitely been reached. Here's an example from a very simple compareTo method in Java:

public int compareTo(Foobar that) {
    if (that == null) return -1;
    if (this._value < that._value) return -1;
    if (this._value > that._value) return +1;
    return 0;
}


In this example, each line (except the last one) qualifies as a short circuit statement; that is, each one returns as soon as a definite answer is determined. If we weren't using short circuit statements, then the code might look like this:

public int compareTo(Foobar that) {
    int result = 0;
    if (that == null) {
        result = -1;
    } else {
        if (this._value != that._value) {
            if (this._value < that._value) {
                result = -1;
            } else {
                result = +1;
            }
        }
    }
    return result;
}


For something this simple, there isn't a huge difference in the complexity between the two functions, but it still demonstrates the point. Many people ardently recommend always having a single return statement in any function, and would strongly advocate using the second example over the first. However, I would argue that the first is superior because it better respects the Unit Economy of the reader.

Short circuit statements support Unit Economy because they allow a reader to take certain facts for granted for the remainder of a method. In the first example, after reading the first line of the method, the reader knows that they will never have to worry about the "that" variable being null for the rest of the method. In the second example, the reader has to carry the context of whether they are still reading code within the first if statement. Every short circuit statement removes one item from the set of things one must consider while reading the remainder of the method.

Naturally, this example is pretty simplistic, and it's a stretch to claim that either method is more complicated than the other. However, consider if this weren't a simple example. If this were a formula to compute an amortization table for a home mortgage, then the first few lines may look like this:

public AmortizationSchedule computeSchedule(
        int principal, float rate, int term) {
    if (principal <= 0) throw new IllegalArgumentException(
        "Principal must be greater than zero");
    if (rate < 0.0) throw new IllegalArgumentException(
        "Rate cannot be less than zero");
    if (term <= 0) throw new IllegalArgumentException(
        "Term must be greater than zero");

    // Here are the 20 or so lines to compute the schedule...
}


In this case, there may be a substantial amount of code following this brief preamble, and none of it has to consider what may happen if these preconditions are violated. This greatly simplifies the act of writing the code, the logic of the code itself, and the process of maintaining the code later on.
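
To make the idea a bit more concrete, here is a minimal sketch of the same pattern on a smaller calculation: the fixed monthly payment for a loan. The class name MortgageMath, the method monthlyPayment, and its parameters are all made up for illustration, and I am assuming the rate is given as an annual fraction (0.06 for 6%) and the term in months. It is not the body of computeSchedule, just one way the guards could play out.

```java
// Hypothetical helper, for illustration only.
final class MortgageMath {
    private MortgageMath() {}

    // Fixed monthly payment for a fully amortizing loan.
    // Assumes annualRate is a fraction (0.06 means 6%) and termMonths is in months.
    static double monthlyPayment(double principal, double annualRate, int termMonths) {
        // Guards: after these three lines, the rest of the method can assume valid input.
        if (principal <= 0) throw new IllegalArgumentException(
            "Principal must be greater than zero");
        if (annualRate < 0.0) throw new IllegalArgumentException(
            "Rate cannot be less than zero");
        if (termMonths <= 0) throw new IllegalArgumentException(
            "Term must be greater than zero");

        // Short circuit: with no interest, the payment is an even split,
        // and the formula below would otherwise divide by zero.
        if (annualRate == 0.0) return principal / termMonths;

        // Standard annuity formula: P * r * (1 + r)^n / ((1 + r)^n - 1)
        double monthlyRate = annualRate / 12.0;
        double growth = Math.pow(1.0 + monthlyRate, termMonths);
        return principal * monthlyRate * growth / (growth - 1.0);
    }
}
```

Calling MortgageMath.monthlyPayment(200000.0, 0.06, 360) gives roughly 1199.10, while passing a principal of zero fails immediately with an IllegalArgumentException. Notice that by the time the reader reaches the formula, four possible worries (bad principal, bad rate, bad term, zero rate) have already been removed from consideration.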

Thoughts? Comments? I'd love to hear them.

October 05, 2008

Coupling & Cohesion, Part III

Last time, I described how Cohesion applies to everything from writing a single line of code all the way up to designing a remote service. Now, let's consider the same thing for Coupling.

Recall that Coupling is the mental load required to understand how a particular component relates to another component. If we take a line of code as a single component, then what defines how it is connected to the lines around it? For a start: the local variables it uses, methods it calls, conditional statements it is part of, the method it is contained in, and exceptions it catches or throws. The more of these things a single line of code involves, the more coupled it is to the rest of the system.

As an example, consider a line of code which uses a few local variables to call a method and store the result. This could be more or less coupled depending upon a number of factors. How many local variables are needed? Are any of the variables static or global variables? Is the method call private to the class, a public method on another class, or a static method defined somewhere? Is the result being stored in a local variable, an instance variable, or a static/global variable? Depending upon the answers to these questions, that one line may be more or less coupled to the other lines around it.

The implication of having Coupling which is too tight for a single line of code is that you have to understand a lot of other lines in order to understand that one. If it uses global variables, then you have to also understand what other code modifies the state of those variables. If it uses many local variables, then you have to understand the code which sets their values. If it calls a method on another object, then you have to understand what impact that method call will have. All of these things increase the amount of information you need to keep in mind to understand that line of code.
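
Here is a rough sketch of the difference at the level of a single line. Everything in it (the PriceCalculator class, the taxRate static, the method names) is hypothetical and exists only to contrast the two styles.

```java
// Hypothetical class, for illustration only.
final class PriceCalculator {
    // Mutable static state: any code anywhere may change this.
    static double taxRate = 0.08;

    // Instance state that the coupled version writes to as a side effect.
    private double lastTotal;

    // Tightly coupled: to understand the one line in this method you must know
    // who sets taxRate and who later reads lastTotal.
    void computeTotalCoupled(double subtotal) {
        lastTotal = subtotal * (1.0 + taxRate);
    }

    double lastTotal() {
        return lastTotal;
    }

    // Loosely coupled: everything the line needs arrives as a parameter,
    // and the only effect is the returned value.
    static double computeTotal(double subtotal, double rate) {
        return subtotal * (1.0 + rate);
    }
}
```

Both methods do the same arithmetic, but the result of computeTotalCoupled depends on whatever taxRate happens to hold at the moment of the call, while computeTotal can be understood (and tested) entirely on its own.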

Now, consider what Coupling would mean for a remote service which is part of a large distributed system (e.g. Amazon.com). The connections such a service has are defined by the API it offers, the other services it consumes, and how their APIs are defined. For the service's own API, consider the following: Does the API respond to many service calls or just a few? Do the service calls require a lot of structured data to be passed in? How easy is it for a caller to obtain all the necessary information? How much is the service's internal implementation attached to the API it presents? How common is the communication protocol clients must implement? For the other services it consumes, consider: How many other services does it use? How are their APIs defined (considering the questions above)? Just as with a single line of code, the answers to these questions will define how tightly coupled a service is to the rest of the system around it.

Having Coupling which is too tight for a remote service carries troubles, too. Changes to downstream systems may force the service to need an update. Any change to the API may require upstream services to change as well. It may be impossible to change the service's implementation if it is too tightly coupled to its own API. Finally, it may be difficult to break the service into separate services as it grows in scope. It can be a costly and painful mistake down the road to allow too much coupling between services in a distributed environment.
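
As a small-scale sketch of keeping a service's implementation detached from its API, consider the following. The names StockLookup, InMemoryStockLookup, and OrderChecker are invented for this example, and an in-process interface is only standing in for what would really be a remote call.

```java
import java.util.HashMap;
import java.util.Map;

// The narrow API callers depend on: one call, one simple argument.
interface StockLookup {
    int unitsInStock(String sku);
}

// One possible implementation. Callers never see the map, so it could be
// replaced by a database or a remote call without touching them.
final class InMemoryStockLookup implements StockLookup {
    private final Map<String, Integer> units = new HashMap<>();

    void setUnits(String sku, int count) {
        units.put(sku, count);
    }

    @Override
    public int unitsInStock(String sku) {
        // Unknown SKUs are treated as out of stock.
        return units.getOrDefault(sku, 0);
    }
}

// A consumer that is coupled only to the interface, not to any implementation.
final class OrderChecker {
    private final StockLookup stock;

    OrderChecker(StockLookup stock) {
        this.stock = stock;
    }

    boolean canFulfill(String sku, int quantity) {
        if (quantity <= 0) return false;
        return stock.unitsInStock(sku) >= quantity;
    }
}
```

Because OrderChecker only knows about StockLookup, swapping InMemoryStockLookup for something else requires no change to the consumer. The same reasoning scales up: the fewer calls an API exposes and the less structured data each one demands, the less an upstream service has to know, and the freer the implementation is to change.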

Okay... enough theory! Next time, on to a more specific subject: Short Circuit statements.