79

I'm reading McConell's Code Complete, and he discusses using boolean variables to document your code. For example, instead of:

if((elementIndex < 0) || (MAX_ELEMENTS < elementIndex) || 
   (elementIndex == lastElementIndex)){
       ...
}

He suggests:

finished = ((elementIndex < 0) || (MAX_ELEMENTS < elementIndex));
repeatedEntry = (elementIndex == lastElementIndex);
if(finished || repeatedEntry){
   ...
}

This strikes me as logical, good practice, and very self-documenting. However, I'm hesitant to start using this technique regularly as I've almost never come across it; and perhaps it would be confusing just by virtue of being rare. However, my experience is not very vast yet, so I'm interested in hearing programmers' opinion of this technique, and I'd be curious to know if anyone uses this technique regularly or has seen it often when reading code. Is this a worthwhile convention/style/technique to adopt? Will other programmers understand and appreciate it, or consider it strange?

Paul R
  • 208,748
  • 37
  • 389
  • 560
froadie
  • 79,995
  • 75
  • 166
  • 235
  • @Paul R - whoops, thanks. I started with self-documentation and realized there was no tag for that, so tried to change it to self-documenting which already existed. :) – froadie Mar 19 '10 at 14:52
  • Isn't just commenting on what's happening in the test better? – J S Mar 19 '10 at 15:59
  • 3
    When appropriate, I try to do this also, especially if I have to check for the condition(s) in different parts of the method. It's much more descriptive too. – geffchang Mar 19 '10 at 16:08
  • "This strikes me as logical, good practice, and very self-documenting." Just based on your experience, chances are it will be as easily understood by other programmers who come across it for the first time. –  Mar 20 '10 at 10:02
  • Some automatic refactoring tools allow you to, with one command, switch between the two forms. For instance, you'd refactor "elementIndex == lastElementIndex" in the original code with "extract a local variable". Or in the second code, you'd select repeatedEntry in the if statement and refactor it to be inlined. – JeffH Mar 22 '10 at 20:24
  • 2
    Why is this question still open? – orangepips Feb 01 '11 at 21:42

14 Answers14

55

Splitting an expression that's too nested and complicated into simpler sub-expressions assigned to local variables, then put together again, is quite a common and popular technique -- quite independently of whether the sub-expressions and/or the overall expression are boolean or of just about any other type. With well-chosen names, a tasteful decomposition of this kind can increase readability, and a good compiler should have no trouble generating code that's equivalent to the original, complicated expression.

Some languages that don't have the concept of "assignment" per se, such as Haskell, even introduce specialized constructs to let you use the "give a name to a subexpression" technique (the where clause in Haskell) -- seems to bespeak some popularity for the technique in question!-)

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
16

I have used it, though normally wrapping boolean logic into a reusable method (if called from multiple locations).

It helps readability and when the logic changes, it only needs changing in one place.

Other will understand it and will not find it strange (except for those who only ever write thousand line functions, that is).

Oded
  • 489,969
  • 99
  • 883
  • 1,009
  • Thanks for the answer! Regarding the reusable methods, that has another (unrelated) valid reason to be factored out... So I suppose my question really is if you should factor out one-time boolean expressions, when there's no reason other than readability. (Which of course is a big enough reason on its own :) ) Thanks for pointing that out. – froadie Mar 19 '10 at 15:03
  • +1 for reducing the code changes necessary when the logic changes, and making fun of thousand-line-function programmers. – Jeffrey L Whitledge Mar 19 '10 at 17:46
9

I try to do it wherever possible. Sure, you are using an "extra line" of code, but at the same time, you are describing why you are making a comparison of two values.

In your example, I look at the code, and ask myself "okay why is the person seeing the value is less than 0?" In the second you are clearly telling me that some processes have finished when this occurs. No guessing in the second one what your intent was.

The big one for me is when I see a method like: DoSomeMethod(true); Why is it automatically set to true? It's much more readable like

bool deleteOnCompletion = true;

DoSomeMethod(deleteOnCompletion);
froadie
  • 79,995
  • 75
  • 166
  • 235
kemiller2002
  • 113,795
  • 27
  • 197
  • 251
  • 7
    I dislike boolean parameters for precisely this reason. You end up getting calls like "createOrder(true, false, true, true, false)" and what does that mean? I prefer to use enum's, so you say something like "createOrder(Source.MAIL_ORDER, BackOrder.NO, CustomOrder.CUSTOM, PayType.CREDIT)". – Jay Mar 19 '10 at 17:31
  • But if you follow Kevin's example, it's equivalent to yours. What difference does it make whether the variable can take on 2 values or more than 2? – Mark Ruzon Mar 19 '10 at 19:15
  • 2
    With Jay's you can have the advantage of definitely being clearer in certain instances. For example, in use of the PayType. If it was a boolean the parameter would probable be named, isPayTypeCredit. You don't know what the altertnative is. With an enum you can clearly see what options the PayType is: Credit, Check, Cash and select the correct one. – kemiller2002 Mar 19 '10 at 19:47
  • ++ also an enumeration doesn't allow null assignment so value control is quite literally complete, and the auto-documentation of the enum is gravy on top – Hardryv Mar 24 '10 at 17:06
  • Objective-C and Smalltalk really solve this problem, in Objective-C: `[Object createOrderWithSource:YES backOrder:NO custom:YES type:kCreditCard];` – Grant Paul Mar 28 '10 at 19:37
5

The provided sample:

finished = ((elementIndex < 0) || (MAX_ELEMENTS < elementIndex));
repeatedEntry = (elementIndex == lastElementIndex);
if(finished || repeatedEntry){
   ...
}

Can also be rewritten to use methods, which improve readability and preserve the boolean logic (as Konrad pointed out):

if (IsFinished(elementIndex) || IsRepeatedEntry(elementIndex, lastElementIndex)){
   ...
}

...

private bool IsFinished(int elementIndex) {
    return ((elementIndex < 0) || (MAX_ELEMENTS < elementIndex));
}

private bool IsRepeatedEntry(int elementIndex, int lastElementIndex) {
    return (elementIndex == lastElementIndex);
}

It comes at a price, of course, which is two extra methods. If you do this a lot, it may make your code more readable, but your classes less transparent. But then again, you could also move the extra methods into helper classes.

Prutswonder
  • 9,894
  • 3
  • 27
  • 39
  • If you got into alot of code noise with C# you can also take advantage of partial classes and move the noise to the partial and if people are interested in what IsFinished is checking it's easy to jump to. – Chris Marisic Mar 19 '10 at 17:45
3

The only way I could see this going wrong is if the boolean fragment doesn't have a name that makes sense and a name is picked anyhow.

//No clue what the parts might mean.
if(price>0 && (customer.IsAlive || IsDay(Thursday)))

=>

first_condition = price>0
second_condition =customer.IsAlive || IsDay(Thursday)

//I'm still not enlightened.
if(first_condition && second_condition)

I point this out because it is commonplace to make rules like "comment all your code", "use named booleans for all if-criteria with more than 3 parts" only to get comments that are semantically empty of the following sort

i++; //increment i by adding 1 to i's previous value
MatthewMartin
  • 32,326
  • 33
  • 105
  • 164
  • 2
    You've mis-grouped the conditions in your example, haven't you? '&&' binds tighter than '||' in most languages that use them (shell scripts are an exception). – Jonathan Leffler Mar 20 '10 at 06:11
  • Parens added. So this would be another strike against splitting to named variables, it introduces an opportunity to change the grouping in an non-obvious fashion. – MatthewMartin Mar 23 '10 at 13:46
3

Remember that this way you compute more than necessary. Due to taking conditions out from the code, you always compute both of them (no short circuit).

So that:

if((elementIndex < 0) || (MAX_ELEMENTS < elementIndex) || 
   (elementIndex == lastElementIndex)){
   ...
}

After the transformation becomes:

if((elementIndex < 0) || (MAX_ELEMENTS < elementIndex) |
   (elementIndex == lastElementIndex)){
   ...
}

Not an issue in most cases, but still in some it may mean worse performance or other issues, e.g. when in the 2nd expression you assume the 1st one has failed.

Konrad Garus
  • 53,145
  • 43
  • 157
  • 230
  • True - I was just noticing this now as I went back to some of my code to refactor a couple of if statements using this method. There was an if statement taking advantage of short circuit evaluation with an && (the second part throws an NPE if the first part is false), and my code was failing when I refactored this (because it was *always* evaluating both and storing them in boolean variables). Good point, thanks! But I was wondering as I tried this - is there any way to store the *logic* in a variable and have it delay evaluation until the actual if statement? – froadie Mar 19 '10 at 15:24
  • 2
    I actually doubt that the compiler will resolve that. If the second call is not effective, it's usually due to call of some function, and AFAIK no compiler is trying to determine if a called function is without side-effects. – J S Mar 19 '10 at 15:58
  • You can nest the IFs, and don't do the later calculations unless the first test is insufficient to decide whether to proceed. – Jay Mar 19 '10 at 17:34
  • 1
    @froadie Some languages like Kotlin (and soon Dart) allow "lazy" variables that will only be calculated when used. Alternatively, putting the logic in a function instead of a variable will have the same effect here. Just, y'know, in case you still want to know 10 years later. – hacker1024 Dec 31 '20 at 11:31
2

By doing this

finished = ((elementIndex < 0) || (MAX_ELEMENTS < elementIndex));
repeatedEntry = (elementIndex == lastElementIndex);
if(finished || repeatedEntry){
   ...
}

you remove the logic from your brain and put it in the code. Now the program knows what you meant.
Whenever you name something you give it physical representation. It exists.
You can manipulate and reuse it.

You can even define the whole block as a predicate:

bool ElementBlahBlah? (elementIndex, lastElementIndex);

and do more stuff (later) in that function.

Nick Dandoulakis
  • 42,588
  • 16
  • 104
  • 136
  • And more importantly, the next dev to look at the code will know what you meant too! This is great practice, and I do it all the time. – Chris Thornton Mar 19 '10 at 18:54
  • 1
    Also, that next dev may often be yourself, looking at the code again after a few months (or even a few weeks) and being glad it was well self-documented. – Louis Nov 28 '14 at 18:24
2

I think it's better to create functions/methods instead of temporary variables. This way readability is increased also because methods get shorter. Martin Fowler's book Refactoring has good advice for improving code quality. Refactorings related to your particular example are called "Replace Temp with Query" and "Extract Method".

mkj
  • 2,761
  • 5
  • 24
  • 28
  • 2
    Are you saying that by cluttering the class-space with a lot of one-off functions, you're increasing readability? Please explain. – Zano Mar 19 '10 at 16:50
  • It's always a trade off. The readability of the original function will improve. If the original function is short it might not be worth it. – mkj Mar 19 '10 at 16:55
  • Also "cluttering the class-space" is something I think depends on the language used and how you partition your code. – mkj Mar 19 '10 at 17:01
2

Personally, I think that this is a great practice. It's impact on the execution of the code is minimal, but the clarity that it can provide, if used properly, is invaluable when it comes to maintaining the code later.

GrowlingDog
  • 136
  • 7
1

If the expression is complex then I either move it to another function which returns a bool eg, isAnEveningInThePubAGoodIdea(dayOfWeek, sizeOfWorkLoad, amountOfSpareCash) or reconsider the code so that such a complex expression isn't required.

Benedict Cohen
  • 11,912
  • 7
  • 55
  • 67
1

if the method requires a notification of success: (examples in c#) I like to use the

bool success = false;

to start out. the code's a falure until I change it to:

success = true;

then at the end:

return success;
Chris Hayes
  • 3,876
  • 7
  • 42
  • 72
0

I think, it depends on what style you / your team prefer. "Introduce variable" refactoring could be useful, but sometimes not :)

And I should disagree with Kevin in his previous post. His example, I suppose, usable in case, when introduced variable can be changed, but introducing it only for one static boolean is useless, because we have parameter name in a method declaration, so why duplicate it in code?

for example:

void DoSomeMethod(boolean needDelete) { ... }

// useful
boolean deleteOnCompletion = true;
if ( someCondition ) {
    deleteOnCompletion = false;
}
DoSomeMethod(deleteOnCompletion);

// useless
boolean shouldNotDelete = false;
DoSomeMethod(shouldNotDelete);
dchekmarev
  • 459
  • 3
  • 5
0

In my experience, I've often returned to some old scripts and wondered 'what the hell was I thinking back then?'. For example:

Math.p = function Math_p(a) {
    var r = 1, b = [], m = Math;
    a = m.js.copy(arguments);
    while (a.length) {
        b = b.concat(a.shift());
    }
    while (b.length) {
        r *= b.shift();
    }
    return r;
};

which isn't as intuitive as:

/**
 * An extension to the Math object that accepts Arrays or Numbers
 * as an argument and returns the product of all numbers.
 * @param(Array) a A Number or an Array of numbers.
 * @return(Number) Returns the product of all numbers.
 */
Math.product = function Math_product(a) {
    var product = 1, numbers = [];
    a = argumentsToArray(arguments);
    while (a.length) {
        numbers = numbers.concat(a.shift());
    }
    while (numbers.length) {
        product *= numbers.shift();
    }
    return product;
};
rolandog
  • 159
  • 4
  • 9
  • -1. This has a little to do with the original question, to be honest. The question is about a different, very specific thing, not about how to write good code in general. – P Shved Mar 19 '10 at 18:45
0

I rarely create separate variables. What I do when the tests get complicated is nest the IFs and add comments. Like

boolean processElement=false;
if (elementIndex < 0) // Do we have a valid element?
{
  processElement=true;
}
else if (elementIndex==lastElementIndex) // Is it the one we want?
{
  processElement=true;
}
if (processElement)
...

The admitted flaw to this technique is that the next programmer who comes along may change the logic but not bother to update the comments. I guess that's a general problem, but I've had plenty of times I've seen a comment that says "validate customer id" and the next line is examining the part number or some such and I'm left to wonder where the customer id comes in.

Jay
  • 26,876
  • 10
  • 61
  • 112