break vs goto: Which is more appropriate in the following bool function in C?

Question

I wrote the following function which is meant to check if there is any non digit char in a string. The function must stop right after finding any non digit char and exit the loop and return true. Else, it will return false. Here's the code I am using: (ctype.h library is included).

bool isnotdigit(string argv)
{
    bool y = false;
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            y = true;
            break;
        }
    }
    return y;
}

It can also be done this way:

bool isnotdigit(string argv)
{
    bool y = false;
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            y = true;
            goto next;
        }
    }
next:
    return y;
}

If I am not wrong, both codes work the same way. Right? Then, what is the pros and cons of both? Especially in case of the bool function above.

`break` is a strictly constrained `goto`, and those constraints are usually good. In this case they do exactly the same thing, but in other contexts the cavalier use of `goto` can produce cryptic code prone to subtle and devastating bugs. — Beta, Feb 01 '21 at 14:59
superior: just don't have a `bool y`. you should just `return true` instead of `break` or `goto`, and `return false` at the end if no iteration already returned true. the `bool y` serves no purpose here. if you did that so you could only have a single point of exit from the function, it's time to realise that doesn't actually matter (unless you can profile it and prove otherwise in some specific situation). — underscore_d, Feb 01 '21 at 15:01
@underscore_d *you should just `return true`* Yes, you should. Unfortunately, there are many environments that impose an "only one return point" rule in their code standards. IMO that's utterly asinine and I've never found any empirical evidence that such a standard does anything but force the creation of convoluted code that force-fits simple-to-understand "square" algorithms like this into round holes, but they do exist. — Andrew Henle, Feb 01 '21 at 15:11
@AndrewHenle loud round of sincere applause:) Stupid rules:( — Martin James, Feb 01 '21 at 16:18
Hmm, and no one - including me - caught that in what's a "coding standard" question the use of the [CS50-addle-brained](https://www.dictionary.com/browse/addlebrained) `typedef char *string` abomination. Lose that construct - hiding a pointer behind a `typedef` is almost always a really bad idea. When you're trying to figure out what some code does, you ***need*** to know what the variables actually ***are***. — Andrew Henle, Feb 01 '21 at 17:58

Steve Summit · Accepted Answer · 2021-02-01T15:34:34.107

This is a programming style question, and as such, you are probably not going to get a definitive answer.

Some say that break is just a goto in disguise, so that one is as bad as the other. Some people say you should never use break.

But the whole point of break is that it's a guaranteed-to-be-unconfusing goto. It always goes from the inside of a loop, to the outside of a loop. (Well, except when it's part of a switch statement, of course.)

In this case, the goto buys you little or nothing. So I don't think anyone would say that your second example is much better than your first. (When there are nested loops, things can get more complicated. More on that in a bit.)

And, finally, yet another way of writing your function is like this:

bool isnotdigit(string argv)
{
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            return false;
        }
    }
    return true;
}

But some people say that this is bad style, because (they say) a function should have exactly one return statement. Other people, however, say that this is good style, because it gets rid of the extra boolean variable y, and getting rid of extra variables (especially extra little Boolean variables that just keep track of things) is a good rule, too.

Here are my opinions:

goto can be confusing, and you almost never need it. I almost never use goto.
break is fine. It's almost never confusing. I use it all the time. So I would very much prefer your first fragment over your second.
Extra boolean variables, that just keep track of little things, can be confusing, especially when the only reason they're there is to get around some other "rule".
I have no problem with multiple return statements, so I would always use something more like the third alternative, as I've presented it in this answer.

See also this older question.

Now, in fairness, the argument against multiple return statements can be a good one. If you've got cleanup to do before you exit, or if you ever have to add some cleanup code in the future, it can be very easy to forget to add it in both places (or in three or more places) if there are multiple return statements. For me, when the function is small and simple (as it is in this case), I think the cleanliness (and the loss of the extra Boolean variable) outweighs the risks. But if your metastyle is that you don't like judgment calls, if you like rigid rules that you can apply everywhere to avoid the superficial risks no matter what, then the "no multiple return statements" rule makes sense.

Finally, there's the question of nested loops. The only justification I can imagine for the goto next; usage in your second example is that if a later programmer comes along and adds a nested loop, the code with break would probably not work any more, and would have to be reworked somehow. By using goto, the rationalization might go, the code is more robust against that possibility. (Personally, I don't think that's a good rationalization for the goto, but as I say, it's the only one I can think of.)

Once you have nested loops, the pros and cons (that is, goto versus break versus multiple return statements) definitely shift around, as discussed in the answers to that other question.

P.S. Some languages "solve" the break-out-of-nested-loops problem by having something like break(2) that can break out of multiple loops at once. C does not have this, and the reason is that it was felt to be too potentially confusing, and too likely to, um, break if a later maintenance programmer added or removed a level of nesting.

Or in other words, although single-level-break is a guaranteed-to-be-unconfusing goto, multi-level-break is potentially just as confusing as goto. And, of course, you can argue with my characterizations here: after all, who says that break is guaranteed to be unconfusing? It's not strictly guaranteed, of course, but that's because any language feature can be confusing if you use it badly. (Case in point: extra little boolean variables that just keep track of various things.)

*If you've got cleanup to do before you exit, or if you ever have to add some cleanup code in the future, it can be very easy to forget to add it in both places* That's just an argument for not duplicating code. It's not an actual argument against multiple return points. — Andrew Henle, Feb 01 '21 at 15:38

score 2 · Answer 2 · answered Feb 01 '21 at 14:59

2

They are functionally equivalent, but I would prefer to only use goto to break out of a nested loop, not a single loop.

goto is often frowned upon because it can lead to "spaghetti code", but it has its place in C for:

Breaking cleanly out of a nested loop

Resource deallocation, as so:

  char *buffer1 = NULL, *buffer2 = NULL, *buffer3 = NULL;

  buffer1 = malloc(1000);
  if(NULL == buffer1)
  {
      goto cleanup;
  }

  buffer2 = malloc(1000);
  if(NULL == buffer2)
  {
      goto cleanup;
  }

  buffer3 = malloc(1000);
  if(NULL == buffer3)
  {
      goto cleanup;
  }

  use(buffer1, buffer2, buffer3);

cleanup:

  if(buffer1 != NULL)
  {
      free(buffer1);
      buffer1 = NULL;
  }

  if(buffer2 != NULL)
  {
      free(buffer2);
      buffer2 = NULL;
  }

  if(buffer3 != NULL)
  {
      free(buffer3);
      buffer3 = NULL;
  }

answered Feb 01 '21 at 14:59

Govind Parmar

20,656
7
53
85

1

` if(buffer1 != NULL) { free(buffer1); buffer1 = NULL; }` is not any different from `free(buffer1); buffer1 = NULL;` as `free(NULL);` is a well-defined no-op. – Andrew Henle Feb 01 '21 at 15:14
@AndrewHenle The `NULL` check skips the overhead of calling the `free` function at all if it isn't needed – Govind Parmar Feb 01 '21 at 15:15
1

Until you've profiled your code and found that's an actual performance bottleneck, that's a needless and premature optimization. And for all you know (and you don't because you didn't profile the code...), the extra `if` check might wind up being slower. – Andrew Henle Feb 01 '21 at 15:18
@AndrewHenle `malloc`/`free` was just a simple, familiar, example of resource management for my answer; there are other resource management functions where `ReleaseResource(UnallocatedResource)` is *not* benign. For example, I sometimes use the ODBC functions `SQLAllocHandle` and `SQLFreeHandle` with the same code pattern, and `SQLFreeHandle(x, SQL_NULL_HANDLE);` *will* cause an error, so in that case the check is needed. Good to include the check since sometimes you will actually need it with this pattern. – Govind Parmar Feb 01 '21 at 15:35

score 2 · Answer 3 · answered Feb 01 '21 at 15:04

2

Never use goto if there are alternatives.

But in this function you need neither, just return when the result is clear:

bool isnotdigit(string argv)
{
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            return true;
        }
    }
    return false;
}

answered Feb 01 '21 at 15:04

koder

2,038
7
10

Functions with multiple exit points can lead to defects when taken over by maintainers. Someone unfamiliar with the code may make a mistake when trying to fix a defect because they overlook one or more of the exit paths. – Feb 01 '21 at 15:08
@SakibArifin Why would the _program_ end? The _function_ is all that ends, by returning the value it was told to `return`. – underscore_d Feb 01 '21 at 15:13
2

@DavidCullen If new maintainers would be confused by multiple exit points, I think that just means the function is too large independent of how it returns... or perhaps the new maintainers just aren't ready for the job they've been given. – underscore_d Feb 01 '21 at 15:14
4

@DavidCullen Cramming this algorithm into a one-size-fits-all "one return point" rule does nothing but lead to more complex and therefore more-bug-prone code. – Andrew Henle Feb 01 '21 at 15:17
@underscore_d Sorry, by program I meant the function. – Sakib Arifin Feb 01 '21 at 15:19
@AndrewHenle I agree, but as I wrote in my answer, "If your metastyle is that you don't like judgment calls, if you like rigid rules that you can apply everywhere to avoid the superficial risks no matter what, then the 'no multiple return statements' rule makes sense." If you ban multiple `return` statements, you will never have a bug due to forgetting a cleanup action on one of them. If you ban hammers, no one will ever hit their thumb with one. – Steve Summit Feb 01 '21 at 15:32
@SteveSummit A "don't duplicate code" standard covers the case of having to run cleanup code. A "single return point" rules is more akin to "don't turn screws to the left when driving them". What if you then have to drive a left-hand screw? – Andrew Henle Feb 01 '21 at 15:37
1

@AndrewHenle Don't worry, I'm not disagreeing with you. I'm just explaining where the one-size-fits-all rules (that I detest, also) come from. (And your "what if" is easily dispensed with: in most of the world, there simply are no left-hand screws.) – Steve Summit Feb 01 '21 at 15:41
@SteveSummit Fair enough. FWIW, I've worked in environments with that "one return point" rule, and it really does lead to convoluted code in many cases. Which is why I think it's probably one of the most useless rules that I've seen in code standards. – Andrew Henle Feb 01 '21 at 15:42
1

@underscore_d Even in small functions, someone who is not the original author can overlook a return. Also, I am not saying that having a single exit point should be a rule. I am just pointing out a potential pitfall. And now this comment section has become irrelevant to the actual answer. – Feb 01 '21 at 18:03

score 0 · Answer 4 · answered Feb 01 '21 at 17:35

At least in a simple case as in your example I suggest to avoid both goto and break and add the condition to the for statement instead. (This may be a matter of personal preference or coding guidelines that might apply to your project.)

bool isnotdigit(string argv)
{
    bool y = false;
    for (int i = 0, n = strlen(argv); (i < n) && !y; i++)
    {
        char c = argv[i];

        /* or alternatively: y = ! isdigit(c); */
        if (! isdigit(c))
        {
            y = true;
        }
    }

    return y;
}

BTW: I suggest to rename the function argument, because argv can be confused with the standard arguments of the main function which are often called argc and argv. Instead of y I would suggest a meaningful name, e.g. wrong_char_found

break vs goto: Which is more appropriate in the following bool function in C?

4 Answers4