10

Consider this code (VS2008):

void WordManager::formatWords(std::string const& document)
{
    document_ = document;
    unsigned int currentLineNo = 1;

    size_t oldEndOfLine = 0;
    size_t endOfLine    = document_.find('\n');
    while(endOfLine != std::string::npos)
    {
        std::string line = document_.substr(oldEndOfLine, (endOfLine - oldEndOfLine));
        if(line.size() < 2)
        {
            oldEndOfLine = endOfLine + 1;
            endOfLine    = document_.find('\n', oldEndOfLine);
            continue;
        }

        std::vector<std::string> words = Utility::split(line);
        for(unsigned int i(0); i < words.size(); ++i)
        {
            if(words[i].size() < 2)
                continue;
            Utility::trim(words[i], WordManager::delims);
            Utility::normalize(words[i], WordManager::replace, WordManager::replaceWith);

            if(ruleOne(words[i]) && ruleTwo(words[i]))
            {
                std::set<Word>::iterator sWIter(words_.find(Word(words[i])));

                if(sWIter == words_.end())
                    words_.insert(Word(words[i])).first->addLineNo(currentLineNo);
                else
                    sWIter->addLineNo(currentLineNo);
            }
        }
        ++currentLineNo;

        oldEndOfLine = endOfLine + 1;
        endOfLine    = document_.find('\n', oldEndOfLine);
    }
}

If it's important: this is code from a homework assignment used to filter out and modify words in a document. document holds the document (previously read from file)

I wish to introduce a malicious goto because I think it's actually cleaner in this case like so:

void WordManager::formatWords(std::string const& document)
{
    document_ = document;
    unsigned int currentLineNo = 1;

    size_t oldEndOfLine = 0;
    size_t endOfLine    = document_.find('\n');
    while(endOfLine != std::string::npos)
    {
        std::string line = document_.substr(oldEndOfLine, (endOfLine - oldEndOfLine));
        // HERE!!!!!!
        if(line.size() < 2)
            goto SkipAndRestart;

        std::vector<std::string> words = Utility::split(line);
        for(unsigned int i(0); i < words.size(); ++i)
        {
            if(words[i].size() < 2)
                continue;
            Utility::trim(words[i], WordManager::delims);
            Utility::normalize(words[i], WordManager::replace, WordManager::replaceWith);

            if(ruleOne(words[i]) && ruleTwo(words[i]))
            {
                std::set<Word>::iterator sWIter(words_.find(Word(words[i])));

                if(sWIter == words_.end())
                    words_.insert(Word(words[i])).first->addLineNo(currentLineNo);
                else
                    sWIter->addLineNo(currentLineNo);
            }
        }

SkipAndRestart:
        ++currentLineNo;

        oldEndOfLine = endOfLine + 1;
        endOfLine    = document_.find('\n', oldEndOfLine);
    }
}

Whether or not this is a good design choice is irrelevant at this moment. The compiler complains error C2362: initialization of 'words' is skipped by 'goto SkipAndRestart'

I don't understand this error. Why is it important, and an error, that the words initialization is skipped? That's exactly what I want to happen, I don't want it to do more work, just restart the bloody loop. Doesn't the continue macro do more or less the exact same thing?

IAE
  • 2,213
  • 13
  • 37
  • 71
  • I would have thought this would just be a warning, not an error. What happens if you just use `break` instead of the goto ? – Paul R Jan 23 '11 at 23:28
  • 6
    Most people would probably disagree that the `goto` version is "cleaner"! – Oliver Charlesworth Jan 23 '11 at 23:28
  • @Oli: I know, that's why I said the actual design of the thing is irrelevant; I don't want to start a flame war :P @Paul: Compiles. – IAE Jan 23 '11 at 23:31
  • if you believe `goto` is cleaner, why do you describe it as "malicious"? –  Jan 23 '11 at 23:31
  • @bstn: Because it is malicious. That's why I generally agree with Oli's statement that it usually doesn't make code cleaner. I don't know why I prefer it in this case. Perhaps for the syntactic sugar? Living on the wild side of danger? – IAE Jan 23 '11 at 23:33

4 Answers4

22

You are skipping over the construction of the words array:

    if(line.size() < 2)
        goto SkipAndRestart;
    std::vector<std::string> words = Utility::split(line);
    // ...
SkipAndRestart:

You could have used words after the SkipAndRestart: label, which would have been a problem. You don't use it in your case, but the words variable won't be destructed until the end of the scope in which it is introduced, so as far as the compiler is concerned, the variable is still in use at the point of the label.

You can avoid this by putting words inside its own scope:

    if(line.size() < 2)
        goto SkipAndRestart;
    {
        std::vector<std::string> words = Utility::split(line);
        // ...
    }
SkipAndRestart:

Note that the continue statement jumps to the end of the loop, at a point where you actually can't put a label. This is a point after the destruction of any local variables inside the loop, but before the jump back up to the top of the loop.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • My question was more geared as to why this is an actual problem. – IAE Jan 23 '11 at 23:34
  • 1
    I just added a bit about the time of destruction that will help explain that. – Greg Hewgill Jan 23 '11 at 23:35
  • 1
    If you are going to do that, you might as well invert the condition and remove the `goto` altogether. That will enter the block of code only when `line.size() >= 2`, which is the intended behavior – David Rodríguez - dribeas Jan 23 '11 at 23:47
  • 1
    "as far as the compiler is concerned, the variable is still in use" - it actually is used after the label, in that if the `goto` was allowed, then (you'd expect) the destructor of the vector would be called at scope-end, on memory that was never constructed. Bang. – Steve Jessop Jan 24 '11 at 00:01
2

With that goto, you are skipping the line:

std::vector<std::string> words = Utility::split(line);

This isn't just skipping the re-initilisation, it's skipping the creation. Because that variable is defined inside the loop, it's created freshly each time the loop iterates. If you skip that creation, you can't use it.

If you want it created once, you should move that line outside of the loop.

I'll refrain from my first inclination, which is to tell you that using goto and cleaner in the same sentence is usually wrong - there are situations where goto is better but I'm not sure this is one of them. What I will tell you is that, if this is homework, goto is a bad idea - any educator will frown upon the use of goto.

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • The homework is already turned in. I'm just experimenting here. My teacher DOES frown and accuses me of mucking up good C++ code. – IAE Jan 23 '11 at 23:37
1

As always when someone thinks goto would code more readable, refactoring to use (inline) functions is at least as good and without goto:

// Beware, brain-compiled code ahead!
inline void WordManager::giveThisAMeaningfulName(const std::string& word, std::string__size_type currentLineNo)
{
    Utility::trim(word, WordManager::delims);
    Utility::normalize(word, WordManager::replace, WordManager::replaceWith);
    if(ruleOne(word) && ruleTwo(word))
    {
        std::set<Word>::iterator sWIter(words_.find(Word(word)));
        if(sWIter == words_.end())
            words_.insert(Word(word)).first->addLineNo(currentLineNo);
        else
            sWIter->addLineNo(currentLineNo);
    }
}

void WordManager::formatWords(std::string const& document)
{
    document_ = document;
    std::string__size_type currentLineNo = 1;

    std::string line;
    while(std::getline(line))
        if(line.size() > 1)
        {
          std::vector<std::string> words = Utility::split(line);
          if(word.size() > 1)
              for(unsigned int i(0); i < words.size(); ++i)
                  giveThisAMeaningfulName(words[i], currentLineNo++);
        }
}

I haven't bothered trying to understand what that inner loop does, so I leave it to you to give it a meaningful name. Note that, once you have given it a name, I don't need to understand its algorithm in order to know what it does, because the name says it all.)

Note that I have replace your hand-written line-extraction by std::getline(), which made the code even smaller.

sbi
  • 219,715
  • 46
  • 258
  • 445
  • This is interesting. I personally thought my code is very readable and easy to understand! What you have written is a valid option, but without understanding the semantics of the class, you messed up some parts of the function. That's my fault though, just goes to show that I need to write even more legible code :) – IAE Jan 24 '11 at 13:11
  • 1
    @SoulBeaver: I didn't put much effort into actually understanding this and I knew why I added that disclaimer at the top. `:)` Anyway, what I tried to get across is this: `goto` is an hard-to-understand, unstructured way to jump, while several easy-to-understand, structured alternatives exist: `return`, `break`, `if`... Whenever you're temped to use `goto`, look at breaking up the code in question to be able to use those instead. Using them, I was to condense the code to the point where SO didn't add a vertical scrollbar anymore. And the less code, the easier it is to understand. – sbi Jan 24 '11 at 15:46
  • I'm programming in C++ for almost twenty years now and have _never_ used `goto`. I have yet to find a case were it would make my code clearer than other refactoring options. Namely C++' ability to not to pay for functions (by inlining), allowing me to take almost arbitrary code snippets and slap a name on them (the name of the function I'm putting them in), which is much better than commenting, plus the explicit dependency control this gains (you have to pass required data explicitly to those functions, and if it's too many, it makes you think what you're doing wrong) is a very valuable tool. – sbi Jan 24 '11 at 15:59
1

Another solution would be just to declare 'words' before the 'if' instruction (in case you really can't populate the vector above the 'if') and populate the vector later

// HERE!!!!!!
std::vector<std::string> words;
if(line.size() < 2)
     goto SkipAndRestart;
words = Utility::split(line);
theTAG
  • 21
  • 5