Generalising Boolean Functions for more elegant code

Question

I've always been of the belief that if you're copying and pasting code, then there's a more elegant solution. I'm currently implementing a string suffix trie in C++ and have two practically identical functions which only differ in their return statement.

The first function checks if a substring is present, the second if it is a suffix (all strings have the character # added to the end).

substring(string S)

bool Trie::substring(string S){
    Node* currentNode = &nodes[0];                          //root
    for(unsigned int i = 0; i < S.length(); i++){
        int edgeIndex = currentNode->childLoc(S.at(i));
        if(edgeIndex == -1)
            return false;
        else currentNode = currentNode->getEdge(edgeIndex)->getTo();
    }
    return true;
}

suffix(string S)

bool Trie::suffix(string S){
    Node* currentNode = &nodes[0];                          //root
    for(unsigned int i = 0; i < S.length(); i++){
        int edgeIndex = currentNode->childLoc(S.at(i));
        if(edgeIndex == -1)
            return false;
        else currentNode = currentNode->getEdge(edgeIndex)->getTo();
    }
    return (currentNode->childLoc('#') == -1)? false : true;    //this will be the index of the terminating character (if it exists), or -1.
}

How can the logic be generalised more elegantly? Perhaps using templates?

`return (currentNode->childLoc('#') == -1)? false : true;` seems redundant, you can simply use `return currentNode->childLoc('#') != -1;`. That is unless whatever `childLock` returns overloads `operator!=` in an unexpected way. — François Andrieux, Jan 10 '17 at 18:19
@FrançoisAndrieux Thank you for that :) Is there anyway I can use one general function/template or something to avoid repeating the same code? — Luke Collins, Jan 10 '17 at 18:21
In this case, it looks like you can simply wrap the similar section in another function, and have both functions call the new one. I can't imagine a solution involving templates would be more elegant than defining a new function for the common block. — François Andrieux, Jan 10 '17 at 18:22
@FrançoisAndrieux What if I used a switch statement? default: return first thing, case 2: return second. Would that be more elegant? — Luke Collins, Jan 10 '17 at 18:25
You could use a lambda perhaps, passed in to the function as a parameter. — Mikel F, Jan 10 '17 at 18:27
I assume you intend to pass another parameter to specify the behavior. I think it would be less elegant. One function doing two different things based on a 'behavior selection parameter'. But then elegance is entirely subjective, which is why I think it's unlikely you will get a concrete answer to your question. — François Andrieux, Jan 10 '17 at 18:28

score 2 · Accepted Answer · answered Jan 10 '17 at 18:29

2

What I, personally, do in such circumstances, is to move the common code to the function, which I call from multiple places, where I need, said common functionality. Consider this:

Node* Trie::getCurrentNode (string S){
    Node* currentNode = &nodes[0];  
    for(unsigned int i = 0; i < S.length(); i++){
        int edgeIndex = currentNode->childLoc(S.at(i));
        if(edgeIndex == -1)
            return nullptr;
        else currentNode = currentNode->getEdge(edgeIndex)->getTo();
    }
    return currentNode;
}

And then, use it, in all the cases, where you need it:

bool Trie::substring(string S){
    return getCurrentNode (S) != nullptr;
}

bool Trie::suffix(string S){
    Node* currentNode = getCurrentNode(S);
    return currentNode != nullptr && currentNode->childLoc('#') != -1;
    // I decided to simplify the return statement in a way François Andrieux suggested, in the comments. It makes your code more readable.
}

answered Jan 10 '17 at 18:29

Algirdas Preidžius

1,769
3
14
17

I was just about to do that! But I wasn't sure how to declare null pointers in C++. Thank you. – Luke Collins Jan 10 '17 at 18:31
@LukeCollins In C++, null pointers can be declared as `nullptr` (C++11 keyword) or `NULL` (macro inherited from C). Out of these, `nullptr` is preferred if your compiler supports it, since `NULL` is typically defined as `0`, or sometimes `0L` (or, with an especially nice compiler, it _might_ be `(void*) 0`). (`NULL` can cause problems if passed to an overloaded function; `0` is a better match for integral types than for pointers, after all. `nullptr`, on the other hand, is explicitly a pointer literal, and isn't implicitly convertible to any of the integral types.) – Justin Time - Reinstate Monica Jan 10 '17 at 21:30
Case in point: `NULL` is `int` with [GCC 5.1](http://ideone.com/ZpOqU4) and [MSVC 2015](http://rextester.com/ENVX25077), but `long` with [GCC 6.3 and Clang 3.8](http://coliru.stacked-crooked.com/a/82b9b162d25b7908). – Justin Time - Reinstate Monica Jan 10 '17 at 21:37

Generalising Boolean Functions for more elegant code

1 Answers1