Range based for loops on null terminated strings

Question

I sort of assumed that range-based for loops would support C-style strings:

void print_C_str(const char* str)
{
    for(char c : str)
    {
        cout << c;
    }
}

However, this is not the case. The standard ([stmt.ranged], 6.5.4) says that a range-based for loop works in one of three cases:

The range is an array
The range is a class with callable begin and end member functions
There are begin and end functions reachable by ADL in an associated namespace (plus the std namespace)
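
The three cases listed above can be sketched as follows. This is only an illustration: the namespace `demo`, the type `digits`, and the `sum_*` helpers are hypothetical names, not anything from the question. A `const char*` fits none of the cases, because a pointer has no array bound, no members, and no associated namespace for ADL to search.

```cpp
#include <vector>

namespace demo {                 // hypothetical namespace, for illustration only
    struct digits {              // hypothetical type with no begin()/end() members
        int data[3];
    };
    // Free functions found by ADL, because demo is the namespace of digits.
    inline const int* begin(const digits& d) { return d.data; }
    inline const int* end(const digits& d)   { return d.data + 3; }
}

// Case 1: a built-in array. The bound is part of the type, so no lookup is needed.
inline int sum_array() {
    int values[] = {1, 2, 3};
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Case 2: a class type with begin() and end() member functions.
inline int sum_vector() {
    std::vector<int> values = {1, 2, 3};
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Case 3: free begin()/end() located through argument-dependent lookup.
inline int sum_adl() {
    demo::digits d = {{4, 5, 6}};
    int total = 0;
    for (int v : d) total += v;
    return total;
}
```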

When I add begin and end functions for const char* in the global namespace I still get errors (from both VS12 and GCC 4.7).

Is there a way to get range-based-for loops to work with C style strings?

I tried adding an overload to namespace std and this worked but to my understanding it's illegal to add overloads to namespace std (is this correct?)

@bamboon true but IIRC only for user defined types and this is an overload not a specialization and for a built in type, not a UDT. — Motti, Jan 23 '13 at 10:38
The solution is to not use C strings as strings. std::string is bad enough with its lack of a notion of encoding. — Cubic, Jan 23 '13 at 10:50
@AlexChamberlain I'm playing around with raw UDLs where C strings are in the standard — Motti, Jan 23 '13 at 10:53

R. Martinho Fernandes · Accepted Answer · 2013-01-23T10:56:04.703

23

If you write a trivial iterator for null-terminated strings, you can do this by calling a function on the pointer that returns a special range, instead of treating the pointer itself as the range.

template <typename Char>
struct null_terminated_range_iterator {
public:
    // make an end iterator
    null_terminated_range_iterator() : ptr(nullptr) {}
    // make a non-end iterator (well, unless you pass nullptr ;)
    null_terminated_range_iterator(Char* ptr) : ptr(ptr) {}

    // blah blah trivial iterator stuff that delegates to the ptr

    bool operator==(null_terminated_range_iterator const& that) const {
        // iterators are equal if they point to the same location
        return ptr == that.ptr
            // or if they are both end iterators
            || (is_end() && that.is_end());
    }

private:
    // const, so that it can be called from the const operator== above
    bool is_end() const {
        // end iterators can be created by the default ctor
        return !ptr
            // or by advancing until a null character
            || !*ptr;
    }

    Char* ptr;
};

template <typename Char>
using null_terminated_range = boost::iterator_range<null_terminated_range_iterator<Char>>;
// ... or any other class that aggregates two iterators
// to provide them as begin() and end()

// turn a pointer into a null-terminated range
template <typename Char>
null_terminated_range<Char> null_terminated_string(Char* str) {
    return null_terminated_range<Char>(str, {});
}

And usage looks like this:

for(char c : null_terminated_string(str))
{
    cout << c;
}
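
For readers without Boost, here is a minimal self-contained sketch of the same idea. It assumes that only `operator*`, `operator++` and `operator!=` are needed, which is all a range-based for loop uses, and it leaves out the iterator typedefs a full iterator would carry. The names `null_terminated_iterator`, `simple_range` and `collect` are made up for this example; `simple_range` stands in for `boost::iterator_range`.

```cpp
#include <string>

template <typename Char>
class null_terminated_iterator {
public:
    // Default construction yields an end iterator.
    null_terminated_iterator() : ptr(nullptr) {}
    explicit null_terminated_iterator(Char* p) : ptr(p) {}

    Char& operator*() const { return *ptr; }
    null_terminated_iterator& operator++() { ++ptr; return *this; }

    bool operator==(const null_terminated_iterator& that) const {
        // Equal if both point to the same place, or both are at the end.
        return ptr == that.ptr || (is_end() && that.is_end());
    }
    bool operator!=(const null_terminated_iterator& that) const {
        return !(*this == that);
    }

private:
    // At the end if default-constructed or pointing at the terminator.
    bool is_end() const { return !ptr || !*ptr; }
    Char* ptr;
};

// Hypothetical stand-in for boost::iterator_range: just begin() and end().
template <typename Char>
class simple_range {
public:
    explicit simple_range(Char* str) : first(str) {}
    null_terminated_iterator<Char> begin() const { return first; }
    null_terminated_iterator<Char> end() const { return null_terminated_iterator<Char>(); }
private:
    null_terminated_iterator<Char> first;
};

template <typename Char>
simple_range<Char> null_terminated_string(Char* str) {
    return simple_range<Char>(str);
}

// Usage: copy every character before the terminator into a std::string.
inline std::string collect(const char* str) {
    std::string out;
    for (char c : null_terminated_string(str)) out += c;
    return out;
}
```

With this sketch a null pointer produces an empty range, because a null `ptr` already counts as the end.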

I don't think this loses any expressiveness. Actually, I think this one is clearer.

edited Jan 23 '13 at 10:56

answered Jan 23 '13 at 10:47

R. Martinho Fernandes


+1, it *is* cleaner because it correctly works with both `char*` and `char[]`. The problem with `char[]` is that it works natively but does the wrong thing (it’s treated as any C-array, rather than a zero-terminated string, and consequently is one element too long). – Konrad Rudolph Jan 23 '13 at 11:37
+1 Out of curiosity, is there a reason this solution would be preferred to `for (char c: std::string(str))`? This, to me, seems an obvious solution. As nobody has posted it as an answer I can only guess there is something I am missing. In either the solution you posted or with `std::string`, something additional needs to be constructed to perform the iteration. – hmjd Jan 23 '13 at 13:48
@hmjd Yes, `for (char c: std::string(str))` is an alternative. The advantage of this approach over std::string is that this abstraction has a very low runtime cost: the extra objects that are constructed are extremely cheap, unlike std::string which might involve dynamic allocation and will copy the whole string into its own buffer. Basically, while I wouldn't sneer at seeing std::string used for this (unless profiling proved it to be a performance issue), this provides exactly the functionality desired, iteration, while std::string provides unneeded functionality and that is what costs. – R. Martinho Fernandes Jan 23 '13 at 14:01
@R.MartinhoFernandes, the performance aspect was the only possible reason I could think of. Cheers. – hmjd Jan 23 '13 at 14:03
@R.MartinhoFernandes: that's very pretty, but I'll bet you that the less clever solution of using a proxy class to define begin() and end() where `begin(s)` is just `s` and `end(s)` is `s+strlen(s)` will turn out to be just as fast if not faster. My theory is based on my suspicion that strlen() is highly optimized to not require a byte comparison at every byte, so the overhead of doing it should be roughly equal to the overhead of doing two tests (or maybe three) instead of just one on each loop iteration. Anyway, it's certainly less work since it doesn't require a custom iterator at all. – rici Jan 23 '13 at 20:40
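
Two points from the comments above can be checked with a small sketch: a `char[]` holding a literal is iterated including its terminator, and a temporary `std::string` iterates only the characters before it. The function names here are invented for the illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Iterating the array itself visits every element, terminator included.
inline std::size_t count_array_elements() {
    const char text[] = "abc";   // four elements: 'a', 'b', 'c', '\0'
    std::size_t n = 0;
    for (char c : text) { (void)c; ++n; }
    return n;
}

// strlen stops at the terminator, so it reports one element fewer.
inline std::size_t count_with_strlen() {
    const char text[] = "abc";
    return std::strlen(text);
}

// The std::string alternative: copies the characters, then iterates the copy.
inline std::size_t count_via_std_string(const char* str) {
    std::size_t n = 0;
    for (char c : std::string(str)) { (void)c; ++n; }
    return n;
}
```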

score 4 · Answer 2 · edited May 23 '17 at 11:50

A possible workaround is to wrap the null terminated string in another type. The simplest implementation is as follows (it's less performant than R. Martinho Fernandes's suggestion since it calls strlen but it's also significantly less code).

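#include <cstring> // for std::strlen

class null_terminated_range {
    const char* p;
public:
    null_terminated_range(const char* p) : p(p) {}
    const char * begin() const { return p; }
    // one strlen call per loop: range-based for evaluates end() once
    const char * end() const { return p + std::strlen(p); }
};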

Usage:

for(char c : null_terminated_range(str) )
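
A complete sketch of this wrapper in use, with a loop body filled in. The class is renamed `cstr_range` and the helper `to_upper_ascii` is invented here, both purely for illustration. It assumes a non-null pointer, since `std::strlen` on a null pointer is undefined behavior.

```cpp
#include <cstring>
#include <string>

// Hypothetical wrapper: plain pointers serve as the iterators.
class cstr_range {
    const char* p;
public:
    explicit cstr_range(const char* s) : p(s) {}
    const char* begin() const { return p; }
    const char* end() const { return p + std::strlen(p); }
};

// Usage: upper-case the ASCII letters of a C string, leave the rest as-is.
inline std::string to_upper_ascii(const char* str) {
    std::string out;
    for (char c : cstr_range(str)) {
        out += (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
    }
    return out;
}
```

Because the loop evaluates `end()` only once, the string is scanned twice in total: once by `std::strlen` and once by the loop itself.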

score 2 · Answer 3 · answered Jan 23 '13 at 10:47

A C-string is not an array, it is not a class that has begin/end members, and you won't find anything by ADL because the argument is a primitive. Arguably, this should be plain unqualified lookup, with ADL, which would find a function in the global namespace. But, given the wording, I'm thinking that it is not possible.
