2
template<typename ForwardIterator, typename StringType>
inline bool starts_with(ForwardIterator begin, ForwardIterator end, const StringType& target)
{
    assert(begin < end);
    if (std::distance(std::begin(target), std::end(target)) > std::distance(begin, end))
    {
        return false;
    }
    return std::equal(std::begin(target), std::end(target), begin);
}

This fails because, when StringType is a string literal, std::end returns a pointer one past the '\0' rather than a pointer to the '\0' itself. (In this respect, it's similar to the range-based for loop inconsistency.) How does one work around this?
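To make the off-by-one concrete, here is a minimal sketch. `range_length` is a hypothetical helper introduced only for this illustration; it is not part of the code above. It counts the elements that std::begin/std::end see, which for a literal includes the terminator.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: number of elements between std::begin and std::end.
template <typename Range>
std::ptrdiff_t range_length(const Range& r)
{
    return std::distance(std::begin(r), std::end(r));
}

// For the literal "Hello" the range is the whole char[6] array,
// so the terminating '\0' is counted as an element.
// For std::string("Hello") only the five characters are counted.
```

So `range_length("Hello")` yields 6 while `range_length(std::string("Hello"))` yields 5, which is why std::equal ends up comparing the '\0' as well.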

Billy ONeal
  • How about a typetrait for your `StringType`? – Kerrek SB Aug 03 '11 at 23:00
  • Make `target` an iterator-denoted range as well, let the client deal with it. – GManNickG Aug 03 '11 at 23:08
  • @GMan: That's even worse than forcing the user to cast everything to `std::string`s – Billy ONeal Aug 03 '11 at 23:09
  • ...how so? You have two contradictory desires: Let the client make the choice of how their data is passed into the function, or force the data passed into the function conform to a requirement. Pick one. – GManNickG Aug 03 '11 at 23:11
  • @GMan: Of the two, I'm picking the second option. The `StringType` argument needs to be a valid string, whether that's a `std::basic_string`, a `std::vector`, a `MyBlahStringType`, or a plain old string literal. If you force the client to pass a range for the last argument then they can't use string literals anymore. – Billy ONeal Aug 03 '11 at 23:12
  • @Billy: A plain old string literal isn't interchangeable, storage-wise/access-wise, with the other types; so you've failed to make a consistent requirement. Again: Require a strict interface to your single type, or remove the strict requirement and allow the client to specify the range. EDIT: Sure they can, any solution to your question can trivially be done by the client. – GManNickG Aug 03 '11 at 23:14
  • @GMan: I don't know what you mean by your edit. `starts_with(begin, end, "Hello")` wouldn't ever work if the client has to specify a pair of iterators for the searched for string. – Billy ONeal Aug 03 '11 at 23:18
  • @Billy: `starts_with(begin, end, std::begin("Hello"), std::end("Hello") - 1)`. I much rather like the `literal` type solution in the question you linked to. – GManNickG Aug 03 '11 at 23:27
  • @GMan: That example has undefined behavior. You can't be sure the two literals share the same memory space. Even if you could, forcing clients to specify every literal twice is more of a pain than just requiring a cast to `std::string`. – Billy ONeal Aug 03 '11 at 23:32
  • Correctness nit: `assert(begin < end)` means that `ForwardIterator` should really be `RandomAccessIterator`. Efficiency nit: if you actually use this function with a forward or bidirectional range, you end up walking the range twice (once in `distance`, once in `equal`). This can be avoided if you write your own loop or use `std::lexicographical_compare`. – James McNellis Aug 03 '11 at 23:46
  • Also: if you want this to work with string literals, presumably you also want it to work with arbitrary C strings (`char const*`), no? `std::begin` and `std::end` don't support C strings, and rightly so (they should have constant time complexity). You have to write your own function to handle this. – James McNellis Aug 04 '11 at 00:45
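
The efficiency nit in the comments above (walking a forward range twice) can be avoided with a hand-written loop. What follows is only a sketch of that idea; the name `starts_with_range` and the two-iterator-pair signature are assumptions made for the illustration, not code from the thread.

```cpp
#include <string>

// Single-pass prefix check: each range is traversed at most once,
// so it also works for forward or input iterators.
template <typename InputIt1, typename InputIt2>
bool starts_with_range(InputIt1 first, InputIt1 last,
                       InputIt2 prefix_first, InputIt2 prefix_last)
{
    for (; prefix_first != prefix_last; ++first, ++prefix_first)
    {
        // Searched range ran out first, or the characters differ.
        if (first == last || !(*first == *prefix_first))
        {
            return false;
        }
    }
    return true; // the whole prefix was matched
}
```

Because nothing is compared with `<` and std::distance is never called, this version places no random-access requirement on the iterators.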

3 Answers

2

Pass a proper std::string instead.

String literals don't have a dedicated "type" of their own (they are plain arrays of const char, terminator included); your input data could be considered to be mangled, essentially.

You could specialise/overload for char const*, which almost universally will be null-terminated.
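
A rough sketch of the overload idea, assuming the target really is null-terminated. The generic template is a slightly simplified copy of the one in the question; the second overload is my own illustration and relies on std::strlen to find the end.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <string>

// Generic version: target is anything std::begin/std::end accept.
template <typename RandomIt, typename StringType>
bool starts_with(RandomIt begin, RandomIt end, const StringType& target)
{
    if (std::distance(std::begin(target), std::end(target)) > std::distance(begin, end))
    {
        return false;
    }
    return std::equal(std::begin(target), std::end(target), begin);
}

// Overload for C strings (assumed null-terminated). Overload resolution
// should prefer this more specialized template for string literals too,
// since a literal decays to const char*.
template <typename RandomIt>
bool starts_with(RandomIt begin, RandomIt end, const char* target)
{
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(std::strlen(target));
    if (len > std::distance(begin, end))
    {
        return false;
    }
    return std::equal(target, target + len, begin); // '\0' is not compared
}
```

With this in place, `starts_with(s.begin(), s.end(), "Hello")` and `starts_with(s.begin(), s.end(), std::string("Hello"))` both compare exactly five characters. Note that the sketch only covers `const char*`; other character types would need their own overloads.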

Lightness Races in Orbit
2

Your API is a template API. Use template specialization to create specialized versions for char* and actual iterator types.

Also, there is a reason why the C++ standard algorithms deal only in iterators and not containers (like StringType).

Nicol Bolas
  • That's what I've done for now. But it certainly seems like a hack to have to specialize any string manipulation algorithm for both types. Boost's `string_algo` library does just fine using `boost::range`, which handles this correctly; but I can't use boost. – Billy ONeal Aug 03 '11 at 23:07
  • @Billy: Then look at how Boost handles it. – Nicol Bolas Aug 03 '11 at 23:11
  • That has legal implications I don't want to get into. – Billy ONeal Aug 03 '11 at 23:13
  • @Billy: Legal implications? The Boost License allows you to use it pretty much without restriction; all you have to do is carry along the license file if it's a source distro. Also, I didn't say to _copy_ what Boost does; I said to look at it. – Nicol Bolas Aug 03 '11 at 23:17
  • Yes, legal implications. I've been expressly forbidden use of boost by the legal team. I don't make the rules, I just have to follow them. I'm not sure I'd want to depend on something like boost for something as trivial as "this string starts with that one" anyway. – Billy ONeal Aug 03 '11 at 23:19
  • @Billy: Understanding how Boost works isn't even comparable to using Boost. – GManNickG Aug 03 '11 at 23:26
  • @Billy: Also, going by your (and others) "it's not a string" argument from the other answer, specialization isn't a hack. You're explicitly saying "everything that matches this type will be treated as a string", there's no hack there. This is the right answer, in that case. – GManNickG Aug 03 '11 at 23:28
  • @Billy: legal implications? So it's OK for you to ask me for suggestions, and take my suggestions away to write your own code, but it's not OK for you to look at the Boost source for suggestions, and take those suggestions away to write your own code? Has your legal department considered the risk that I could be the author of Boost.Range, and just dump some code from there into an answer here? I realise that the legal department's rules are binding even when misguided and/or ineffective, but it does seem very odd that you can't look at Boost source, but you can let us quote it at you. – Steve Jessop Aug 04 '11 at 00:40
  • @Steve: Come to think of it I have not explicitly asked them about here. Honestly, I'm not looking to implement any of this in what I'm working on now for that exact reason, though I am curious if I see similar situations in the future how I would tackle something like this. (For the record, I didn't use any of the code from here, I just was curious if there was a better way of handling it than the template specialization which was the solution I settled on) – Billy ONeal Aug 04 '11 at 01:41
2

How about making a little trait class for your string template parameter:

template <typename TString>
struct StringBounds
{
  typedef typename TString::const_iterator citerator;
  // Static member functions cannot be const-qualified.
  static citerator Begin(const TString & s) { return std::begin(s); }
  static citerator End  (const TString & s) { return std::end(s); }
};

// Specialization for character arrays such as string literals.
template <typename TChar, size_t N>
struct StringBounds<TChar[N]>
{
  typedef const TChar * citerator;
  static citerator Begin(const TChar(&s)[N]) { return s; }
  static citerator End (const TChar(&s)[N]) { return s + N - 1; } // stop at the terminating '\0'
};

Usage:

std::equal(StringBounds<StringType>::Begin(target), StringBounds<StringType>::End(target), begin)
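
For completeness, here is one way the trait might be wired into the question's function. This is only a sketch: the trait is repeated so the block compiles on its own, and the `RandomIt` parameter name and the `Bounds` typedef are my own choices.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

template <typename TString>
struct StringBounds
{
    typedef typename TString::const_iterator citerator;
    static citerator Begin(const TString& s) { return s.begin(); }
    static citerator End(const TString& s) { return s.end(); }
};

// Character arrays: assume the last element is the '\0' and skip it.
template <typename TChar, std::size_t N>
struct StringBounds<TChar[N]>
{
    typedef const TChar* citerator;
    static citerator Begin(const TChar (&s)[N]) { return s; }
    static citerator End(const TChar (&s)[N]) { return s + N - 1; }
};

template <typename RandomIt, typename StringType>
bool starts_with(RandomIt begin, RandomIt end, const StringType& target)
{
    typedef StringBounds<StringType> Bounds;
    const auto first = Bounds::Begin(target);
    const auto last = Bounds::End(target);
    if (std::distance(first, last) > std::distance(begin, end))
    {
        return false; // target is longer than the searched range
    }
    return std::equal(first, last, begin);
}
```

A literal such as "Hello" deduces `StringType` as `char[6]` and picks the array specialization, so only five characters are compared. The same specialization also drops the last element of an array that has no terminator, so that case needs to be documented or excluded.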
Kerrek SB
  • What happens if they pass a `const char*` instead of directly passing a literal? – Nicol Bolas Aug 03 '11 at 23:11
  • Except you just messed up all the non-literal character arrays. – GManNickG Aug 03 '11 at 23:11
  • @Nicol: Pointers which aren't actually compile-time arrays don't work, for obvious reasons. GMan: Could you give an example? – Kerrek SB Aug 03 '11 at 23:14
  • @Kerrek: `const char prefix[] = {'A', 'Z', 'C'};`. You strip away my `C`. – GManNickG Aug 03 '11 at 23:16
  • @Kerrek: GMan means that if someone passes `{ 'H', 'e', 'l', 'l', 'o' }` the `o` would get skipped. – Billy ONeal Aug 03 '11 at 23:16
  • @GMan: That is not a string. Not in C or C++; that's just an array of characters. And passing someone a `char*` that is not null-terminated is... rude. I don't see a problem with clipping the end off of someone making such perverse use of a string API. – Nicol Bolas Aug 03 '11 at 23:19
  • @GMan: Agreed with Nicol, that's not a *string*, that's just an array. I don't see how you could even come by something like that in the context of string processing. Indeed, that isn't covered by my trait class, but I'm not sure that's an argument against it... – Kerrek SB Aug 03 '11 at 23:24
  • @Nicol: My point was that the API could trivially support any sort of range, but this solution necessarily disallows my example. I don't see the justification in the restriction. – GManNickG Aug 03 '11 at 23:25
  • The trouble with treating char arrays as ranges in this context of course is that there's absolutely no way to tell the difference between a nul-terminated string literal, and a char array that's supposed to be treated as a range containing char data, that just so happens to contain embedded nuls, and one of those nuls just so happens to be at the end. You have to restrict somewhere. I suppose GMan's array could be tolerated by checking whether the last char is a nul before excluding it. But I think it's confusing if `{0, 'A'}` yields a 2-char string but `{'A', 0}` yields a 1-char string. – Steve Jessop Aug 04 '11 at 00:31
  • @Steve: My suggestion to countermand this: You simply declare that the interface must only be used with either a class type or with a string literal, but never with an explicit array, and everything else is unspecified behaviour. :-) – Kerrek SB Aug 04 '11 at 00:44
  • An array of characters is a string, even if it's not a null-terminated one. – Lightness Races in Orbit Aug 04 '11 at 06:34
  • @jalf: What else would "string" mean? Null-terminated (C-style) `char` arrays are one type of string; `std::string`s are another; I see no reason not to consider a fixed/known-length `char` array a string also. – Lightness Races in Orbit Aug 04 '11 at 09:44
  • Well, it's simple: If both null-terminated and unterminated char arrays are to count as "strings", then you cannot have the simple, automagic interface you want. On the other hand, if among the raw C constructs only string literals are to count, then you can use the -1 construction. It's a decision that has to be made, and documented. – Kerrek SB Aug 04 '11 at 09:48
  • @Tomalak: A string is generally a sequence of characters that, together, constitute a text snippet. By convention, a null-terminated character array is used in this way, and `std::string` is also intended to use this meaning. If given a non-null-terminated array of chars, there's little reason to expect it to constitute a string of text. I'd typically expect it to be a simple collection of characters instead. In any case, I fail to see the reason why a string API should be expected to natively handle a contrived and nonidiomatic string representation like that. – jalf Aug 04 '11 at 12:24
  • @jalf: Don't get me wrong; I'm not convinced that an API should be expected to do that either. I just probably wouldn't go so far as to discount non-terminated `char` arrays from being someone's representation of a text snippet. – Lightness Races in Orbit Aug 04 '11 at 15:07