3
std::string Concatenate(const std::string& s1,
                        const std::string& s2,
                        const std::string& s3,
                        const std::string& s4,
                        const std::string& s5)
{
    return s1 + s2 + s3 + s4 + s5;
}

By default, return s1 + s2 + s3 + s4 + s5; may be equivalent to the following code:

auto t1 = s1 + s2; // Allocation 1
auto t2 = t1 + s3; // Allocation 2
auto t3 = t2 + s4; // Allocation 3

return t3 + s5; // Allocation 4

Is there an elegant way to reduce the number of allocations to 1? I mean keeping return s1 + s2 + s3 + s4 + s5; unchanged while the efficiency improves automatically. If that is possible, it would also keep programmers from misusing std::string::operator+.

Do ref-qualified member functions help?

xmllmx
  • 5
    *"By default, return s1 + s2 + s3 + s4 + s5; is equivalent to the following code:"* Are you sure? There are overloads of `operator+` which take an rvalue reference, therefore the first temporary result from `s1 + s2` should be reused. There's probably more than one allocation, but it's probably not equivalent to the code you show. – dyp Sep 10 '14 at 00:20
  • @dyp, I just provide one possible solution. It obviously depends on the library implementor. – xmllmx Sep 10 '14 at 00:24
  • Yes, it's a QoI, but I don't see which library implementer would implement the mandatory `basic_string operator+(basic_string&& lhs, const basic_string& rhs)` by copying the first argument. – dyp Sep 10 '14 at 00:25
  • 1
    You could of course throw some expression templates at the problem, to get something like `expr_templ(s1) + s2 + s3 + ..` resulting in 1 allocation. – dyp Sep 10 '14 at 00:38

6 Answers

12

The premise of the question that:

s1 + s2 + s3 + s4 + s5 + ... + sn

will require n allocations is incorrect.

Instead it will require O(log(n)) allocations. The first s1 + s2 will generate a temporary. That temporary (an rvalue) will then be the left argument to all subsequent + operations. The standard specifies that when the lhs of a string + is an rvalue, the implementation simply appends to that temporary and moves it out:

operator+(basic_string<charT,traits,Allocator>&& lhs,
          const basic_string<charT,traits,Allocator>& rhs);

Returns: std::move(lhs.append(rhs))

The standard also specifies that the capacity of the string will grow geometrically (a factor between 1.5 and 2 is common). So on every allocation, capacity will grow geometrically, and that capacity is propagated down the chain of + operations. More specifically, the original code:

s = s1 + s2 + s3 + s4 + s5 + ... + sn;

is actually equivalent to:

s = s1 + s2;
s += s3;
s += s4;
s += s5;
// ...
s += sn;

When geometric capacity growth is combined with the short string optimization, the value of "pre-reserving" the correct capacity is limited. I would only bother doing that if such code actually shows up as a hot spot in your performance testing.
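
A rough way to observe this growth policy on a particular implementation is to append many pieces with += and count how often capacity() changes. The sketch below is only an illustration: AppendResult and AppendAndCountGrowth are hypothetical names, and the number it reports depends on the standard library in use.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type (not from the standard library): the concatenated
// text plus the number of times the string's capacity changed while building it.
struct AppendResult {
    std::string text;
    std::size_t capacity_changes;
};

// Appends every piece with += and counts capacity changes.
// A capacity change during an append implies a reallocation.
inline AppendResult AppendAndCountGrowth(const std::vector<std::string>& pieces)
{
    AppendResult result{std::string(), 0};
    std::size_t last_capacity = result.text.capacity();
    for (const std::string& piece : pieces) {
        result.text += piece;
        if (result.text.capacity() != last_capacity) {
            ++result.capacity_changes;
            last_capacity = result.text.capacity();
        }
    }
    return result;
}
```

With geometric growth, the count should stay far below the number of pieces; an implementation that allocated exactly enough space on each append would instead report roughly one change per piece.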

Howard Hinnant
  • I'm sorry, where is the complexity guarantee for `std::basic_string::append`? I don't see one at first glance. – Yakk - Adam Nevraumont Sep 10 '14 at 01:20
  • @Yakk: It is ridiculously sloppily (and not quite actually) specified. In chapter 23, it says in the requirements table that `string::push_back` is amortized constant complexity, which implies the geometric capacity growth. It nowhere specifies the `append` complexity. One has to infer that a sane implementation would use similar logic for `append` and `push_back`. If `string` were specified today, I have no doubt that the quality of the spec would be much higher. This is probably worthy of a new issue: http://cplusplus.github.io/LWG/lwg-active.html#submit_issue – Howard Hinnant Sep 10 '14 at 04:22
  • An implementation is then free to allocate exactly enough space, for example. So, what do the clang, gcc and msvc compiler strings do on append, I wonder? @xmllmx might be interested. – Yakk - Adam Nevraumont Sep 10 '14 at 04:27
  • libc++ roughly doubles capacity. gcc is here: http://codepad.org/2fWvmZrL Don't have access to VS++ to run the test. – Howard Hinnant Sep 10 '14 at 04:31
7
std::string combined;
combined.reserve(s1.size() + s2.size() + s3.size() + s4.size() + s5.size());
combined += s1;
combined += s2;
combined += s3;
combined += s4;
combined += s5;
return combined;
Benjamin Lindley
  • 2
    @xmllmx: I have no idea what you mean by trivial. – Benjamin Lindley Sep 10 '14 at 00:07
  • 1
    @xmllmx, This works fine, but I would suggest using a variadic template or generic lambda to take any number of strings. – chris Sep 10 '14 at 00:08
  • @BenjaminLindley, I mean keeping `return s1 + s2 + s3 + s4 + s5;` not changed, but the efficiency is improved automatically. If it is possible, it can also avoid the programmer misusing `std::string::operator +`. – xmllmx Sep 10 '14 at 00:11
  • @xmllmx: Are you willing to change the function signature? i.e. To something that takes arguments other than `std::string` (but will still work with `std::string`s passed to it) – Benjamin Lindley Sep 10 '14 at 00:13
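
The variadic version suggested in the comments above could look roughly like the following. This is a sketch, not code from the answer: ConcatenateAll is a hypothetical name, and it assumes every argument has a size() member and can be appended with +=, as std::string can.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reserve once, then append every argument in order.
// Assumes each argument provides size() and works with std::string::operator+=.
template <class... Strings>
std::string ConcatenateAll(const Strings&... parts)
{
    // Sum the sizes of all arguments; the leading 0 keeps the array
    // non-empty when no arguments are passed.
    std::size_t total = 0;
    int sum_sizes[] = {0, (total += parts.size(), 0)...};
    (void)sum_sizes;

    std::string combined;
    combined.reserve(total);

    // Braced initializer lists are evaluated left to right, so the
    // arguments are appended in the order they were passed.
    int append_all[] = {0, (combined += parts, 0)...};
    (void)append_all;
    return combined;
}
```

A call such as ConcatenateAll(s1, s2, s3, s4, s5) would then replace the five-parameter function body; it changes the call syntax, so it does not meet the asker's wish to keep s1 + s2 + s3 + s4 + s5 as written.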
4

There is no engineering like over engineering.

In this case, I create a type string_builder::op<?> that reasonably efficiently collects a pile of strings to concatenate, and when cast into a std::string proceeds to do so.

It stores copies of any temporary std::strings provided, and references to longer-lived ones, as a bit of paranoia.

It ends up reducing to:

std::string retval;
retval.reserve(the right amount);
retval+=perfect forwarded first string
...
retval+=perfect forwarded last string
return retval;

but it wraps it all in lots of syntactic sugar.

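#include <cstddef>
#include <cstring>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

namespace string_builder {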
  template<class String, class=std::enable_if_t< std::is_same< String, std::string >::value >>
  std::size_t get_size( String const& s ) { return s.size(); }
  template<std::size_t N>
  constexpr std::size_t get_size( const char(&)[N] ) { return N; }
  template<std::size_t N>
  constexpr std::size_t get_size( char(&)[N] ) { return N; }
  std::size_t get_size( const char* s ) { return std::strlen(s); }
  template<class Indexes, class...Ss>
  struct op;
  struct tuple_tag {};
  template<size_t... Is, class... Ss>
  struct op<std::integer_sequence<size_t, Is...>, Ss...> {
    op() = default;
    op(op const&) = delete;
    op(op&&) = default;
    std::tuple<Ss...> data;
    template<class... Tuples>
    op( tuple_tag, Tuples&&... ts ): data( std::tuple_cat( std::forward<Tuples>(ts)... ) ) {}
    std::size_t size() const {
      std::size_t retval = 0;
      int unused[] = {((retval+=get_size(std::get<Is>(data))), 0)..., 0};
      (void)unused;
      return retval;
    }
    operator std::string() && {
      std::string retval;
      retval.reserve( size()+1 );
      int unused[] = {((retval+=std::forward<Ss>(std::get<Is>(data))), 0)..., 0};
      (void)unused;
      return retval;
    }
    template<class S0>
    op<std::integer_sequence<size_t, Is..., sizeof...(Is)>, Ss..., S0>
    operator+(S0&&s0)&& {
      return { tuple_tag{}, std::move(data), std::forward_as_tuple( std::forward<S0>(s0) ) };
    }
    auto operator()()&& {return std::move(*this);}
    template<class T0, class...Ts>
    auto operator()(T0&&t0, Ts&&... ts)&&{
      return (std::move(*this)+std::forward<T0>(t0))(std::forward<Ts>(ts)...);
    }
  };
}
string_builder::op< std::integer_sequence<std::size_t> >
string_build() { return {}; }

template<class... Strings>
auto
string_build(Strings&&...strings) {
  return string_build()(std::forward<Strings>(strings)...);
}

and now we get:

std::string Concatenate(const std::string& s1,
                        const std::string& s2,
                        const std::string& s3,
                        const std::string& s4,
                        const std::string& s5)
{
  return string_build() + s1 + s2 + s3 + s4 + s5;
}

or more generically and efficiently:

template<class... Strings>
std::string Concatenate(Strings&&...strings)
{
  return string_build(std::forward<Strings>(strings)...);
}

there are extraneous moves, but no extraneous allocations. And it works with raw "strings" with no extra allocations.

Yakk - Adam Nevraumont
1

You can use code like:

std::string(s1) + s2 + s3 + s4 + s5 + s6 + ....

This allocates a single unnamed temporary (a copy of the first string) and then appends each of the other strings to it. A smart optimizer could optimize this into the same code as the reserve+append code others have posted, as all these functions are generally inlineable.

This works by using the move-enhanced version of operator+, which is defined as (roughly)

std::string operator+(std::string &&lhs, const std::string &rhs) {
    return std::move(lhs.append(rhs));
}

Combined with RVO, this means that no additional string objects need to be created or destroyed.

Chris Dodd
  • 4
    I don't quite see why this should be used instead of `s1 + s2 + ...`. – dyp Sep 10 '14 at 00:22
  • 1
    If there is any difference in the number of allocations between this and the OP's original code, I would expect this to have *more* allocations (1 more, to be precise), not less. – Benjamin Lindley Sep 10 '14 at 00:25
  • @dyp, I think "A smart optimizer could optimize this into the same code as the reserve+append code others have posted" is useful to me. – xmllmx Sep 10 '14 at 00:27
  • @xmllmx: Only if actually correct, which it may be but seems somewhat speculative. – Jerry Coffin Sep 10 '14 at 00:29
  • RVO only works if the temporary hasn't been bound to a reference. `operator+` binds it to a reference. So, an additional string (the return value) must be created, initialized from the prvalue result of the `+` expression. – dyp Sep 10 '14 at 00:35
  • 2
    @dyp: g++ 4.8 compiles the above code to a single call to the copy ctor (to create a single unnamed temporary), followed by 5 (or more) calls to `append`. It doesn't seem to be able to inline the `append` calls for some reason, however. – Chris Dodd Sep 10 '14 at 00:43
  • There is no observable behaviour in the move ctor, I guess. Btw, I wouldn't trust libstdc++'s `std::string` implementation, it uses COW strings, which are noncompliant. – dyp Sep 10 '14 at 00:46
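
Allocation counts like the ones debated in this thread depend on the library implementation, so measuring is more reliable than guessing. One possible way to do that is a string type with a counting allocator. Everything named below (g_allocations, CountingAllocator, CountedString, CountChained, CountReserved) is hypothetical and exists only for this sketch.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Incremented on every allocation made through CountingAllocator.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts allocations.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept
    {
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Allocations performed while evaluating the chained operator+ expression.
inline std::size_t CountChained(const CountedString& s1, const CountedString& s2,
                                const CountedString& s3, const CountedString& s4,
                                const CountedString& s5)
{
    const std::size_t before = g_allocations;
    CountedString result = s1 + s2 + s3 + s4 + s5;
    (void)result;
    return g_allocations - before;
}

// Allocations performed by the reserve-then-append approach.
inline std::size_t CountReserved(const CountedString& s1, const CountedString& s2,
                                 const CountedString& s3, const CountedString& s4,
                                 const CountedString& s5)
{
    const std::size_t before = g_allocations;
    CountedString result;
    result.reserve(s1.size() + s2.size() + s3.size() + s4.size() + s5.size());
    result += s1;
    result += s2;
    result += s3;
    result += s4;
    result += s5;
    return g_allocations - before;
}
```

Calling both functions with inputs longer than the small-string buffer (for example, five strings of 100 characters each) and comparing the two counts shows what a given compiler and library actually do.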
0

After some thought, I think it might be worth at least considering a slightly different approach.

std::stringstream s;

s << s1 << s2 << s3 << s4 << s5;
return s.str();

Although it doesn't guarantee only a single allocation, we can expect a stringstream to be optimized for accumulating relatively large amounts of data, so chances are pretty good that (unless the input strings are huge) it will keep the number of allocations quite minimal.

At the same time, especially if the individual strings are reasonably small, it certainly avoids the situation we expect with something like a + b + c + d where (at least in C++03) we expect to see a number of temporary objects created and destroyed in the process of evaluating the expression. In fact, we can typically expect this to get pretty much the same kind of result we'd expect from something like expression templates, but with a lot less complexity.

There is something of a downside though: iostreams (in general) have enough baggage, such as associated locales, that especially if the strings are small, there could be more overhead in creating the stream than we save in individual allocations.

With a current compiler/library, I'd expect the overhead of creating a stream to make this slower. With an older implementation, I'd have to test to have any certainty at all (and I don't have an old enough compiler handy to do so).

Jerry Coffin
0

How about this:

std::string Concatenate(const std::string& s1,
                        const std::string& s2,
                        const std::string& s3,
                        const std::string& s4,
                        const std::string& s5)
{
    std::string ret;
    ret.reserve(s1.length() + s2.length() + s3.length() + s4.length() + s5.length());
    // Append the strings directly: going through c_str() would rescan each
    // string for its terminator and stop at any embedded '\0'.
    ret.append(s1);
    ret.append(s2);
    ret.append(s3);
    ret.append(s4);
    ret.append(s5);
    return ret;
}

There are at most two allocations: a really small one to construct the std::string (which implementations with the short string optimization skip), and another that reserves memory for the data.

ST3