10

One of the C++0x improvements that will allow writing more efficient C++ code is the unique_ptr smart pointer (too bad that it will not allow moving through memmove()-like operations: the proposal didn't make it into the draft).

What other performance improvements are in the upcoming standard? Take the following code, for example:

vector<const char *> v(10, "astring");
string concat = accumulate(v.begin(),v.end(), string(""));

The code concatenates all the strings contained in vector v. The problem with this neat piece of code is that accumulate() copies its accumulator around by value instead of using references, and the string reallocates each time operator+ is called. The code therefore performs poorly compared to well-optimized equivalent C code.

Does C++0x provide tools to solve the problem and maybe others?

  • 1
    Here is some other answer about the topic: http://stackoverflow.com/questions/637695/how-efficient-is-stdstring-compared-to-null-terminated-strings/637737#637737 – Johannes Schaub - litb Jun 10 '09 at 23:08

4 Answers

14

Yes, C++0x solves the problem through something called move semantics.

Basically it allows for one object to take on the internal representation of another object if that object is a temporary. Instead of copying every byte in the string via a copy-constructor, for example, you can often just allow the destination string to take on the internal representation of the source string. This is allowed only when the source is an r-value.

This is done through the introduction of a move constructor. It's a constructor where you know that the src object is a temporary and is going away. Therefore it is acceptable for the destination to take on the internal representation of the src object.

The same is true for move assignment operators.

To distinguish a copy constructor from a move constructor, the language has introduced rvalue references. A class defines its move constructor to take an rvalue reference which will only be bound to rvalues (temporaries). So my class would define something along the lines of:

 class CMyString
 {
 private:
     char* rawStr;
 public:

     // move constructor bound to rvalues: steal the source's buffer
     CMyString(CMyString&& srcStr)
     {
         rawStr = srcStr.rawStr;
         srcStr.rawStr = nullptr; // the source's destructor now frees nothing
     }

     // move assignment operator
     CMyString& operator=(CMyString&& srcStr)
     {
         if(rawStr != srcStr.rawStr) // protect against self assignment
         {
             delete[] rawStr;        // release the buffer we currently own
             rawStr = srcStr.rawStr;
             srcStr.rawStr = nullptr;
         }
         return *this;
     }

     ~CMyString()
     {
         delete [] rawStr;           // deleting a null pointer is a no-op
     }
 };

Here is a very good and detailed article on move semantics and the syntax that allows you to do this.
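For a version that can actually be constructed and tested, here is a minimal sketch. The class name MyString, the data() accessor and the by-value assignment operator are assumptions made for illustration and are not taken from the class above; it only aims to show that a move hands over the buffer while a copy allocates a new one.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical illustration class; not part of any library.
class MyString {
    char* raw_;
public:
    // Ordinary constructor: allocates and copies the characters.
    explicit MyString(const char* s) : raw_(new char[std::strlen(s) + 1]) {
        std::strcpy(raw_, s);
    }

    // Copy constructor: allocates a fresh buffer and copies every byte.
    MyString(const MyString& other) : raw_(nullptr) {
        if (other.raw_) {
            raw_ = new char[std::strlen(other.raw_) + 1];
            std::strcpy(raw_, other.raw_);
        }
    }

    // Move constructor: takes over the source's buffer, no allocation.
    MyString(MyString&& other) noexcept : raw_(other.raw_) {
        other.raw_ = nullptr;
    }

    // Assignment by value: the parameter is copy- or move-constructed
    // by the caller, then we swap buffers with it.
    MyString& operator=(MyString other) {
        std::swap(raw_, other.raw_);
        return *this;
    }

    ~MyString() { delete[] raw_; }

    const char* c_str() const { return raw_ ? raw_ : ""; }
    const char* data() const { return raw_; }  // null after being moved from
};
```

With this sketch, MyString b(std::move(a)); leaves b owning the very buffer a used to hold, whereas MyString c(b); gives c a separate allocation with the same contents.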

Motti
Doug T.
  • 3
    This makes permutation operations like sorting and rotating collections of collections much faster. I suspect the most likely example would be sorting a collection of strings. – Brian Jun 10 '09 at 13:59
  • 1
    This is also a very nice explanation of rvalues references: https://www.boostpro.com/trac/wiki/BoostCon09/RValue101 – Johannes Schaub - litb Jun 10 '09 at 16:08
  • Extending (or converting) all containers and algorithms to move semantics looks like a large task. I wonder how things are going and whether compatibility with the current STL will be preserved without trading away efficiency. –  Jun 10 '09 at 22:36
  • @litb: the link asks for a username and password :( – sstock Aug 05 '09 at 14:04
7

One performance boost will be generalized constant expressions, which are introduced by the keyword constexpr.

constexpr int returnSomething() {return 40;}

int avalue[returnSomething() + 2]; 

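Without the constexpr keyword, this is not legal C++ code, because returnSomething() + 2 is not a constant expression and so cannot be used as an array bound.

By marking the function constexpr, the programmer tells the C++0x compiler that its result is a compile-time constant, which makes the array declaration legal.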
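As a small check of the claim, the sketch below reuses the function and array from this answer and adds two static_assert lines (the assertion messages are illustrative). Both are evaluated by the compiler, so they cost nothing at run time.

```cpp
#include <cassert>

// Same function as above: usable in constant expressions.
constexpr int returnSomething() { return 40; }

// The array bound must be a constant expression; constexpr makes it one.
int avalue[returnSomething() + 2];

// Both checks are evaluated at compile time.
static_assert(returnSomething() + 2 == 42, "bound is known at compile time");
static_assert(sizeof(avalue) / sizeof(avalue[0]) == 42, "array has 42 elements");
```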

Silfverstrom
1
vector<string> v(10, "foo");
string concat = accumulate(v.begin(), v.end(), string(""));

This example is simply bad programming, in any C++ standard. It is equivalent to this:

string tmp;
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp
tmp = tmp + "foo"; //copy tmp, append "foo", then copy the result back into tmp

C++11 move semantics will only take care of the "copy the result back into tmp" part of the equation. The initial copies from tmp will still be copies. It is a classic Schlemiel the Painter's algorithm, but even worse than the usual example using strcat in C.

If accumulate just used += instead of + and = then it would've avoided all those copies.

But C++11 does give us a way to do better, while remaining succinct, using range for:

string concat;
for (const string &s : v) { concat += s; }
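The loop can be wrapped into a small function. The sketch below goes one step further than the answer and reserves the final size up front, so the result buffer is allocated once; the helper name concat_all and the extra sizing pass are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenates all strings with a single allocation.
std::string concat_all(const std::vector<std::string>& parts) {
    // First pass: compute the total length.
    std::size_t total = 0;
    for (const std::string& s : parts) { total += s.size(); }

    std::string out;
    out.reserve(total);  // one allocation up front

    // Second pass: append in place, no temporaries.
    for (const std::string& s : parts) { out += s; }
    return out;
}
```

Whether the extra pass pays off depends on the number and size of the strings; for short inputs the plain loop is likely good enough.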

EDIT: I suppose a standard library vendor could choose to implement accumulate with move on the left operand of +, so tmp = tmp + "foo" would become tmp = move(tmp) + "foo", and that would pretty much solve this problem. I'm not sure if such an implementation would be strictly conforming. None of GCC, MSVC, or LLVM does this in C++11 mode. And as accumulate is defined in <numeric>, one might assume it is only designed for use with numeric types.

EDIT 2: As of C++20 accumulate has been redefined to use move as in the suggestion of my previous edit. I still consider it a questionable abuse of an algorithm that was only ever designed for use with arithmetic types.
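To see what that change amounts to, here is a hand-written sketch of an accumulate-style loop that moves the accumulator into operator+. The name accumulate_move is made up for this example and is not the library's implementation; it also works on pre-C++20 compilers.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the loop inside std::accumulate.
template <class InputIt, class T>
T accumulate_move(InputIt first, InputIt last, T init) {
    for (; first != last; ++first) {
        // std::move lets operator+ reuse init's buffer instead of copying it;
        // the result is then move-assigned back into init.
        init = std::move(init) + *first;
    }
    return init;
}
```

For std::string this selects the operator+ overload taking an rvalue on the left, which appends to the existing buffer. For arithmetic types the move is just a copy, so the result matches the classic loop.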

Oktalist
1

Sorry - you cannot state as a fact that string concat = accumulate(v.begin(),v.end(), string("")); must reallocate. A straightforward implementation will, of course. But compilers are very much allowed to do the right thing here.

This is already the case in C++98, and C++0x continues to allow both smart and dumb implementations. That said, move semantics will make smart implementations simpler.

MSalters
  • Here is the accumulate() body from gcc 4.1.1: for (; __first != __last; ++__first) __init = __init + *__first; return __init; You're right that implementations are allowed to optimize, by providing specializations. For string it's partially possible; the specialization should: 1) take the string by reference or pointer, which is hard because the function signature is specified by the standard; 2) use append() or +=, which is possible. But after all, I think that proper new tools in the language will allow solving the problem without such specific specializations. –  Jun 10 '09 at 15:36