2

Presenting the minimal code to describe the problem:

struct A {
  vector<string> v;
  // ... other data and methods
};
A obj;
ifstream file("some_file.txt");
char buffer[BIG_SIZE];
while( <big loop> ) {
  file.getline(buffer, BIG_SIZE-1);
  // process buffer; which may change its size
  obj.v.push_back(buffer);  // <------- can be optimized ??
}
...

Here the string is created twice: the first time to construct the actual string object from buffer, and the second time when it is copy-constructed into the vector. Demo

The push_back() operation happens millions of times, and each time I pay for one extra allocation that is of no use to me.

Is there a way to optimize this? I am open to any suitable change. (I am not treating this as premature optimization, because push_back() is called so many times throughout the code.)

iammilind
  • 1
    Use a vector of string pointers? – GWW Sep 28 '11 at 04:55
  • @GWW, I am open to that; but `vector` would be better than `vector` because; once the buffer stored in `vector` it is not going to change in my design. Also, want to know if any better idea. – iammilind Sep 28 '11 at 04:59
  • 1
    Can you use C++11 move-semantics? If so, do so. – GManNickG Sep 28 '11 at 05:01
  • 1
    @GMan, unfortunately not possible. The code should be C++03 compliant. Otherwise `std::move` with rref should be my first choice. – iammilind Sep 28 '11 at 05:01
  • In your demo, aren't you getting one allocation from the vector constructor and the other one from the string constructor? – twsaef Sep 28 '11 at 05:03
  • @iammilind Use the swap trick displayed in KQ's answer to emulate, then. – GManNickG Sep 28 '11 at 05:03
  • @twsaef, you are correct. I didn't notice that part. In fact, one of the answers shows that too. Edited my question. – iammilind Sep 28 '11 at 05:11

3 Answers

3

You can try a couple of things. The first is obviously to enable compiler optimizations. Declaring the member as a vector<const string> is not an option, though: C++03 requires container elements to be assignable, and a const string is not.

Otherwise you might try something like:

obj.v.resize(obj.v.size()+1);
// call swap on the temporary: a temporary cannot bind to swap's non-const reference parameter
string(buffer).swap(obj.v.back());
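The swap idea can be sketched as a small self-contained helper. This is only an illustration: `append_by_swap` is a hypothetical name, not something from the question. It calls `swap` on the temporary string rather than passing the temporary as an argument, which keeps it valid C++03, because a temporary cannot bind to the non-const reference that `swap` takes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): appends the contents of
// `buffer` to `v` so that the character data is built only once.
void append_by_swap(std::vector<std::string>& v, const char* buffer) {
    // Grow the vector by one empty string; an empty string typically
    // allocates no character storage.
    v.resize(v.size() + 1);
    // Build the string once in a temporary, then hand its storage to the
    // new element. Calling a non-const member on a temporary is allowed.
    std::string(buffer).swap(v.back());
}
```

In the question's loop this would replace `obj.v.push_back(buffer);` with `append_by_swap(obj.v, buffer);`. Whether it actually saves an allocation depends on the standard library's string implementation, so it is worth measuring.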
KQ.
3

Well, you do get two allocations, but not both of them are for the string: one creates the string, while the other only creates a pointer inside the vector. (This depends on the compiler: some compilers or settings might indeed create two strings, but most won't.) Look at this code for the demo.

One way to optimize this would be to use char* instead of string as the template parameter (don't forget to delete each element manually before destroying the vector!). This way you get rid of one of the allocations (the biggest one). Alternatively, use your own vector implementation: you would then control every aspect of memory allocation.
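A minimal sketch of the `char*` variant, assuming each line is copied into a heap array sized to fit. The helper names `append_copy` and `free_all` are hypothetical and only serve the illustration. The vector does not own the memory it points to, so the cleanup step is mandatory.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `buffer` into a right-sized heap array and
// stores the pointer, so exactly one allocation is made per line.
void append_copy(std::vector<char*>& v, const char* buffer) {
    std::size_t len = std::strlen(buffer);
    char* copy = new char[len + 1];
    std::memcpy(copy, buffer, len + 1);  // includes the terminating '\0'
    try {
        v.push_back(copy);
    } catch (...) {
        delete[] copy;  // do not leak if the vector fails to grow
        throw;
    }
}

// Hypothetical helper: releases every stored array, then empties the vector.
// Must be called before the vector is destroyed.
void free_all(std::vector<char*>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        delete[] v[i];
    }
    v.clear();
}
```

The trade-off is that ownership becomes manual: copying the vector copies only the pointers, and forgetting `free_all` leaks every line.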

Rom
0

Instead of having the buffer on the stack, put it on the heap. Then use a vector of pointers. Only one

Ed Heal