5

std::string pop_back() : Remove the last element of the string

The cplusplus.com reference says that the C++11 string member function pop_back has constant time complexity.

(to be more precise - Unspecified but generally constant)

http://www.cplusplus.com/reference/string/string/pop_back/

Apart from that, I read the draft of the C++11 specification, which says that pop_back is equivalent to str.erase(str.length() - 1). As far as I know, the erase function simply allocates a new block of memory and copies the remaining (not deleted) elements into it, which takes up to linear time. In light of this, how can pop_back finish in constant time?

quantdev
ralzaul
  • That's not how std::string::erase works. – Sneftel Aug 15 '14 at 08:25
  • BTW, you have source for this stuff. Read your standard library implementation! It's terse, but it isn't incomprehensible. – Sneftel Aug 15 '14 at 08:26
  • Also be aware that when a standard function is defined in terms of another (as `pop_back` is defined in terms of `erase`), it is not in general required to actually call that other function. It must have the same effect as the required effects of the other function. – Steve Jessop Aug 15 '14 at 08:28
  • @SteveJessop Yes. Although a surprising number of them do: all of the implementations of `std::vector<>::pop_back` that I've looked at simply call `std::vector<>::erase`, for example. (Of course, `std::basic_string` is simpler; since the character type is required to be a POD, it doesn't have to worry about destructors.) – James Kanze Aug 15 '14 at 08:34
  • In the link you gave it says the complexity is "unspecified, but generally constant". So it could do what you suggested, but usually it will simply reduce the logical size, not the capacity. Of course `push_back()` in string and vector would also be `O(N)` if they have to reallocate. – CashCow Aug 15 '14 at 08:50
  • I'm not sure why it is claimed to be "unspecified, but generally constant", [sequence.reqmts]/16 and Table 101 seem like a pretty clear specification to me. – Jonathan Wakely Aug 15 '14 at 11:02

2 Answers

11

It does not have to reallocate.

The function probably just overwrites the last character with a zero and decrements some length information.
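To make that concrete, here is a minimal sketch of the idea. `TinyString` is a hypothetical toy type invented for illustration; it is not how any real standard library writes `std::string`, and it assumes a heap buffer with no small-string optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical toy string, for illustration only.
class TinyString {
public:
    explicit TinyString(const char* text)
        : size_(std::strlen(text)),
          capacity_(size_),
          data_(new char[capacity_ + 1]) {
        // Copy the characters together with the terminating zero.
        std::memcpy(data_.get(), text, size_ + 1);
    }

    // Constant time: one decrement and one store.
    // No allocation and no copying of the remaining characters.
    void pop_back() {
        assert(size_ > 0 && "pop_back on an empty string is undefined");
        --size_;
        data_[size_] = '\0';
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    const char* c_str() const { return data_.get(); }

private:
    // Declaration order matters: data_ is initialized after capacity_.
    std::size_t size_;
    std::size_t capacity_;
    std::unique_ptr<char[]> data_;
};
```

In this sketch the capacity never changes in `pop_back`, so the buffer address stays the same and only the logical length shrinks.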

Timbo
  • It can, but it can also reallocate to a smaller buffer, so any pointers / references / iterators you have to its old data might become invalidated. – CashCow Aug 15 '14 at 08:52
  • That can certainly be done, but is this really the case for pop_back? – ralzaul Aug 15 '14 at 08:53
  • It might do, so don't rely on your iterators / pointers / references being valid after a call to it. – CashCow Aug 15 '14 at 09:22
  • @CashCow The standard doesn't allow exceptions in `erase` and `pop_back` [21.4.1], so that would rule out reallocation. – molbdnilo Aug 15 '14 at 09:41
  • It does not allow an exception to be thrown but if allocator throws it can catch it and manage the implementation differently. – CashCow Aug 15 '14 at 10:32
-1

With regards to "complexity" you always have to look at the big picture, not just an individual case.

Let's look at growing with push_back() first. If you grow from an empty container to one of a large size N, the implementation will do a number of reallocations. Each reallocation costs O(M), where M is the size at that time; every other insertion takes constant time.

In actual fact the complexity will be the sum of a geometric progression. Let's say it reallocates to a capacity double its size, and let's say it starts at 16.

So the complexity total for those that reallocate will be:

16 + 32 + 64 + 128 + ... + N/2 

(assumption that N is a power of 2) which as you know will sum to

N-16. 

Add in all the ones for the single ones where no reallocation takes place, and your grand sum is close to 2N, which is therefore O(N) for all the N insertions, and therefore constant time per insertion in the big picture, even if this particular one might not be.
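The arithmetic above can be checked with a small simulation. This is only a counting sketch under the same assumptions as the text (capacity starts at 16 and doubles when full); `simulate_push_back` and `GrowthCost` are names made up for this example, and no real container is involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping type for this example.
struct GrowthCost {
    std::size_t copies;  // elements moved during reallocations
    std::size_t writes;  // one write per inserted element
};

// Counts the work done by n push_back calls under a doubling policy.
GrowthCost simulate_push_back(std::size_t n, std::size_t initial_capacity = 16) {
    assert(initial_capacity > 0);
    GrowthCost cost{0, 0};
    std::size_t size = 0;
    std::size_t capacity = initial_capacity;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            cost.copies += size;  // reallocation moves every existing element
            capacity *= 2;
        }
        ++size;
        ++cost.writes;
    }
    return cost;
}
```

For N = 1024 the reallocations copy 16 + 32 + ... + 512 = 1008 elements, which is N - 16, and the 1024 ordinary writes bring the total to 2032, just under 2N.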

Now the case in point. Let's assume that we start with N and do a large series of pop_back() calls. Assuming it does exactly the same in reverse, it will have the same complexity.

Q.E.D.

Of course there are more issues here. For example, a pop_back and erase may not throw, i.e. leak an exception, and a reallocation even to a smaller buffer will first need to allocate more memory to move the data into before releasing the old, so an exception could occur doing this. However the implementation could simply "try" the reallocation to a smaller buffer and if an error occurs, catch it then revert back to a simple "logical" change. Of course it does mean your system is reaching full capacity, and that reducing the sizes of your strings isn't helping.

The likelihood is that, whilst this is open to implementation, most implementations will never reallocate in a pop_back. The only time something like this might happen is if the string is implemented with an internal member buffer for small strings and this length is reached, in which case there is no danger in moving all the data into it and freeing the "free-store" memory. That cannot throw.
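One way to see what a particular implementation does is to compare the capacity and buffer address before and after the call. The helper below is a sketch for experimenting; `observe_pop_back` and `PopBackReport` are hypothetical names, and whatever it reports describes only the standard library it was compiled against, not a guarantee from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record of what one pop_back call did.
struct PopBackReport {
    std::size_t capacity_before;
    std::size_t capacity_after;
    bool same_buffer;    // true if data() did not move
    std::string result;  // contents after the call
};

PopBackReport observe_pop_back(std::string s) {
    assert(!s.empty());  // pop_back on an empty string is undefined
    PopBackReport report{};
    report.capacity_before = s.capacity();
    const char* buffer_before = s.data();
    s.pop_back();
    report.capacity_after = s.capacity();
    report.same_buffer = (s.data() == buffer_before);
    report.result = s;
    return report;
}
```

Calling it with a string well past any small-string threshold, for example `observe_pop_back(std::string(100, 'x'))`, is the interesting case. The mainstream implementations are expected to leave the capacity and buffer untouched there, but that is an observation about them rather than a rule to rely on.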

CashCow
  • `pop_back` for Sequence Containers is not allowed to throw exceptions. It's not clear whether that applies to `basic_string` or not, and that requirement can be met by trying to reallocate and giving up if it throws, but it does make it less likely that an implementation will bother to reallocate on `pop_back` rather than just destroy the last element and change the size. – Jonathan Wakely Aug 15 '14 at 09:39
  • they are not allowed to leak exceptions. If trying to reallocate to a smaller buffer threw or if object moving threw, it could revert back and use a logical pop_back. That it cannot leak exceptions does not mean exceptions cannot happen in the interim. – CashCow Aug 15 '14 at 10:26
  • Right, that's what I meant by "trying to reallocate and giving up if it throws" – Jonathan Wakely Aug 15 '14 at 10:59
  • It makes it less likely although in the case of resizing down to 0 or a fixed size that can be stored internally, it might throw the entire buffer away. My main point of this post was describing how, even if it did, it would still be considered constant time. In fact the main post shows that insertion in a vector is constant overall, even if an individual one might not be. I would still be cautious though and not rely on pointers / references / iterators to be valid after a pop_back(). – CashCow Aug 15 '14 at 11:01