Advantage of Binary Search Tree over vector in C++

Question

What is the use of the Binary Search Tree data structure if a vector (kept in sorted order) can support insert, delete and search in log(n) time (using binary search)?

Sorted vectors are more cache-friendly and sometimes very useful. But for fast and frequent insertion (that keeps the structure sorted) you need a balanced binary tree, e.g. the red-black trees used in std::map and std::set. To insert in the sorted vector, you need to move elements. — Erik Alapää, Dec 16 '15 at 07:25
The simplest answer would highlight the fact that a binary search tree is nonlinear, and thus has many uses in which a linear data structure would be less optimal. — badfilms, Dec 16 '15 at 08:12
how can you do insert and delete for a sorted vector in O(log(n)) ? — DU Jiaen, Dec 16 '15 at 08:30
The premise of the question is wrong (see my first comment). — juanchopanza, Dec 16 '15 at 09:19
See also [boost flat associative containers](http://www.boost.org/doc/libs/1_59_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx) — juanchopanza, Dec 16 '15 at 09:22

Martin Bonner supports Monica · Accepted Answer · 2015-12-16T08:34:28.033

7

The basic advantage of a tree is that insert and delete in a vector are not O(log(n)) - they are O(n). (They take log(n) comparisons, but n moves.)
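To make the "log(n) comparisons, but n moves" point concrete, here is a minimal sketch of insert and erase on a sorted `std::vector<int>`. The helper names `insert_sorted` and `erase_sorted` are made up for illustration; only `std::lower_bound`, `vector::insert` and `vector::erase` are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: insert `value` into an already-sorted vector and
// keep it sorted. Returns the index at which the value was placed.
std::size_t insert_sorted(std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition (debug builds)

    // O(log n) comparisons: binary search for the first element >= value.
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    std::size_t index = static_cast<std::size_t>(pos - v.begin());

    // O(n) moves: every element from `pos` onward shifts one slot right.
    v.insert(pos, value);
    return index;
}

// Hypothetical helper: erase one occurrence of `value`.
// Returns true if the value was present.
bool erase_sorted(std::vector<int>& v, int value) {
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    if (pos == v.end() || *pos != value) {
        return false;
    }
    v.erase(pos);  // O(n) moves: the tail shifts one slot left.
    return true;
}
```

Inserting at the front of a large vector is the worst case here: the search is cheap, but every existing element has to move.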

The advantage of a vector is that the constant factor can be hugely in its favour (because vectors tend to be much more cache friendly, and cache misses can cost you a factor of 100 in performance).

Sorted vectors win when

Mostly searching.
Frequent updates but only a few elements in the container.
Objects have efficient move semantics.

Trees win when

Lots of updates with many elements in the container.
Object move is expensive.

... and don't forget hashed containers, which are O(1) on average for search, and unsorted vectors with linear search (which are O(n) for everything, but if small enough are actually fastest).
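As a rough side-by-side, the same membership test can be written for each of the four options mentioned above. This is only a sketch: the `contains_*` function names are invented for illustration, and which one is actually fastest depends on container size and hardware, so measure before choosing.

```cpp
#include <algorithm>
#include <set>
#include <unordered_set>
#include <vector>

// Sorted vector: O(log n) comparisons over contiguous memory.
// Precondition: `v` is sorted in ascending order.
bool contains_sorted(const std::vector<int>& v, int key) {
    return std::binary_search(v.begin(), v.end(), key);
}

// Balanced tree (std::set): O(log n), but each step follows a pointer.
bool contains_tree(const std::set<int>& s, int key) {
    return s.find(key) != s.end();
}

// Hashed container: O(1) on average, O(n) in the worst case.
bool contains_hash(const std::unordered_set<int>& h, int key) {
    return h.find(key) != h.end();
}

// Unsorted vector with linear search: O(n), often fine for tiny containers.
bool contains_linear(const std::vector<int>& v, int key) {
    return std::find(v.begin(), v.end(), key) != v.end();
}
```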

edited Dec 16 '15 at 08:34

answered Dec 16 '15 at 08:28

Martin Bonner supports Monica



I tend to prefer map over unordered map since I come from a hard real-time background - hash tables are efficient 99% of the time, but if a re-hash occurs, real-time performance is destroyed. Red-black trees have much better worst-case behaviour than hash tables. – Erik Alapää Dec 16 '15 at 11:15
@ErikAlapää : Good point. Also if you have a bad hash function, performance can be dire. – Martin Bonner supports Monica Jan 12 '16 at 09:27

score 3 · Answer 2 · answered Dec 16 '15 at 07:24

There won't be much difference in performance between a sorted vector and a BST if there are only search operations after some initial insertions/deletions, because a binary search over a vector costs about the same as searching for a key in a BST. In fact, I would go for a sorted vector in this case, as it's more cache friendly.
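For the "load first, then only search" case, one possible approach is to append everything, sort once, and then answer queries with binary search. The sketch below assumes `int` keys and that duplicates should be dropped (as `std::set` would); `build_sorted` and `count_in_range` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a sorted, duplicate-free vector in one go.
// One O(n log n) sort replaces n separate O(n) sorted inserts.
std::vector<int> build_sorted(std::vector<int> items) {
    std::sort(items.begin(), items.end());
    items.erase(std::unique(items.begin(), items.end()), items.end());
    return items;
}

// Hypothetical helper: number of stored keys in the half-open range [lo, hi).
// Two binary searches, then a subtraction on contiguous memory.
std::size_t count_in_range(const std::vector<int>& sorted, int lo, int hi) {
    auto first = std::lower_bound(sorted.begin(), sorted.end(), lo);
    auto last = std::lower_bound(first, sorted.end(), hi);
    return static_cast<std::size_t>(last - first);
}
```

The range count is a place where the vector is especially convenient: the answer is an iterator difference, whereas `std::set` iterators would have to be walked one element at a time.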

However, if there are frequent insertions/deletions along with searching, then a sorted vector won't be a good option, because elements need to be shifted after every insertion and deletion to keep the vector sorted.

Actually, if predominantly searching, a sorted vector will tend to win because they are much more cache friendly. — Martin Bonner supports Monica, Dec 16 '15 at 08:29

score 0 · Answer 3 · answered Dec 16 '15 at 08:41


Theoretically, it's impossible to do insert or delete in a sorted vector in O(log(n)). But if you really want to know the advantages of a BST over a vector, here's something I can think of:

A BST and other tree structures make many small memory allocations, one per node, and each node is a small fixed-size chunk of memory. A vector instead uses one big contiguous memory block to hold all the items, and its memory usage doubles (or even triples) while resizing. So on a system with very limited memory, or one where fragmentation happens frequently, it's possible that a BST will successfully allocate enough memory chunks for all the nodes while a vector fails to allocate its block.
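The resizing behaviour can be observed directly by recording `capacity()` while appending. This is a small sketch with a hypothetical function name, `capacity_steps`; the growth factor is implementation-defined, so the exact numbers differ between standard libraries and no particular sequence should be expected.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: append n ints one at a time and return each new
// capacity the vector reports after it reallocates.
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::vector<int> v;
    std::vector<std::size_t> steps;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last) {
            // A reallocation happened: during it, the old buffer and the
            // new, larger buffer exist at the same time.
            last = v.capacity();
            steps.push_back(last);
        }
    }
    return steps;
}
```

If the final size is known in advance, calling `reserve` once up front avoids the intermediate reallocations, though it still needs one contiguous block of that size.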

answered Dec 16 '15 at 08:41

DU Jiaen


I've never heard of a vector which triples. x2 or x3/2 are much more common factors (x3/2 trades less wasted memory for more reallocations) – Martin Bonner supports Monica Dec 16 '15 at 09:15
x2 is actually triple the memory usage, if you count the original buffer. – DU Jiaen Dec 16 '15 at 11:50
Ah yes. I missed the "*while* resizing" – Martin Bonner supports Monica Dec 16 '15 at 12:30
