In A* search, which data structure is more efficient for the open list: a min-heap or a binary search tree?

Assume the following operations have to be handled frequently: (a) extract the minimum, (b) search for a node, (c) update a node, (d) insert a node.

Note: The search operation would be very frequent, since each candidate child node has to be checked for presence in the open list of A*.

aks
  • It might take a little longer, but you can always implement both and time them to see which one suits you better – byxor Jan 14 '22 at 10:54
  • What exactly do you need *search* for in the A* algorithm? If you refer to updating the score of a node, then you already have the node (from the tree that is being searched), and can make sure you have (and maintain) the connection with the related heap node. I don't see why you would really need to search a certain key in the heap. – trincot Jan 14 '22 at 13:39

1 Answer

A min-heap is a much simpler data structure than a balanced binary search tree, and it's typically implemented in an array, which reduces memory used and improves cache locality.

For these reasons, a min-heap implementation will be much faster if you do it right.

It's often tricky to implement the decrease-key operation in an array-based min-heap, though. The usual solution is not to implement decrease-key at all, but just to insert another record in the min-heap whenever the distance to a node is decreased.
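
As a rough illustration of that approach, here is a minimal sketch using Python's `heapq`. The function name `a_star`, the adjacency-list layout `adj`, and the `heuristic` callback are assumptions made for this example, not part of the question. Vertices are assumed to be numbered 0..n-1, so the best known distances live in a plain list, and a popped entry is skipped when its recorded distance no longer matches that list.

```python
import heapq

def a_star(adj, heuristic, start, goal):
    """Return the cost of a cheapest path from start to goal, or None.

    adj[u] is a list of (v, cost) pairs (hypothetical layout for this sketch).
    heuristic(v) is assumed to be admissible.
    """
    INF = float("inf")
    g = [INF] * len(adj)          # best known distance to each vertex
    g[start] = 0
    # Heap entries are (f, g at push time, vertex).
    heap = [(heuristic(start), 0, start)]

    while heap:
        _, g_pushed, u = heapq.heappop(heap)
        if g_pushed != g[u]:
            continue              # stale entry: a shorter distance was pushed later
        if u == goal:
            return g[u]
        for v, cost in adj[u]:
            new_g = g[u] + cost
            if new_g < g[v]:
                # No decrease-key: record the new distance and push a fresh entry.
                g[v] = new_g
                heapq.heappush(heap, (new_g + heuristic(v), new_g, v))
    return None
```

No search through the heap is needed here: membership and the current distance are both read from `g` in constant time.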

This will not increase the time complexity of the algorithm, but the min-heap will take O(|E|) space. If your graph is very dense, then a node's distance may be decreased many times, and this memory consumption might be too much. If so, clean up the min-heap (remove invalid entries and re-heapify) whenever more than half of the entries in the heap are invalid. This will keep memory consumption down to O(|V|) without significantly affecting run time.
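
The cleanup step might look something like the sketch below. The helper names `compact` and `push` are hypothetical, and entries are assumed to be `(f, g at push time, vertex)` tuples with current distances in a list `g`. Instead of counting invalid entries exactly, this sketch uses a simpler trigger: at most one entry per vertex can be valid, so once the heap holds more than `2 * len(g)` entries, more than half of them must be invalid.

```python
import heapq

def compact(heap, g):
    """Drop stale entries in a single scan, then rebuild the heap in linear time."""
    heap[:] = [(f, g_pushed, v) for (f, g_pushed, v) in heap if g_pushed == g[v]]
    heapq.heapify(heap)

def push(heap, g, f, new_g, v):
    """Record a decreased distance for v and push a fresh entry (no decrease-key)."""
    g[v] = new_g
    heapq.heappush(heap, (f, new_g, v))
    # Assumed trigger: more than 2*|V| entries implies over half are stale.
    if len(heap) > 2 * len(g):
        compact(heap, g)
```

Because each cleanup at least halves the heap, its linear cost is paid for by the pushes that came before it.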

Matt Timmermans
  • @aks, a), no, you just keep the current distance to each vertex in a separate array. You put a node in the heap when the distance decreases, and then you can recognize the invalid heap nodes, because they have a distance that doesn't match. – Matt Timmermans Jan 14 '22 at 19:23
  • @aks, b) no, it means removing invalid heap nodes, which you can recognize in constant time as mentioned in my previous comment. The cleaning process should only require a single scan through the heap array, and then a re-heapify, both of which take O(|V|) time. Note that there will be no valid closed nodes in the heap, so you don't need to look at the closed list. – Matt Timmermans Jan 14 '22 at 19:25
  • Normally vertexes are densely numbered, so an array (that's a list in python) is better than a dictionary for storing vertex information. – Matt Timmermans Jan 14 '22 at 21:28
  • You keep saying you have to do linear searches when linear searches are not required :) – Matt Timmermans Jan 14 '22 at 23:20
  • A 2D numpy array is a good choice – Matt Timmermans Jan 15 '22 at 13:46