4

After implementing most of the common and needed functions for my Graph implementation, I realized that a couple of functions (remove vertex, search vertex and get vertex) don't have the "best" implementation.

I'm using adjacency lists with linked lists for my Graph implementation, and I was searching one vertex after another until I find the one I want. Like I said, I realized this is not the "best" implementation. I could have 10000 vertices and need to search for the last one, yet that vertex could have a link to the first one, which would speed things up considerably. But that's just a hypothetical case; it may or may not happen.

So, what algorithm do you recommend for vertex lookup? Our teachers talked mostly about breadth-first and depth-first (and Dijkstra's algorithm, but that's a completely different subject). Between those two, which one do you recommend?

It would be perfect if I could implement both, but I don't have time for that; I need to pick one and implement it, as the first phase deadline is approaching...

My guess is to go with depth-first: it seems easier to implement and, looking at the way they work, it seems the better bet. But that really depends on the input.

But what do you guys suggest?

rfgamaral
  • 16,546
  • 57
  • 163
  • 275
  • 1
    How do you define "last vertex" of a graph? In any case, the quickest way to find a given vertex in a graph *completely* depends on what you know about the construction of your graph and the start and endpoints. – Cascabel Mar 18 '10 at 16:54
  • I would go with Breadth-first, because I'm guessing you're talking about recursive DFS. Iterative DFS is a bit harder to implement, and recursion is a bit slower, so I'd choose BFS. – IVlad Mar 18 '10 at 17:00
  • @Jefromi: By "last vertex" of the graph I meant the last vertex on the vertices linked list. @IVlad: Actually, as this project depends on a big data input, I'm trying to code every operation iteratively. So far I have nothing recursively and my idea is to implement iterative DFS, unless I find it hard to finish it... – rfgamaral Mar 18 '10 at 17:13
  • IVlad, if you do it right, switching from iterative BFS to iterative DFS should be as easy as switching a queue to a stack. – user287792 Mar 18 '10 at 17:16
  • @Nazgulled: I don't understand what you mean by "last on the linked list". You say the last vertex could link to the first one - how do you define last then? (A graph is a collection of vertices and edges. It doesn't have a "last vertex", even if you do arbitrarily pick a "first vertex".) – Cascabel Mar 18 '10 at 18:35
  • I guess you probably mean that you don't want the searched-for vertex to be the last one you examine. Anyway, my first comment and Konrad's answer fully address that question, I believe. – Cascabel Mar 18 '10 at 18:39

8 Answers

5

If you’ve got an adjacency list, searching for a vertex simply means traversing that list. You could perhaps even order the list to decrease the needed lookup operations.

A graph traversal (such as DFS or BFS) won’t improve this from a performance point of view.
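
A minimal sketch of the point above, in Python; the `Vertex` layout and names are hypothetical, not from the question's code:

```python
class Vertex:
    """A vertex in an adjacency-list graph (hypothetical minimal layout)."""
    def __init__(self, key):
        self.key = key
        self.neighbors = []  # adjacency list: references to other Vertex objects

def find_vertex(vertices, key):
    """Linear search over the vertex list: O(V), no matter what the edges look like."""
    for v in vertices:
        if v.key == key:
            return v
    return None
```

Keeping the vertex list sorted (or indexed) is what would change this cost, not how you walk the edges.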

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • Are you saying that using DFS/BFS or traversing the linked list will probably be the same as performance goes? – rfgamaral Mar 18 '10 at 17:10
  • @Nazgulled If you are searching for something, and you stop when you find it, then there may be a difference based on the input data. If you are searching the whole graph for all nodes with a property, then there probably won't be a significant difference. But without specifics of your data it is not something we can predict. – Pete Kirkham Mar 18 '10 at 17:18
2

Finding and deleting nodes in a graph is a "search" problem, not a graph problem, so to do better than O(n) (linear search, BFS, DFS) you need to store your nodes in a different data structure optimized for searching, or keep them sorted. Candidates are tree structures like B-trees, which give you O(log n) find and delete operations, or hash tables, which give expected O(1). If you want to code the stuff yourself, I would go for a hash table, which normally gives very good performance and is reasonably easy to implement.
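
A sketch of the idea, using Python's built-in `dict` to stand in for the hash table; the `Graph` layout is an assumption for illustration:

```python
class Graph:
    """Adjacency-list graph with a hash index for expected O(1) vertex lookup."""
    def __init__(self):
        self.adj = {}  # key -> list of neighbor keys; the dict doubles as the index

    def add_vertex(self, key):
        self.adj.setdefault(key, [])

    def add_edge(self, u, v):
        self.adj[u].append(v)
        self.adj[v].append(u)

    def find_vertex(self, key):
        return key in self.adj  # expected O(1) hash lookup, no traversal

    def remove_vertex(self, key):
        # Drop the vertex, then any edges pointing back at it.
        self.adj.pop(key, None)
        for nbrs in self.adj.values():
            while key in nbrs:
                nbrs.remove(key)
```

Note that `remove_vertex` is still O(V + E) here because the reverse edges must be cleaned up; only the lookup itself becomes constant time.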

Arno
  • 173
  • 1
  • 7
1

I think BFS would usually be faster on average. Read the wiki pages for DFS and BFS.

The reason I say BFS is faster is that it has the property of reaching nodes in order of their distance from your starting node. So if your graph has N nodes, you want to search for node N, and node 1 (the node you start your search from) is linked to N, then you will find it immediately. DFS might expand the whole graph before this happens, however. DFS will only be faster if you get lucky, while BFS will be faster if the nodes you search for are close to your starting node. In short, they both depend on the input, but I would choose BFS.

DFS is also harder to code without recursion, which makes BFS a bit faster in practice, since it is an iterative algorithm.
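
Iterative BFS is straightforward with a queue; a sketch that stops as soon as it reaches the target (the dict-of-neighbor-lists format is an assumption):

```python
from collections import deque

def bfs_find(adj, start, target):
    """Visit vertices in order of distance from start; stop at target.
    Returns the distance to target, or -1 if it is unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return -1
```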

If you can normalize your nodes (number them from 1 to 10 000 and access them by number), then you can easily keep Exists[i] = true if node i is in the graph and false otherwise, giving you O(1) lookup time. Otherwise, consider using a hash table if normalization is not possible or you don't want to do it.
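
The normalized-ID idea in miniature (the bound of 10000 is just the number from the question):

```python
MAX_VERTICES = 10_000

# exists[i] is True iff vertex i is currently in the graph: O(1) membership test.
exists = [False] * (MAX_VERTICES + 1)  # IDs 1..10000

def add_vertex(i):
    exists[i] = True

def remove_vertex(i):
    exists[i] = False
```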

IVlad
  • 43,099
  • 13
  • 111
  • 179
  • Actually, I have already implemented a hash table that will also hold the same data as the Graph (the data will be shared with pointers). However, one of the requirements of this uni project is that when we implement our data structures, we also implement the most common operations. I probably won't even search for a specific node in the Graph; I'll use the hash table instead. – rfgamaral Mar 18 '10 at 17:21
0

Depth-first search is best because

  1. It uses much less memory
  2. It is easier to implement
Peter Alexander
  • 53,344
  • 14
  • 119
  • 168
  • 1
    How does DFS use less memory than BFS? Recursion also uses memory, and the iterative version requires a stack anyway. If anything, DFS uses AT LEAST the same amount of memory if implemented recursively and THE SAME amount of memory if done iteratively. – IVlad Mar 18 '10 at 16:58
  • IVlad - let's look at a potentially contrived example - a graph with a vertex that has 1 million adjacent vertices. A DFS is going to contain at most 2 vertices on the stack at any given time, where as a BFS is going to contain a million on the queue at the start. – Niki Yoshiuchi Mar 18 '10 at 19:46
  • @Niki - and let's look at a linked list graph with 1 million nodes. The exact opposite is true. Generally, the amount of memory used by DFS and BFS is the same. If DFS is implemented recursively, the memory used by DFS might be more (though still O(V)), because we might have multiple parameters for the function plus the function call overhead. If the DFS is implemented iteratively, then the memory used is generally the same even in practice. So saying DFS (or BFS) "uses much less memory" is completely wrong. – IVlad Mar 18 '10 at 20:14
  • @IVlad - You're right, there are cases where BFS uses less memory than DFS. But claiming that "DFS uses AT LEAST the same amount of memory" is false as I've shown, and "It uses much less memory" is false as you've shown. I never made any claims that DFS always uses less memory. – Niki Yoshiuchi Mar 18 '10 at 21:34
0

The depth-first and breadth-first algorithms are almost identical, except for the use of a stack in one (DFS) and a queue in the other (BFS), plus a few required member variables. Implementing them both shouldn't take you much extra time.
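
To show how close the two are, here is one traversal routine parameterized only by which end of the frontier gets popped; the dict-of-neighbor-lists graph format is an assumption:

```python
from collections import deque

def traverse(adj, start, as_stack):
    """Generic traversal: pop from the right for DFS, from the left for BFS."""
    order = []
    seen = {start}
    frontier = deque([start])
    while frontier:
        node = frontier.pop() if as_stack else frontier.popleft()
        order.append(node)
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return order

def dfs(adj, start):
    return traverse(adj, start, as_stack=True)

def bfs(adj, start):
    return traverse(adj, start, as_stack=False)
```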

Additionally, if you have an adjacency list of the vertices, then your lookup will be O(V) anyway, so little to nothing will be gained by using one of the two other searches.

zellio
  • 31,308
  • 1
  • 42
  • 61
  • Well, I actually have a simple stack library implemented already, but not a queue one (but it shouldn't be that hard anyway, I just want to minimize the time wasted, I can bother with it later, if I still have time before the deadline). Anyway, are you saying (second paragraph) that using DFS/BFS or traversing the linked list will probably be the same? – rfgamaral Mar 18 '10 at 17:09
0

I'd comment on Konrad's post but I can't comment yet, so... I'd like to second that it doesn't make a difference in performance whether you implement DFS, BFS, or a simple linear search through your list. Your search for a particular node in the graph doesn't depend on the structure of the graph, hence it's not necessary to confine yourself to graph algorithms. In terms of coding time, the linear search is the best choice; if you want to brush up your skills in graph algorithms, implement DFS or BFS, whichever you feel like.

saramah
  • 168
  • 8
0

If you are searching for a specific vertex and terminating when you find it, I would recommend using A*, which is a best-first search.

The idea is that you calculate the distance from the source vertex to the current vertex you are processing, and then "guess" the distance from the current vertex to the target.

You start at the source, calculate the distance (0) plus the guess (whatever that might be) and add it to a priority queue where the priority is distance + guess. At each step, you remove the element with the smallest distance + guess, do the calculation for each vertex in its adjacency list and stick those in the priority queue. Stop when you find the target vertex.

If your heuristic (your "guess") is admissible, that is, if it's always an under-estimate, then you are guaranteed to find the shortest path to your target vertex the first time you visit it. If your heuristic is not admissible, then you will have to run the algorithm to completion to find the shortest path (although it sounds like you don't care about the shortest path, just any path).

It's not really any more difficult to implement than a breadth-first search (you just have to add the heuristic, really) but it will probably yield faster results. The only hard part is figuring out your heuristic. For vertices that represent geographical locations, a common heuristic is to use an "as-the-crow-flies" (direct distance) heuristic.
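
A sketch of the scheme described above, using Python's `heapq` as the priority queue; the unweighted edges (step cost 1) and the line-graph heuristic in the usage below are illustrative assumptions:

```python
import heapq

def a_star(adj, start, goal, heuristic):
    """A* search: pop by (distance so far) + (heuristic estimate to goal).
    Edges are unweighted here (cost 1); returns path length, or -1."""
    open_heap = [(heuristic(start), 0, start)]
    best = {start: 0}  # cheapest known distance to each vertex
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue  # stale queue entry, already found a cheaper route
        for nbr in adj.get(node, []):
            ng = g + 1
            if ng < best.get(nbr, float("inf")):
                best[nbr] = ng
                heapq.heappush(open_heap, (ng + heuristic(nbr), ng, nbr))
    return -1
```

For example, on a line graph 0-1-2-3 with the admissible heuristic `abs(3 - n)`, the search walks straight to the goal without detours.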

Niki Yoshiuchi
  • 16,883
  • 1
  • 35
  • 44
0

Linear search is faster than BFS and DFS. But faster still would be A* with the step cost set to zero, which effectively turns it into greedy best-first search. With a step cost of zero, every node's path cost is zero, so A* won't prioritize nodes with a shorter path; it will only expand the nodes the heuristic says are closest to a goal node. That's what you want, since you don't need the shortest path.

A* is faster than linear search because linear search will most likely need about n/2 comparisons on average (each node has an equal chance of being the goal node), whereas A* prioritizes nodes that have a higher chance of being the goal node.
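
A sketch of the zero-step-cost variant, i.e. greedy best-first search, where the priority is the heuristic alone; the graph format and heuristic are illustrative assumptions:

```python
import heapq

def greedy_best_first(adj, start, goal, heuristic):
    """Always expand the frontier node the heuristic says is closest to goal.
    Returns True if goal is reachable from start, else False."""
    seen = {start}
    heap = [(heuristic(start), start)]
    while heap:
        _, node = heapq.heappop(heap)
        if node == goal:
            return True
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                heapq.heappush(heap, (heuristic(nbr), nbr))
    return False
```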

Default picture
  • 710
  • 5
  • 12