89

I'm currently following Steve Yegge's advice on preparing for a technical programming interview: http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html

In his section on Graphs, he states:

There are three basic ways to represent a graph in memory (objects and pointers, matrix, and adjacency list), and you should familiarize yourself with each representation and its pros and cons.

The pros and cons of matrix and adjacency list representations are described in CLRS, but I haven't been able to find a resource that compares these to an object representation.

Just by thinking about it, I can infer some of this myself, but I'd like to make sure I haven't missed something important. If someone could describe this comprehensively, or point me to a resource which does so, I would greatly appreciate it.

jbeard4
  • how about [inductive graphs](https://web.engr.oregonstate.edu/~erwig/papers/InductiveGraphs_JFP01.pdf) — which of the 3 categories do these fall under? – Erik Kaplun Aug 21 '16 at 10:26

4 Answers

100

objects and pointers

These are just basic data structures, as hammar said in the other answer. In Java you would represent them with classes such as Edge and Vertex. An edge connects two vertices, can be either directed or undirected, and can carry a weight. A vertex can have an ID, a name, and so on. Usually both of them carry additional properties. You can then construct your graph like this:

Vertex a = new Vertex(1);
Vertex b = new Vertex(2);
Edge edge = new Edge(a, b, 30); // create an edge between a and b with weight 30

This approach is commonly used for object oriented implementations, since it is more readable and convenient for object oriented users ;).
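The answer uses Vertex and Edge without defining them. A minimal sketch of what such classes might look like follows; the field names, the `other` helper, and the choice to register each edge with both endpoints (that is, an undirected graph) are assumptions made for illustration, not the author's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes: the answer names Vertex and Edge but does not define them.
class Vertex {
    final int id;
    final List<Edge> edges = new ArrayList<>(); // edges incident to this vertex

    Vertex(int id) {
        this.id = id;
    }
}

class Edge {
    final Vertex from;
    final Vertex to;
    final int weight;

    Edge(Vertex from, Vertex to, int weight) {
        this.from = from;
        this.to = to;
        this.weight = weight;
        // Assumption: the graph is undirected, so both endpoints know the edge.
        from.edges.add(this);
        to.edges.add(this);
    }

    // Returns the endpoint opposite to v.
    Vertex other(Vertex v) {
        return v == from ? to : from;
    }
}

class ObjectGraphDemo {
    public static void main(String[] args) {
        Vertex a = new Vertex(1);
        Vertex b = new Vertex(2);
        new Edge(a, b, 30);
        // Walk the neighbours of a by following object references.
        for (Edge e : a.edges) {
            System.out.println(a.id + " -- " + e.other(a).id + " (weight " + e.weight + ")");
        }
    }
}
```

Traversal here means following references from a vertex to its edges and on to the opposite vertex, which is why this representation behaves much like an adjacency list in practice.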

matrix

A matrix is just a simple two-dimensional array. Assuming your vertex IDs can be used as int array indices, it looks like this:

int[][] adjacencyMatrix = new int[SIZE][SIZE]; // SIZE is the number of vertices in our graph
adjacencyMatrix[0][1] = 30; // sets the weight of the edge from vertex 0 to vertex 1

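This is commonly used for dense graphs where fast index access is necessary: checking or updating an edge is O(1), at the cost of O(SIZE^2) memory regardless of how many edges exist. You can represent directed or undirected graphs, weighted or unweighted, with this.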
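To make the trade-off concrete, here is a small sketch wrapping the matrix in a class. The class name `MatrixGraph`, its method names, and the convention that a weight of 0 means "no edge" are assumptions for this example only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around an adjacency matrix for a directed, weighted graph.
class MatrixGraph {
    // weights[from][to] holds the edge weight; 0 means "no edge" (an assumption).
    private final int[][] weights;

    MatrixGraph(int size) {
        weights = new int[size][size];
    }

    void addEdge(int from, int to, int weight) {
        weights[from][to] = weight;
    }

    // O(1): a single array access.
    boolean hasEdge(int from, int to) {
        return weights[from][to] != 0;
    }

    // O(V): the whole row must be scanned, even if v has few neighbours.
    List<Integer> neighbours(int v) {
        List<Integer> result = new ArrayList<>();
        for (int u = 0; u < weights.length; u++) {
            if (weights[v][u] != 0) {
                result.add(u);
            }
        }
        return result;
    }
}
```

For an undirected graph one would also set `weights[to][from]` in `addEdge`, which keeps the matrix symmetric.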

adjacency list

This is just a simple mix of data structures. I usually implement it using a HashMap<Vertex, List<Vertex>>. Guava's HashMultimap can be used similarly.

This approach is cool because you have O(1) (amortized) vertex lookup, and the lookup returns a list of all vertices adjacent to the vertex you asked for.

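Map<Vertex, List<Vertex>> map = new HashMap<>(); // Vertex needs equals() and hashCode() for lookups by ID to work
ArrayList<Vertex> list = new ArrayList<>();
list.add(new Vertex(2));
list.add(new Vertex(3));
map.put(new Vertex(1), list); // vertex 1 is adjacent to 2 and 3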

This is used for representing sparse graphs. If you are applying at Google, you should know that the web graph is sparse. You can deal with such graphs in a more scalable way using a BigTable.
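A self-contained sketch of the same idea follows. To avoid depending on an undefined Vertex class it keys the map by Integer vertex IDs; the class name `AdjacencyListGraph` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adjacency-list graph for directed edges, keyed by integer vertex ID.
class AdjacencyListGraph {
    private final Map<Integer, List<Integer>> adjacent = new HashMap<>();

    void addEdge(int from, int to) {
        adjacent.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        // Make sure the target is a known vertex even if it has no outgoing edges.
        adjacent.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Expected O(1) to find the list; iterating it costs O(degree of v).
    List<Integer> neighbours(int v) {
        return adjacent.getOrDefault(v, Collections.emptyList());
    }

    int vertexCount() {
        return adjacent.size();
    }
}
```

Memory use grows with the number of vertices plus the number of edges, which is what makes this layout attractive for sparse graphs.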

Oh and BTW, here is a very good summary of this post with fancy pictures ;)

Cosmo Harrigan
Thomas Jungblut
  • _This approach is cool, because you have O(1) vertex lookup_: this complexity is slightly wrong. In particular it is O(1+alpha), where alpha = number of vertices / number of slots in the hash map (the load factor). Therefore I propose using an array instead of a hash map – Timofey Mar 03 '13 at 17:24
  • @Tim it is O(1) amortized. Your complexity calculation is strongly implementation dependent. See the javadoc of `HashMap` (http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html), it says: `This implementation provides constant-time performance for the basic operations` = O(1) amortized. – Thomas Jungblut Mar 03 '13 at 17:53
  • O(1) and O(1) amortized are different complexities, aren't they? Of course we are talking here about hash map implementation as array of lists (not list of lists, for example) and O(1+alpha) is correct complexity of the GET operation – Timofey Mar 03 '13 at 17:59
  • @Tim I think everybody here knows that array access is faster than any `HashTable` usage. So there is no need to nitpick around a small constant alpha overhead that can be neglected. – Thomas Jungblut Mar 03 '13 at 18:00
  • Please don't get me wrong, I don't mean to criticize your nice answer, but I have a feeling that it could be improved, so why not mention it here :) – Timofey Mar 03 '13 at 18:05
  • @Tim I added the amortized note into the answer. Thanks. – Thomas Jungblut Mar 03 '13 at 18:09
  • Can't you basically think of every complex object as a sort of directed graph? If you want to, say, serialize something, you're basically using graph traversal algorithms, right? – Casey Apr 06 '15 at 17:11
  • The link to DZone in the last sentence has some very "interesting" time complexities for graph operations on an adjacency list. I would probably add a note that these time complexities completely depend on how the adjacency list is implemented and shouldn't be taken at face value. – Chris Leung Feb 24 '18 at 06:18
  • @ThomasJungblut, I find that your description and hammar's description of 'Objects and Pointers' are quite different. You are creating a new `Edge` object, whereas hammar keeps pointers to neighbors and gives them explicit names. But you are saying that _"These are just basic datastructures like hammar said in the other answer"_. Would you make this a bit more clear? – Md. Abu Nafee Ibna Zahid Mar 03 '18 at 15:22
  • The only pertinent information in this response is "it is more readable and convenient for object oriented users," and this is unfortunately, also, not fleshed out with any rationale or examples. – ijoseph May 09 '23 at 01:25
7

Objects and pointers is mostly the same as adjacency list, at least for the purpose of comparing algorithms that use these representations.

Compare

#include <vector>

struct Node {
    std::vector<Node *> neighbours; // any number of adjacent nodes
};

with

struct Node {
    Node *left;
    Node *right;
};

You can easily construct the list of neighbours on-the-fly in the latter case, if it is easier to work with than named pointers.
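The "on-the-fly" construction can be sketched as follows. This is a Java rendering of the second struct rather than the answer's own C++ code, and the class name `TreeNode` and the method `neighbours` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java counterpart of the struct with named left/right pointers.
class TreeNode {
    TreeNode left;
    TreeNode right;

    // Builds the neighbour list on demand from the named references,
    // skipping any that are not set.
    List<TreeNode> neighbours() {
        List<TreeNode> result = new ArrayList<>(2);
        if (left != null) {
            result.add(left);
        }
        if (right != null) {
            result.add(right);
        }
        return result;
    }
}
```

With such a method in place, a generic traversal written against a list of neighbours works unchanged on nodes that store named pointers.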

hammar
4

An advantage of the object representation (incidence list) is that two adjacent vertices share the same instance of the edge. This makes it easy to manipulate undirected edge data (length, cost, flow or even direction). However, it uses extra memory for pointers.
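The sharing described here might look like the sketch below, in which one edge object is stored under both of its endpoints, so a change made through one endpoint is visible from the other. The names `SharedEdge`, `IncidenceListGraph` and `connect` are assumptions, since the answer gives no implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical edge object holding mutable data once per edge.
class SharedEdge {
    final int u;
    final int v;
    int cost;

    SharedEdge(int u, int v, int cost) {
        this.u = u;
        this.v = v;
        this.cost = cost;
    }
}

// Hypothetical incidence list: incident.get(x) holds every edge touching vertex x.
class IncidenceListGraph {
    final List<List<SharedEdge>> incident = new ArrayList<>();

    IncidenceListGraph(int vertexCount) {
        for (int i = 0; i < vertexCount; i++) {
            incident.add(new ArrayList<>());
        }
    }

    SharedEdge connect(int u, int v, int cost) {
        SharedEdge e = new SharedEdge(u, v, cost);
        incident.get(u).add(e); // the same instance is stored
        incident.get(v).add(e); // under both endpoints
        return e;
    }
}
```

Updating `cost` on the returned edge changes what both vertices see, whereas an adjacency matrix for an undirected graph would need two cells kept in sync. The extra memory mentioned above is the two list entries plus the edge object itself.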

Michal Čizmazia
  • Why is there a link to the Adjacency list representation named "incidence list"? It is probably better to use this one: http://www.algorithmist.com/index.php/Graph_data_structures#Incidence_List – Timofey Mar 03 '13 at 18:07
3

Another good resource: Khan Academy - "Representing Graphs"

Besides adjacency list and adjacency matrix, they list "edge lists" as a 3rd type of graph representation. An edge list could be interpreted as a list of "edge objects" like those in Thomas's "objects and pointers" answer.

Advantage: We can store more information about the edge (mentioned by Michal)

Disadvantage: It's a very slow data structure to work with:

  • Look up an edge: O(log e) if the list is kept sorted, O(e) otherwise
  • Remove an edge: O(e)
  • Find all nodes adjacent to a given node: O(e)
  • Determine whether there exists a path between two nodes: O(e^2)

e = number of edges
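A rough sketch of an edge list shows where these costs come from. The class `EdgeListGraph` and its methods are hypothetical, and this version keeps the list unsorted, so its edge lookup is a linear scan; the O(log e) figure would require keeping the list sorted and using binary search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical edge list for a directed graph: one flat list of {from, to} pairs.
class EdgeListGraph {
    private final List<int[]> edges = new ArrayList<>();

    void addEdge(int from, int to) {
        edges.add(new int[] {from, to});
    }

    // O(e): every edge may have to be inspected (the list is unsorted here).
    boolean hasEdge(int from, int to) {
        for (int[] e : edges) {
            if (e[0] == from && e[1] == to) {
                return true;
            }
        }
        return false;
    }

    // O(e): the edge must be found and the list compacted.
    boolean removeEdge(int from, int to) {
        return edges.removeIf(e -> e[0] == from && e[1] == to);
    }

    // O(e): there is no per-vertex index, so the whole list is scanned.
    List<Integer> neighbours(int v) {
        List<Integer> result = new ArrayList<>();
        for (int[] e : edges) {
            if (e[0] == v) {
                result.add(e[1]);
            }
        }
        return result;
    }
}
```

Because finding the neighbours of a single vertex already costs a full scan, any traversal that asks for neighbours repeatedly pays that price at every step.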

Chris Leung