I am using MySQL to store reports from a tool. I am extremely happy with the speed and flexibility with which users can query data. The tool also has some data which is a graph. My question is, is it a good idea to store the graph in MySQL? The number of nodes and edges in the graph is in the millions and queries are usually graph traversals.
-
Generally I would say no. But it depends on your queries. Mysql obviously has no implementation of searching routes etc. which you may or many not need when working with graphs. If you just need to searh in a limited depth like 5-10 I guess mysql is fine. But again, it really depends on the type of *queries* and *dynamic changes* you want to make with your graph. – vbence Mar 06 '11 at 09:12
-
Can your graph be represented as a hierarchy? – orangepips Mar 07 '11 at 16:11
-
Not sure what do you mean of graph. If just some data, and convert to graph when quering, store in mysql. If you store some graph, say gif files, that depends not only your taste, but alos the programming language. Delphi? Java? Perl? php? Desktop app.? Web app.? Not easy to give you suggestion unless you provide more information. – Weiyan Mar 06 '11 at 09:12
-
Wrong sort of [graph](http://en.wikipedia.org/wiki/Graph_(mathematics)). – Orbling Mar 06 '11 at 09:18
-
The OP is talking about a graph data structure. http://en.wikipedia.org/wiki/Graph_(data_structure) Graph oriented DBMS products organize data into records and pointers to records. This data structure is very familiar to programmers. – Walter Mitty Mar 08 '11 at 12:32
3 Answers
MySQL is not created and optimized as a graph database in particular. You might want to try Neo4J which is a good graph database.

- 2,241
- 3
- 22
- 35
Plain SQL is usually unfit for manipulating a graph datastructure. There are techniques to index it, however.
For instance, if yours is not frequently updated, using a GRIPP index will let you handle graph traversal queries extremely well. The latter lets you answer parent-child and depth-related queries in more or less fixed time -- irrespective of the graph's number of nodes or density of links.

- 75,850
- 13
- 131
- 154
SQL databases don't handle graph data very well in general. The problem is that to do a graph traversal you either have to pull the entire graph into memory in a single query, then manipulate it and store the changes, or you have to perform huge amounts of joins to traverse the graph one node at a time, which becomes prohibitively slow. With graphs of the scale you are looking at it would probably be better to use a graph database or to use a memory database like REDIS as a fast caching layer and then persist it in the background.

- 599
- 4
- 12