3

I am working on a project that involves graphs extracted from another source. Currently we are using Python's networkx module to analyse the graphs.

I now have to choose a format for storing the graphs. Pickle seems to be a good choice for a purely Python-based solution. However, we are still in the prototyping stage, and there is a significant chance that we will have to switch to C++ for performance and scalability reasons.

Therefore I'd like to store my graphs in a format supported by most graph libraries, to minimise the hassle for future contributors to the project.

Could you please suggest which format I should use?

espeed
nofrills

2 Answers

4

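TGF (Trivial Graph Format) is your solution.

Python example that converts dependency lines of the form "<item>: <dependencies>" into TGF: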

#!/usr/bin/env python3

import fileinput
import re

depends = {}
for line in fileinput.input():
    m = re.match(r'(.+):\s*(.*)', line)  # find every dependency line of the form "<item>: <dependencies>"
    if m:
        item = m.group(1)
        dependency_list = m.group(2)
        print(item, item)  # node definition: id followed by label

        if dependency_list:  # there are dependencies
            depends[item] = dependency_list.split()  # store the list in a dictionary for later

print("#")  # end of node list, start of edge list

for item in depends:
    for dependency in depends[item]:
        print(item, dependency)  # edge definition
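
Reading such a file back is about as simple. The following is only a minimal sketch, assuming the usual TGF layout (node lines of the form `id label`, a single `#` separator line, then edge lines of the form `source target [label]`). The function name `parse_tgf` and the `SAMPLE` text are made up for illustration and are not part of any library.

```python
def parse_tgf(text):
    """Parse TGF text into (nodes, edges).

    nodes maps node id -> label (empty string if no label is given);
    edges is a list of (source, target, label) tuples.
    """
    nodes = {}
    edges = []
    in_edges = False  # becomes True once the "#" separator has been seen

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line == "#":
            in_edges = True  # end of node list, start of edge list
            continue

        if in_edges:
            # "source target [label with spaces]"
            parts = line.split(None, 2)
            if len(parts) < 2:
                raise ValueError("malformed edge line: %r" % line)
            label = parts[2] if len(parts) > 2 else ""
            edges.append((parts[0], parts[1], label))
        else:
            # "id [label with spaces]"
            parts = line.split(None, 1)
            label = parts[1] if len(parts) > 1 else ""
            nodes[parts[0]] = label

    return nodes, edges


# Hypothetical sample input in the shape the script above prints.
SAMPLE = "a a\nb b\nc c\n#\na b\na c\n"

if __name__ == "__main__":
    nodes, edges = parse_tgf(SAMPLE)
    print(nodes)
    print(edges)
```

The resulting node ids and (source, target) pairs are plain strings and tuples, so they can be passed on to whichever graph library is in use, in Python or, with an equally small parser, in C++.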
Dmitry Zagorulkin
0

I'm not sure this is very relevant here, but wouldn't a graph database do the job?

You have a couple of options, such as Neo4j or AllegroGraph. You will easily find bindings for Python or any other language, and most of these solutions also provide a REST API anyway.

Note that the first link I provided is not very up to date: there are now many more solutions, and Python APIs are available even where that page does not list them. You could also have a look here (section Graph Databases).

Edit: it could also be interesting to have a look at this; it seems to be a suitable format for handling and storing graphs, either in a JSON style or as delimited text:

Also, you might want to take a look here:

cedbeu
  • But it certainly isn't a very minimalistic solution :o) – cedbeu Jun 22 '12 at 10:35
  • Yes, Neo4j surely is a very relevant option here. Unfortunately, I have not found any built-in function in networkx to handle these databases, but I am looking into it. – nofrills Jun 22 '12 at 12:53
  • @user506877, hmm, I don't know much about networkx, but at first glance it doesn't seem to support Neo4j. You could probably interface them quite easily with something like bulbflow ... I edited my answer with a few more details. – cedbeu Jun 22 '12 at 13:32