7

I am implementing a Sinatra/Rails-based web portal that might eventually have a few many-to-many relationships between tables/models. This is a one-man, part-time project, but it is a real-world app.

I discussed my entities with someone and was advised to try Neo4j. Coming from the real, 'non-sexy' enterprise world, my inclination is to use a relational database until it stops scaling or becomes a nightmare because of sharding etc., and only then think about anything else.

HOWEVER,

  • I am using Postgres for the first time in this project, along with DataMapper, and it's taking me a while to get up to speed.
  • I am just trying out a few things and building more use cases, so I constantly have to update my schema (prototyping ideas and acting on feedback from the beta). I won't have to do this in Neo4j (apart from changing my queries).
  • It seems very easy to set up search using Neo4j. But Postgres can do full-text search as well.
  • Postgres recently announced support for JSON and JavaScript. I'm wondering if I should just stick with PG and invest more time in learning it (it has a good community) instead of Neo4j.

I'm looking for use cases where Neo4j is better, especially in the prototyping/initial phase of a project. I understand that if the website grows I might end up with multiple persistence technologies, like S3, relational (PG), Mongo etc.

It would also be good to know how it plays with the Rails/Ruby ecosystem.


Update1:

I got a lot of good answers, and it seems the right thing to do is stick with Postgres for now (especially since I deploy to Heroku).

However, the idea of being schema-less is tempting. Basically, I am thinking of an approach where you don't define a data model until you have, say, 100-150 users and have figured out a good schema (business use cases) for your product, while you are just demoing the concept and getting feedback from a limited number of signups. Then you can decide on a schema and start with a relational database.

It would be nice to know if there are easy-to-use schema-less persistence options (easy for a new user to set up and use) that might give up, say, scaling.

codeObserver
  • Scaling and sharding are not the primary reasons I would choose a graph database. Can you provide more information about your domain? Are you modeling something that is a network? Will you need to compute any network statistics or run any graph algorithms? The presence of several many-to-many tables may indicate a network, as you could consider these relationships to be edges. What do your edges represent? – Bobby Norton Jun 07 '13 at 18:33

3 Answers

9

Graph databases are worth considering if you have a really chaotic data model. They were designed to express highly complex relationships between entities. To do that, they store relationships at the data level, whereas an RDBMS takes a declarative approach. Storing relationships only makes sense if those relationships are very different from one another; otherwise you just end up duplicating data over and over and taking up a lot of space for nothing. To need that much variety in relationships, you would have to be handling a huge amount of data. This is where graph databases shine: instead of doing tons of joins, they just pick a record and follow its relationships. To support my statement: you'll notice that all the use cases on Neo4j's website deal with very complex data.

In brief, if none of the above applies to you, I think you should use another technology. If this is just about scaling, schemalessness or getting a project started quickly, then look at other NoSQL solutions (more specifically, either column-oriented or document-oriented databases). Otherwise you should stick with PostgreSQL. You could also, as you said, consider polyglot persistence.

About your update: you might consider hstore. I think it fits your requirements. It's a PostgreSQL module that also works on Heroku.
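To give a feel for the "fixed columns plus a free-form key/value bag" idea behind hstore, here is a minimal sketch. It is not hstore itself: hstore needs a running PostgreSQL, so this swaps in SQLite from Python's standard library with a JSON text column as a stand-in. The table name `users`, the column `attrs` and the helper functions are all made up for illustration.

```python
import json
import sqlite3

# Stand-in for hstore: an in-memory SQLite table with one free-form
# JSON text column. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL,"
    " attrs TEXT NOT NULL DEFAULT '{}')"
)


def add_user(email, **attrs):
    """Insert a user; any extra keyword arguments go into the key/value bag."""
    cur = conn.execute(
        "INSERT INTO users (email, attrs) VALUES (?, ?)",
        (email, json.dumps(attrs)),
    )
    return cur.lastrowid


def get_attrs(user_id):
    """Return the key/value bag of an existing user as a dict."""
    row = conn.execute(
        "SELECT attrs FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0])


def set_attr(user_id, key, value):
    """Add or overwrite one key without touching the table schema."""
    attrs = get_attrs(user_id)
    attrs[key] = value
    conn.execute(
        "UPDATE users SET attrs = ? WHERE id = ?",
        (json.dumps(attrs), user_id),
    )


alice = add_user("alice@example.com", plan="beta")
set_attr(alice, "referrer", "newsletter")  # new "field", no migration
```

The stable parts of the model (here `email`) stay as ordinary columns, and whatever is still in flux during prototyping lives in the bag until you know enough to promote it to a real column.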

LMeyer
  • Thanks for suggesting hstore. It looks good and potentially suitable for rapid prototyping and demo use cases. Even better, it is offered by Heroku, so my Rails apps can use it. Surprisingly, I don't see many GitHub examples or blog posts, given how simple it looks for rapid prototyping. For now I will stick with Postgres, but will definitely switch over once I find myself spending more time on schema design – codeObserver Jun 11 '13 at 04:51
  • Turns out there is an ActiveRecord hstore gem [but no DataMapper gem :( ]: gem 'activerecord-postgres-hstore' https://github.com/engageis/activerecord-postgres-hstore – codeObserver Jun 11 '13 at 05:13
  • It does not necessarily mean that there is an enormous amount of data. In our case, we use PostgreSQL to store user data and data sets, and Neo4j for complexity analysis of a population and for storing vast quantities of relationships. It really helps with that data lake. – Andrew Scott Evans May 16 '18 at 17:09
5

I don't think I agree that you should only use a graph database when your data model is very complex. I'm sure they could handle a simple data model/relationships as well.

If you have no prior experience with Neo4j or Postgres, then most likely both will take quite a bit of time to learn well.

Some things to keep in mind when picking:

  1. It's not just about development against a database technology. You should consider deployment as well. How easy is it to deploy and scale Postgres/Neo4j?

  2. Consider the community and tools around each technology. Is there a data mapper for Neo4j like there is for Postgres?

  3. Consider that the data models are considerably different between the two. If you can already think relationally, then I'd probably stick with Postgres. If you go with Neo4j you're going to be making a lot of mistakes for several months with your data models.

  4. Over time I've learned to keep it simple when I can. Postgres might be the boring choice compared to Neo4j, but boring doesn't keep you up at night. =)

Also I never see anyone mention it, but you should look at Riak (http://basho.com/riak/) too. It's a document database that also provides relationships (links) between objects. Not as mature as a graph database, but it can connect a few entities quickly.

ryan1234
  • ++ for recommending Riak -- love it! However, we had an engineer from Basho round recently to give a tech talk and he completely dismissed links -- they now discourage using them in favor of simply storing a (list of) keys in the document for the child objects and having the calling application go and get them. – Transact Charlie Jun 07 '13 at 21:01
  • Ah. Good to know. Yeah, I saw the links in the documentation and thought, "WOW! Finally a document database with some 'relations'". They said that since links rely on map/reduce, you should only use them in a shallow way - in other words, don't try to make a big graph. Bummed they discourage the practice - I thought it was a cool idea. – ryan1234 Jun 07 '13 at 21:26
5

The most appropriate choice depends on what problem you are trying to solve.

If you just have a few many-to-many tables, a relational database can be fine. In general, there is better O/R mapper support for relational databases, as they are much older and have a standardized interface and row-column structure. They have also been improved on for a long time, so they are stable and optimized for what they do.

A graph database is better if, for example, your problem is more about the connections between entities, especially if you need longer-distance connections, like "detect cycles (of unspecified length)" or "what do friends-of-a-friend like". Things like that get unwieldy when restricted to SQL joins. A problem-specific language like Cypher, in the case of Neo4j, makes that much more concise. On the downside, there are mappers between graph databases and objects, but not for every framework and language under the sun.
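To make the "longer-distance connections" point concrete, here is a rough sketch in plain Python, with no database involved. It treats the rows of a hypothetical many-to-many `friendships` join table as undirected edges and walks them breadth-first. The names and data are invented for illustration; this shows what a traversal does conceptually, not how Neo4j implements it.

```python
from collections import defaultdict, deque

# Rows of a hypothetical many-to-many "friendships" join table.
friendships = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "eve")]


def build_adjacency(rows):
    """Turn join-table rows into an undirected adjacency map."""
    adjacency = defaultdict(set)
    for a, b in rows:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def within_distance(adjacency, start, max_hops):
    """Breadth-first search: map each reachable person to their hop count."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distances[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in adjacency[person]:
            if friend not in distances:
                distances[friend] = distances[person] + 1
                queue.append(friend)
    del distances[start]  # the starting person is not their own friend
    return distances


adjacency = build_adjacency(friendships)
# People exactly two hops away from ann: her friends-of-friends.
friends_of_friends = {
    person
    for person, hops in within_distance(adjacency, "ann", 2).items()
    if hops == 2
}
```

With plain SQL joins, each extra hop is another self-join on the join table, and an unspecified depth needs something like a recursive query (`WITH RECURSIVE` in PostgreSQL). In a graph query language the hop count is just part of the path pattern, which is where the conciseness comes from.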

I recently implemented a system prototype using Neo4j, and it was very useful to be able to talk about the structure and connections of our data and to model that one-to-one in the data storage. Also, adding other connections between data points was easy, Neo4j being a schemaless store. We ended up switching to MongoDB because of trouble with write performance, but I don't think we could have finished the prototype in the same time with it.

Other NoSQL datastores (document-based, column, key-value) also cover specific use cases. Polyglot persistence is definitely something to look at, so keep your choice of backend reasonably separated from your business logic, so that you can change your technology later if you learn something new.
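One way to keep the backend separated from the business logic could look like the sketch below: the application code talks to a small storage interface, and each backend gets its own implementation. `UserStore`, `InMemoryUserStore` and `register` are hypothetical names for illustration; a Postgres-, Neo4j- or MongoDB-backed class would implement the same two methods.

```python
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """The only storage surface the business logic is allowed to see."""

    def save(self, user_id: str, profile: dict) -> None: ...

    def find(self, user_id: str) -> Optional[dict]: ...


class InMemoryUserStore:
    """Throwaway backend for prototyping and tests."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def save(self, user_id: str, profile: dict) -> None:
        self._rows[user_id] = dict(profile)  # store a copy

    def find(self, user_id: str) -> Optional[dict]:
        row = self._rows.get(user_id)
        return dict(row) if row is not None else None


def register(store: UserStore, user_id: str, email: str) -> dict:
    """Business rule: a user id can only be registered once."""
    if store.find(user_id) is not None:
        raise ValueError(f"user {user_id!r} already exists")
    profile = {"email": email, "status": "beta"}
    store.save(user_id, profile)
    return profile
```

Because `register` only depends on `save` and `find`, swapping the prototype store for a real database later means writing one new class rather than touching the business rules.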

Thomas Fenzl
  • First, in my opinion, this is the best answer. I'd like to know more about why you switched from Neo4j to MongoDB. Did you have any regrets about the switch later, or are you still satisfied with it? Thanks – Farah Nov 20 '13 at 00:37