What is the performance of a Postgres recursive query with a large depth on millions of rows? Should I use a graph database instead?

Question

I am trying to decide whether I should use a graph database or a relational one. I am fairly new to SQL so I am not sure how to test this as I don't have the millions of rows of data yet but I figured this could become a problem down the line so I am looking for some anecdotal experiences from people if possible.

Does anyone know the performance of a recursive query for a Postgres database with millions of rows in each table? Is the performance usually good enough to be able to query data in the database through a web api in sub 100-200ms with potentially a large depth of, lets say 30? The depth is unknown, it could be 1 or all the way to something large like 30 or even more.

My data currently maps very well to the relational data model as it is relational in nature, hence why I am leaning towards Postgres but I would like to execute a specific recursive query that is making me lean towards a graph database. The tables (lets say 5-6) will most likely be millions of rows each in production and the depth of the query is essentially unknown. An example of such a query would be:

Depth 1: 

EntityA transacts with -> EntityB

Depth 2:

EntityB transacts with -> EntityC
EntityB transacts with -> EntityD

...

Depth n:

until the last entities (leaves) don't transact with any other entities. At each depth, the entities can transact with many entities and the depth is unknown.

Should I use a graph database instead just because of this 1 specific query or can Postgres handle this in a reasonable time using its RECURSIVE command?

I have recursive queries on a ~30million rows table where I retrieve trees with ~2000 elements and depth of about 15 that run in well below a second - but this all depends on the the initial condition (for the non-recursive part) and the indexes you have in place. — , Feb 02 '22 at 21:02
I have recursive queries on a database of ~10M tweets; I use them to combine related tweets within a *thread* (fetching the `referring_to, etc) They run within seconds on a Raspberry Pi. — wildplasser, Feb 02 '22 at 21:20

score 0 · Answer 1 · answered Feb 04 '22 at 18:19

Whenever you're dealing with recursive JOINs in a SQL system that's a dead giveaway that you're actually dealing with graphs and trying to figure out how to navigate a graph with SQL.

Using a graph database is a better option in this case.

For a much more complete answer -- see this other answer on a related question. Performance of arbitrary queries with Neo4j

What is the performance of a Postgres recursive query with a large depth on millions of rows? Should I use a graph database instead?

1 Answers1