How to traverse nodes efficiently in postgresql?

Question

I'm trying to traverse nodes from the specific one with the recursive clause in PostgreSQL.(Btw I'm new to Postgresql) Here's a simple version of my db tables:

 table: nodes
|--------------|------------------|
|      id      |     node_name    |
|--------------|------------------|
|      1       |         A        |
|--------------|------------------|
|      2       |         B        |
|--------------|------------------|
|      3       |         C        |
|--------------|------------------|
|      4       |         D        |
|--------------|------------------|
|      5       |         E        |


table: links
|--------------|---------------------|-----------------|
|      id      |        id_from      |      id_to      |
|--------------|---------------------|-----------------|
|      1       |           1         |         2       |
|--------------|---------------------|-----------------|
|      2       |           1         |         3       |
|--------------|---------------------|-----------------|
|      3       |           2         |         4       |
|--------------|---------------------|-----------------|
|      4       |           3         |         4       |
|--------------|---------------------|-----------------|
|      5       |           4         |         5       |

So, it's just this simple direct graph. (all edges go left to right)

    B
  /   \
 A     D - E
  \   /
    C

In this situation, what is the efficient way to get all vertices that can be visited starting from A?

What I tried:

dfs solution found at Simple Graph Search Algorithm in SQL (PostgreSQL)

with recursive graph(node1, node2, path) as
(
    select id_from, id_to, ARRAY[id_from] from links
        where id_from = 1
    union all
    select nxt.id_from, nxt.id_to,array_append(prv.path, nxt.id_from)
    from links nxt, graph prv
    where nxt.id_from = prv.node2
    and nxt.id_from != ALL(prv.path)
)
select * from graph

It gave me almost all paths. But it visited D vertex twice. (Perform D -> E logic twice) I want to ignore the visited vertex for efficiency.

So, how can I achieve this? Thanks in advance!

S-Man · Accepted Answer · 2020-11-23T13:54:56.383

It is not that simple. Within a single query, all recursive paths are completely independent. So, each path does not know about what's going on on the sibling path. It does not know that a certain node was already visited by a sibling.

Because SQL queries don't support some kind of global variables, it is impossible to share such information between the recursion paths that way.

I'd recommend to write a function where you can use plsql syntax which solves the problem in a more "common programmatical" way.

How to traverse nodes efficiently in postgresql?

1 Answers1