4

I'm trying to see if there is a way to transform a flat list into a hierarchical tree in rethinkdb.

Given this table:

nodes
------
-id
-name
-parent

I can query all with r.db('app').table('nodes') and get a flat list:

[
  {name: "one", id: "1"}
  {name: "two", id: "2", parent: "1"}
  {name: "three", id: "3", parent: "2"}
]

But I'd really like a query that returns the data in hierarchal structure:

[
  {
    name: "one", 
    id: "1",
    children: [
      {
        name: "two", 
        id: "2", 
        children: [
          {name: "three", id: "3"}
        ]
      }
    ]
  }
]

Is this possible in rethinkdb? Postgres has WITH RECURSIVE queries for this. Currently I'm doing the transformation in the application layer but it's getting complex -- for example to fetch a single node I also have to fetch ALL nodes, recursively add its descendants, and return just the requested node. Anyway, would love to find a way to do this if possible in rethinkdb. Thanks!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
geddski
  • 738
  • 7
  • 13
  • There are ways of doing this in RethinkDB but they're very convoluted, aren't thread safe, and wouldn't be the sort of thing you'd ever want to use in production. We could add explicit support for this if there were a compelling use case. Could you describe why you're looking for such a feature? (Feel free to e-mail me -- slava@rethinkdb.com) – coffeemug May 02 '14 at 07:25
  • 1
    Thanks for the response Slava. This recursive query functionality would facilitate any hierarchical data use cases, where you'd normally have to either produce hairy application logic or turn to a graph database. In my case I'm building a mind mapping app, where each node can have n children nodes. I absolutely love this powerful feature of postgres but I'd much prefer the flexible schema of rethinkdb, as nodes can have varying fields (description, picture, checklists, tags, etc). Thoughts? More on recursive queries: http://practiceovertheory.com/blog/2013/07/12/recursive-query-is-recursive/ – geddski May 02 '14 at 16:57
  • 1
    Ahh, I see. Perhaps a graph database would be a better choice? I'll look into introducing convenient recursive functionality, but that might take a while. – coffeemug May 02 '14 at 17:04
  • awesome, keep me posted! – geddski May 02 '14 at 17:21
  • I also want this feature. Currently neo4j is my db of choice but that won't scale for huge trees – Sridhar Sarnobat Oct 24 '14 at 19:22

3 Answers3

1

There's no easy way to do this in RethinkDB, unfortunately. How attached are you to that schema? (If the answer is "not very", what queries do you need to be fast on this table?)

mlucy
  • 5,249
  • 1
  • 17
  • 21
  • 1
    I'm definitely open to other schema options. Maybe having a "children" array of ids in each node would make this a bit easier. The queries that need to be fast are: - getting a single node with all of its descendants attached - updating a single node Could do this as one giant document with nested documents but that doesn't seem ideal. – geddski May 02 '14 at 15:57
1

I have recently faced the same issue. Also wanted to introduce children attribute for each node. But this is is not good, since each new node creation/deletion will cause 2 db write operations.

So, the solution I have come up with is the following:

  1. I use group/ungroup aggregation methods of RethinkDB api
  2. Then process grouped nodes to output the final tree.

For example, taking your input data it would look like this on Node/Express backend:

r.db('app').table('nodes').group('parent').ungroup().run(dbConnection)
.then( groupedByParent => {
  // make a temp hashmap as grouped data from RethinkDB comes as Array 
  // of groups.
  const parentsMap = groupedByParent.reduce( (result, groupData) => {
    // each generated group has `group` prop which is the parent id in
    // our case and `reduction` prop which is the Array of all nodes
    // with this parent id
    const { group, reduction } = groupData

    // read prerequisites at the end of this post to better understand 
    // below contruction
    const parentId = group === null ? 'roots' : group

    result[parentId] = reduction
    return result
  }, {})

  // construct a final tree. parentMap will make it easier to get all 
  // children of particular node
  const tree = parentsMap.roots.map(function mapper(node) {
    // do nothing if this node doesn't have children
    const children = parentsMap[node.id]
    if (typeof children === 'undefined') return node;
    // else recursively iterate over children to grab all sub-children
    node.children = children.map(mapper)
    return node
  });
})

PREREQUISITES: To have it work all nodes must have parent attribute (it must not be missing), so if the node doesn't have a parent its parent attribute will equal null

NOTE: I use ungroup and prepare final tree only for convenience here - to easily manipulated grouped data with standard JavaScript means and not with RethinkDB's special control structures. I guess it would be possible to construct complete tree with only RethinkDB's instructions (e.g. somehow with fold method)

Also grouping will be faster if you create a parent index on the nodes table.

0

If you want a finite set of nested children, you can use a join/subquery on the table itself

First create an index on parent

r.db('app').table('nodes').indexCreate("parent")

Then if you want just one level of children, you can do

r.db('app').table('nodes').merge(function(node) {
    return {
        r.db('app').table('nodes').getAll(node("id"), {index: "parent"}).coerceTo("ARRAY")
    }
})

If you need an arbitrary number of levels, that won't be possible, would it be because things are going to break if you have a circular reference.

neumino
  • 4,342
  • 1
  • 18
  • 17
  • Thanks for that, that is certainly a good way to get the first set of children. Now if there were just a way to do that recursively like in postgres... – geddski May 02 '14 at 16:35