0

I have the following scheme built using PostgreSQL -

Table A:
    id | name
    1  | a
    2  | b 
Table B:
    id | parent-A-id  | searchValue | dataB
    1  | 1            | val1        | a
    2  | 2            | val2        | b
    3  | 2            | val2        | c
Table C:
    id | parent-A-id  | dataC
    1  | 1            | x
    2  | 2            | y
    3  | 2            | z
Table D:
    id | parent-A-id  | parent-B-id | searchValue-REF 
    1  | 1            | 1           | val1
    2  | 2            | 2           | val2
    2  | 3            | 2           | val2
    2  | 3            | 3           | val2

I am trying to retrieve the following data -

Given N, find n amount of B, its parent A and a subset of children C, using only the items D that comply to a function running on searchByVal-REF, meaning a computed result is calculated (e.g - location), against N value.

An example of an output of data based on N would be -

[
    {
        A.name : "a",
        A.id : 1
        B.id : 1,
        B.dataB : "a",
        [{C.id : 1, C.dataC : "x"}],
        computedVal : someVal
    },
    {
        A.name : "b",
        A.id : 2
        B.id : 2,
        B.dataB : "b",
        [{C.id : 2, C.dataC : "y"}],
        computedVal : someVal
    },
     {
        A.name : "b",
        A.id : 2
        B.id : 3,
        B.dataB : "c",
        [{C.id : 2, C.dataC : "y"}, {C.id : 3, C.dataC : "z"}],
        computedVal : someVal
    }
]

The data is basically rows of Bs & parent A of each B & n children C of each B in a serialized array of rows represented in a single column per row B.

I am not sure if I'm over-complicating it, but at the moment I'm trying to group a set of records C into a single column by turning an array of results into a serialized json. But I've yet to manage to get it work as I'm trying to reference a subquery on an outer query which fails as I think that the query is too nested.

This is where I'm stuck at -

SELECT
  b_items_found.*,
  row_to_json(C_json_arr) as C_list
FROM
  (
    SELECT
      *
    FROM
      (
        SELECT
          array_agg("parent-C-id") as selected_Cs,
          "parent-B-id",
          computedVal
        FROM
          (
            SELECT
              ComputeA(searchValue-REF) AS computedVal,
              "parent-B-id",
              "parent-C-id"
            FROM
              D
              CROSS JOIN (
                SELECT
                  ComputeC(N)
              ) AS r
            WHERE
              ComputeB(searchValue-REF)
            ORDER BY
              ComputeA(searchValue-REF)
          ) select-b-items
        GROUP BY
          "parent-B-id",
          computedVal
      ) b_and_c
      JOIN B ON b_and_c."parent-B-id" = B.id
      JOIN A ON B."parent-a-id" = A.id
  ) b_items_found,
  (
    SELECT
      array_agg(row_to_json(t)) as m
    from
      (
        SELECT
          *
        from
          C
        WHERE C.id = ANY(b_items_found.selected_Cs)
      ) t
  ) C_json_arr

This currently results in -

There is an entry for table "b-items-found", but it cannot be referenced from this part of the query.

I am wondering if there's a single query solution to this or if it should be separated into several queries (querying & processing server side).

Running PostgreSQL 9.6.8

Yehonatan
  • 3,168
  • 7
  • 31
  • 39
  • is this a CONNECT BY issue? – Randy May 19 '19 at 14:23
  • @GordonLinoff The data is basically rows of Bs & parent A of each B & n children C of each B in a serialized array of rows represented in a single column per row B. It's like I've described in the question – Yehonatan May 19 '19 at 14:27
  • @Randy Could be, I'm not sure, I've seen that it's an Oracle feature (?), I'm currently using PostgreSQL – Yehonatan May 19 '19 at 14:29
  • @GordonLinoff I've updated the question's "result query" to reflect the question better – Yehonatan May 20 '19 at 06:39

1 Answers1

1

Although the query really looks overly complicated, your main issue is that you are trying the access the temporary table b_items_found in a query where the optimizer may decide to execute the following query first which is why you cannot reference it (no sibling-references, only children).

To do that, you'd have to use a CTE (Common Table Expression), like so:

WITH b_items_found AS (
    SELECT
      *
    FROM
      (
        SELECT
          array_agg("parent-C-id") as selected_Cs,
          "parent-B-id",
          computedVal
        FROM
          (
            SELECT
              ComputeA(searchValue-REF) AS computedVal,
              "parent-B-id",
              "parent-C-id"
            FROM
              D
              CROSS JOIN (
                SELECT
                  ComputeC(N)
              ) AS r
            WHERE
              ComputeB(searchValue-REF)
            ORDER BY
              ComputeA(searchValue-REF)
          ) select-b-items
        GROUP BY
          "parent-B-id",
          computedVal
      ) b_and_c
      JOIN B ON b_and_c."parent-B-id" = B.id
      JOIN A ON B."parent-a-id" = A.id
  ),
  C_json_arr AS (
    SELECT
      array_agg(row_to_json(t)) as m
    from
      (
        SELECT
          *
        from
          C
        WHERE C.id = ANY(b_items_found.selected_Cs)
      ) t
  )
SELECT
  b_items_found.*,
  row_to_json(C_json_arr) as C_list
FROM b_items_found, C_json_arr

However, that's not tested as your query does not match your table structure.

Ancoron
  • 2,447
  • 1
  • 9
  • 21
  • Thanks for the answer! I've also updated the question's "query result" in order to reflect better what I'm trying to achieve – Yehonatan May 20 '19 at 06:40
  • I've tried to execute this query, but it resulted in C items being attached to every row no matter if they exist in selected_Cs or not. I want, for each selected B to select only its Cs which are in its selected_Cs – Yehonatan May 20 '19 at 08:31
  • I have just restructured your query, not altering its logic. But I also haven't tested it. As I also said in the beginning, I have concerns that the query is overly complicated. In addition, your query includes a column `D.parent-C-id` which doesn't exist in your model description. Last but not least you're `CROSS JOIN`ing `D` with a sub-select on a function return as `r` but then you don't select anything from `r` so it's totally unclear why we need `N` here. – Ancoron May 20 '19 at 19:21