1

I have a table like this:

id name path
1 John /1
2 Mark /2
3 Kevin /1/3
4 Sarah /1/3/4
5 Andy /2/5
... ... ...

So, I can say that Sarah is Kevin's child which is John's child.

I would like to have this:

id name path number of children
1 John /1 2
2 Mark /2 1
3 Kevin /1/3 1
4 Sarah /1/3/4 0
5 Andy /2/5 0
... ... ... ...

TASK NUMBER 2: Let's say that I have this table too

id income user_id
1 200 1
2 120 1
3 340 2
4 500 3
5 600 5
6 80 5

I can say that John has a Total income of 320$, but if I also want to count John's children, it is 820$ (because id =3 is John's child). So, I would also like a query where I can count all the hierarchical incomes.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Fabio Manniti
  • 151
  • 12
  • @a_horse_with_no_name you're right. I'm sorry, my mistake – Fabio Manniti Mar 08 '22 at 16:08
  • You will do better while *normalizing your table and store only `parent_id` and not a full `path`. – Marmite Bomber Mar 08 '22 at 20:51
  • @MarmiteBomber Really? Why? Because if I put all the path to me it's easier because every user has the "history" of all the tree but if I only put the parent_id then I think it is a little bit harder to understand the grandchildren and great grandchildren and so on – Fabio Manniti Mar 09 '22 at 08:30
  • You will learn why if you one day wakes up and you will see this paths `/**2**/3/6` . So, who is the father of `3`? It can be both `2` and `1`. This is called *normalization* - please check [Database_normalization](https://en.wikipedia.org/wiki/Database_normalization) – Marmite Bomber Mar 09 '22 at 10:31
  • Basically meaning store every piece of data only in **one** place. Even in you small sample there are **two** rows (3,4) where is defined `1` is parent of `3` – Marmite Bomber Mar 09 '22 at 10:37
  • @MarmiteBomber OK, I understand this but if I find some path like /**2**/ it means there is an error because only id must be shown in the path. – Fabio Manniti Mar 09 '22 at 15:26
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/242777/discussion-between-fabio-manniti-and-marmite-bomber). – Fabio Manniti Mar 09 '22 at 15:26

1 Answers1

1

You can do:

select
  t.*,
  (select count(*) from t c where c.path like t.path || '/%') as c_count,
  i.income + (
    select coalesce(sum(i.income), 0) from t c join i on i.user_id = c.id
     where c.path like t.path || '/%'
  ) as c_income
from t
left join (
  select user_id, sum(income) as income from i group by user_id
) i on i.user_id = t.id

Result:

 id  name   path    c_count  c_income 
 --- ------ ------- -------- -------- 
 1   John   /1      2        820      
 2   Mark   /2      1        1020     
 3   Kevin  /1/3    1        500      
 4   Sarah  /1/3/4  0        null     
 5   Andy   /2/5    0        680      

See example at DB Fiddle.

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • The count reported is actually the count of all dependent rows (not the *number of child*), but this is the best what can be done without parsing the path to get the `parent_id` – Marmite Bomber Mar 08 '22 at 21:01
  • @MarmiteBomber but I do want to know ALL dependent rows. Not only the first child – Fabio Manniti Mar 09 '22 at 09:04
  • Well this is fine, but than you should not name the column `number of children` but e.g. `number of dependent nodes` - to avoid confusion @FabioManniti – Marmite Bomber Mar 09 '22 at 10:33