1

I am writing a query on this table to get the sum of size for all the directories, group by directory where date is yesterday. I am getting no output from the below query.

test.id        test.path           test.size     test.date
1   this/is/the/path1/fil.txt      232.24           2019-06-01
2   this/is/the/path2/test.txt     324.0            2016-06-01
3   this/is/the/path3/index.txt    12.3             2017-05-01
4   this/is/the/path4/test2.txt    134.0            2019-03-23
5   this/is/the/path1/files.json   2.23             2018-07-23
6   this/is/the/path1/code.java    1.34             2014-03-23
7   this/is/the/path2/data.csv     23.42            2016-06-23
8   this/is/the/path3/test.html    1.33             2018-09-23
9   this/is/the/path4/prog.js      6.356            2019-06-23
4   this/is/the/path4/test2.txt    134.0            2019-04-23

SELECT regexp_replace(path,'[^/]+$',''), sum(cast(size as decimal)) 
from test WHERE date > date_sub(current_date, 1) GROUP BY path,size;

Incognito
  • 135
  • 4
  • 14

2 Answers2

2

You must not group by size, only by regexp_replace(path,'[^/]+$','').
Also, since you want only yesterday's rows why do you use WHERE date > '2019%?
You can get yesterday's date with date_sub(current_date, 1):

select 
  regexp_replace(path,'[^/]+$',''), 
  sum(cast(size as decimal)) 
from test 
where date = date_sub(current_date, 1) 
group by regexp_replace(path,'[^/]+$','');
forpas
  • 160,666
  • 10
  • 38
  • 76
0

You probably want WHERE date >= '2019-01-01'. Using % in matching strings, for example your 2019%, only works with LIKE, not inequality matching.

The example you gave looks like you want all rows in calendar year 2019.

For yesterday, you want

  WHERE date >= DATE_SUB(current_date, -1)
    AND date < current_date

This works even if your date column contains timestamps.

O. Jones
  • 103,626
  • 17
  • 118
  • 172