0

I am trying to do advanced analytic queries to drive a web application. I am using Hapi, Objection, Knex, and Postgres with TimescaleDB. It all is working well for typical relational queries. However, I cannot figure out how to perform this aggregation query that involves joining with an anonymous table that is generated from Postgres's generate_series. I have had to resort to writing the raw SQL, rather than the Objection/Knex query builder. I am using a few of Postgres's built in functions, along with time_bucket from Timescale. time_bucket essentially creates a roll up of the data based on the interval specified in the argument. Check this link out for more information about what I'm attempt to do Gap Filling.

Here's the query, which works using the raw method on the Objection Model. I believe doing string interpolation like this will result in potential SQL injection. However, I was hoping to get this converted into the query builder methods that Objection/Knex use so it's more JavaScript, rather than SQL, which would solve the SQL injection issue.

let errorHistorgram = await Errors
    .raw(`SELECT period AS daily, coalesce(count,0) AS count
    FROM generate_series(date '${startTS}', date '${today}', interval '1d') AS period
      LEFT JOIN (
        SELECT time_bucket('1d',timestamp)::date AS date, count(timestamp)
        FROM my_error_table
        WHERE severity = 'HIGH'
          AND timestamp >= '${startTS}' AND timestamp < '${today}'
          AND device_id = ${deviceId}
        GROUP BY date
      ) t ON t.date = period;`)
      .debug();

I have made several attempts at this with Objection/Knex. This was my most successful attempt at crafting this query. However, I believe that the where clause is not in the correct place.

let errorHistorgram = await Errors
    .query()
    .select(raw(`time_bucket('1 day', timestamp) AS daily, count(timestamp)`))
    .where('device_id', deviceId)
    .andWhere('timestamp', '>', startTS)
    .andWhere('severity', 'HIGH')
    .leftJoin(`generate_series(date ${startTS}, date ${today}, interval 1d) AS series`, 'series.date', 'my_error_table.timestamp')
    .debug();

With .debug(), I'm able to see the output of the query, which is posted below.

select time_bucket('1 day', timestamp) AS daily, count(timestamp)
from my_error_table
left join "generate_series(date 2018-11-08T15:35:33"."050Z, date 2018-11-15T15:35:33"."133Z, interval 1d)" as "series"
  on "series"."date" = my_error_table."timestamp"
where "device_id" = ? and "timestamp" > ? and "severity" = ?'

Any assistance is appreciated, as I have not used Objection to do this, and cannot find any documentation on it.

UPDATE 11/15/2018

I got it to execute the query with Objection. However, I get an empty array as the result. Unlike with the raw SQL query that I crafted above (which does give me the expected results), I just get an empty array as output for the query builder. Any idea as to what I'm doing wrong. I've tried to flip the join to a right join with no luck.

 let errorHistorgram = await Errors
  .query()
  .select(raw(`time_bucket('1 day', timestamp) AS daily, count(timestamp)`))
  .where('device_id', deviceId)
  .andWhere('timestamp', '>', startTS)
  .andWhere('severity', 'HIGH')
  .groupBy('timestamp')
  .rightJoin(raw(`generate_series(date '${startTS}', date '${today}', interval '1d') AS series`), 'series.date', 'my_error_table.timestamp')
  .debug();

Attached is the SQL output from Debug.

select time_bucket('1 day', timestamp) AS daily, count(timestamp) 
from my_errors_table
right join generate_series(date '2018-11-08', date '2018-11-15', interval '1d') AS series
on series = my_errors_table.timestamp
where device_id = ? and timestamp > ? and severity = ? 
group by timestamp
technogeek1995
  • 3,185
  • 2
  • 31
  • 52
  • There are still couple of places where your code might have an sql injection vulnerability. Can you spot the place where the working raw query and knex generated query is different? That will tell you also why query does not return expected result. – Mikael Lepistö Nov 15 '18 at 23:46
  • In terms of the misalignment - I think my main issue is that the SQL query uses a left join. I cannot use a left join because my select statement has to be based upon my model (`my_errors_table`), rather than based upon the `generate_series` function. With knex, I'm having issues telling it to apply the where clause on the first query, rather than on the join. It seems like the where query is being applied to the join, not the actual query. Does this make sense? Any idea how to resolve that? – technogeek1995 Nov 19 '18 at 15:24
  • I know I have the potential for SQL injection because I'm using JS's string interpolation, instead of the knex's query builder. I'm hoping to get this resolved through my ORM so SQL injection is a non-issue to me at this point. – technogeek1995 Nov 19 '18 at 15:25

1 Answers1

0

Timescale released a new feature called Time Bucket Gapfill. It made this much easier because you no longer have to do a left join with the generate_series to gap fill.

I have included an example of how to implement this with an ObjectionJS model called Errors. The inputs to time_bucket_gapfill function are bucket size, timestamp column name, startTS, and endTS. The bucket size variable should be a string with "" (not single quotes) that corresponds to the bucket size (eg: "10 seconds", "30 minutes", "1 hour", "1 day"). The startTS and stopTS should be ISO Date Strings. The second select statement requires COALESCE so that it will output 0 if there is a bucket generated where there is no data contained in the bucket. The group by is required for the bucketing to correctly summarize the data based on the aggregate SQL function you supply in the select statement.

import { raw } from 'objection';

const errors = await Errors
  .query()
  .select(
    raw("time_bucket_gapfill(?, timestamp, ?, ?) AS bucket", [bucketWidth, startTS, endTS]),
    raw('COALESCE(count(timestamp), 0) AS count'),
  ).where('device_id', deviceId)
  .andWhere('timestamp', '>=', startTS)
  .andWhere('timestamp', '<=', endTS)
  .groupBy('bucket')
  .orderBy('bucket');
technogeek1995
  • 3,185
  • 2
  • 31
  • 52