How can I speed up this SQL query for finding previous entries "on this day"?

Question

I'm trying to speed up this PostgreSQL query to find previous entries "on this day" in past years from a table. I currently have the query below:

select * from sample
where date_part('month', "timestamp") = date_part('month', now())
and date_part('day', "timestamp") = date_part('day', now())
order by "timestamp" desc;

This seems to get the intended result, but it is running much slower than desired. Is there a better approach for comparing the current month & day?

Also, would there be any changes to do a similar search for "this hour" over the past years? Similar to the following:

select * from sample
where date_part('month', "timestamp") = date_part('month', now())
and date_part('day', "timestamp") = date_part('day', now())
and date_part('hour', "timestamp") = date_part('hour', now())
order by "timestamp" desc;

The data is time-series in nature, using TimescaleDB as the database. Here is the current definition:

CREATE TABLE public.sample (
    "timestamp" timestamptz NOT NULL DEFAULT now(),
    entity varchar(256) NOT NULL,
    quantity numeric NULL
);
CREATE INDEX sample_entity_time_idx ON public.sample (entity, "timestamp" DESC);
CREATE INDEX sample_time_idx ON public.sample ("timestamp" DESC);

You want to compare on an actual date range. Somebody else will know the Postgres syntax. Something like `where "timestamp" >= now()::date and "timestamp" < now()::date + interval '1 day'` — shawnt00, Jul 26 '22 at 19:31
Please share your table definition and if there is index, also any of them. — Umut TEKİN, Jul 27 '22 at 00:45
@TheImpaler thanks for the pointer... That's returning all entries for today only, not "on this day from previous years." — jheddings, Jul 27 '22 at 16:48
@jheddings Oops... I didn't understand the question. Let me think... — The Impaler, Jul 27 '22 at 16:58

score 2 · Accepted Answer · answered Jul 27 '22 at 17:03

If you need the same day in all previous years that I would guess that the query would return 1/365th of the rows; that's a 0.27% selectivity. Great.

With that selectivity an index can speed up the query significantly. Now, since you are selecting non-consecutive rows you'll need a functional index. I would try:

create index ix1 on sample ((date_part('doy', "timestamp")));

Then, you can modify your query to:

select * 
from sample
where date_part('doy', "timestamp") = date_part('doy', now())
order by "timestamp" desc;

For the current hour in the past years you would have an even better selectivity of around 1/365/24; that is 0.01%. Awesome.

create index ix2 on sample (
  (date_part('doy', "timestamp")), 
  (date_part('hour', "timestamp"))
);

Then, the new query could look like:

select * 
from sample
where date_part('doy', "timestamp") = date_part('doy', now()) 
  and date_part('hour', "timestamp") = date_part('hour', now()) 
order by "timestamp" desc;

Please post the execution plans of these queries with the indexes created. I'm curious to see how well the perform.

Ah, clever. I didn't consider using a function in the index. When trying to create the index, however, it looks like date_part is not IMMUTABLE (and thus the CREATE INDEX fails). I'll keep poking around. — jheddings, Jul 27 '22 at 17:36
It seems it's the `TIMESTAMPTZ` data type. The function may produce different result in different environments. If you used `TIMESTAMP` the solution works. See https://dbfiddle.uk/?rdbms=postgres_14&fiddle=4a75da04a1b8e13a8b31a88fbfae2af9 In fact, I'm thinking your queries may not be working correctly under all circumstances, but this is just a guess. — The Impaler, Jul 27 '22 at 17:42
Got it, thanks. The data set is an existing table over 7 years with 22M entries, so I'm not sure if I can change the data type... Let me see if I can make some modifications or work on a copy. — jheddings, Jul 27 '22 at 18:04

How can I speed up this SQL query for finding previous entries "on this day"?

1 Answers1