0

I need to select from a table that has a starts_on field and an ends_on field.
I need to pass start date and end date for filtering and retrieving those objects.

At the moment it's working and I use the following:

SELECT * FROM ***
WHERE ((starts_on >= START_DATE AND starts_on <= END_DATE) OR
       (ends_on >= START_DATE AND ends_on <= END_DATE) OR
       (starts_on <= END_DATE AND ends_on >= END_DATE))
ORDER BY starts_on, id

It looks a bit messy, but can't see an easy way to simplify it. Any idea?
I'm using postgres 9.1 as dbms.

Edit:

 starts_on   | timestamp without time zone | 
 ends_on     | timestamp without time zone | 

Ex: if one entry has starts_on = '2012/02/02' and ends_on '2012/02/05' I want the following behavior:

  • if I filter by start date 2012/01/01 and end date 2012/03/01 I want the item to be returned
  • if I filter by start date 2012/02/04 and end date 2012/03/01 I want the item to be returned
  • if I filter by start date 2012/02/05 and end date 2012/03/01 I want the item to be returned
  • if I filter by start date 2012/02/04 and end date 2012/02/04 I want the item to be returned
  • if I filter by start date 2012/02/06 and end date 2012/03/01 I want the item to NOT be returned
  • if I filter by start date 2012/01/01 and end date 2012/02/01 I want the item to NOT be returned
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
CLod
  • 917
  • 2
  • 11
  • 28
  • Please provide the exact table definition (with relevant columns), which is essential for the question. Also explain how you want to filter exactly. – Erwin Brandstetter Dec 03 '12 at 01:23

1 Answers1

2

Query

If you want all rows where the time period between starts_on and ends_on overlaps with the passed time period of START_DATE and END_DATE, and "end" is always later than "start", and all involved columns are of type timestamp (as opposed to time or date), this simpler query does the job:

SELECT *
FROM   tbl
WHERE  starts_on <= END_DATE
AND    ends_on   >= START_DATE
ORDER  BY starts_on, id;

Fits the question as later clarified.

Index

The best index for this query would be a multi-column index like:

CREATE INDEX tbl_range_idx ON tbl (starts_on, ends_on DESC)

Would work with DESC / ASC almost as well, because an index can be searched in both directions almost equally well.

How do I figure?

The index is searched on the first condition starts_on <= END_DATE, qualifying rows are at the beginning.
From there, Postgres can take all rows that end late enough according to ends_on >= START_DATE. Qualifying rows come first. Optimal index.

But don't just take my word - test performance with EXPLAIN ANALYZE. Run a couple of times to exclude caching effects.

There is also the OVERLAPS operator for the same purpose. Simplifies the logic, but isn't superior otherwise.

And there are the new range types in PostgreSQL 9.2, with their own operators. Not for 9.1 though.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • I added more explanation above in my question – CLod Dec 03 '12 at 01:35
  • @CLod: In short: you want all rows where the time range *overlaps* with the filter in any way. Coincidentally, my answer already covers that. – Erwin Brandstetter Dec 03 '12 at 01:40
  • thanks! another kind of related question: I currently have index on starts_on and another one on ends_on. Is it better if I have an index on both fields if I always use both of them for filtering? – CLod Dec 03 '12 at 03:28
  • @CLod: I added a bit on indexing. Note my edits - previous version was sub-optimal. – Erwin Brandstetter Dec 03 '12 at 07:29
  • yes added indexes at the end, noticed that it reduced query cost. thx heaps – CLod Dec 07 '12 at 03:31