-4

I am fairly good at PHP (OOP & flat). And for the past year or so I have been tasked with maintaining a Ruby codebase; a skill I am still learning. I am not too clear on how to properly structure a filtering chain using Sequel so AND as well OR statements can be properly contained.

This is the MySQL query structure I want to have:

SELECT * FROM `some_objects`
WHERE (
  (
    ((`datebegin` >= 1950) AND (`datebegin` <= 1959)) OR ((`dateend` >= 1950) AND (`dateend` <= 1959))
  )
  OR
  ((`datebegin` <= 1950) AND (`dateend` >= 1959))
  AND
  (NOT `datebegin` = 0) AND (NOT `dateend` = 0)
)
;

And this is the Sequel code fragment I am using:

some_objects = where{((datebegin >= start_year) & (datebegin <= end_year)) | ((dateend >= start_year) & (dateend <= end_year))}.
               or{(datebegin <= start_year) & (dateend >= end_year)}.
               where(~:datebegin => 0, ~:dateend => 0)

And this is what I am actually getting:

SELECT * FROM `some_objects`
WHERE (
  (
    ((`datebegin` >= 1950) AND (`datebegin` <= 1959)) OR ((`dateend` >= 1950) AND (`dateend` <= 1959))
    OR
    ((`datebegin` <= 1950) AND (`dateend` >= 1959))
  )
  AND
  (NOT `datebegin` = 0) AND (NOT `dateend` = 0)
)
;

I have also tried different variations on the same Sequel code such as this:

some_objects = where(:datebegin => start_year..end_year).
               or(:dateend => start_year..end_year).
               or{|o|(o.datebegin <= start_year) & (o.dateend >= end_year)}.
               where(~:datebegin => 0, ~:dateend => 0)

And this:

some_objects = where(:datebegin => start_year..end_year).
               or(:dateend => start_year..end_year).
               or{(datebegin <= start_year) & (dateend >= end_year)}.
               where(~:datebegin => 0, ~:dateend => 0)

But I still end up with the first SQL structure where this whole chunk is basically ((AND OR AND OR)):

(
  ((`datebegin` >= 1950) AND (`datebegin` <= 1959)) OR ((`dateend` >= 1950) AND (`dateend` <= 1959))
  OR
  ((`datebegin` <= 1950) AND (`dateend` >= 1959))
)

When I want ((AND OR AND) OR):

(
  ((`datebegin` >= 1950) AND (`datebegin` <= 1959)) OR ((`dateend` >= 1950) AND (`dateend` <= 1959))
)
OR
((`datebegin` <= 1950) AND (`dateend` >= 1959))
Giacomo1968
  • 25,759
  • 11
  • 71
  • 103
  • Would using the `BETWEEN` operator make this easier? That cuts down on the number of `AND` cases. – tadman Jun 05 '14 at 04:06
  • @tadman I agree. But I am not setting that. But you are looking at the SQL generated by Sequel in Ruby; I outputting MySQL output to get that. What I am dealing with is the way the Ruby Sequel GEM handles SQL generation. So if there is a way to make `BETWEEN` work, sure. Fine. But that is superfluous for now. – Giacomo1968 Jun 05 '14 at 04:07

2 Answers2

6

The issue with your initial query is that you are depending on the OR/AND precedence rules instead of using explicit parentheses. Your initial query could be expressed as:

SELECT * FROM `some_objects`
WHERE (
  ((`datebegin` >= 1950) AND (`datebegin` <= 1959))
  OR
  ((`dateend` >= 1950) AND (`dateend` <= 1959))
  OR
  ((`datebegin` <= 1950) AND (`dateend` >= 1959) AND (NOT `datebegin` = 0) AND (NOT `dateend` = 0))
)

You could express this in Sequel as:

DB[:some_objects].where{((datebegin >= start_year) & (datebegin <= end_year)) | ((dateend >= start_year) & (dateend <= end_year))}.
  or{((datebegin <= start_year) & (dateend >= end_year)) & Sequel.negate(:datebegin => 0)}

Which yields the following SQL:

SELECT * FROM `some_objects` WHERE (
  ((`datebegin` >= 1950) AND (`datebegin` <= 1959))
  OR
  ((`dateend` >= 1950) AND (`dateend` <= 1959))
  OR
  ((`datebegin` <= 1950) AND (`dateend` >= 1959) AND (`datebegin` != 0))
)

This should perform the same as the SQL you used, but it is a bit more readable (e.g. datebegin != 0 instead of NOT datebegin = 0). Note that you don't need the NOT dateend = 0 filter, because it is a subset of the dateend >= 1959 filter.

Jeremy Evans
  • 11,959
  • 27
  • 26
  • Seems like this is close but no cigar. I need something like `where(~:date begin => 0, ~:dateend => 0)` and your code only has ` AND (`datebegin` != 0)`. Is that solved by simply chaining like this: `Sequel.negate(:datebegin => 0).negate(:date end => 0)`. – Giacomo1968 Jun 05 '14 at 15:08
  • You apparently didn't read the end of my post. You don't need the `NOT date_end = 0` condition, as it is a subset of the `dateend >= 1959` condition. But if you really wanted to add a pointless condition: `Sequel.negate(:datebegin => 0, :dateend => 0)` – Jeremy Evans Jun 18 '14 at 22:06
  • “Note that you don't need the NOT date end = 0 filter, because it is a subset of the date end >= 1959 filter.” Correct. Thank you! – Giacomo1968 Jun 18 '14 at 23:03
2

Remember with Sequel if you're really up against the wall, just spell out your conditions the hard way:

records = where(%Q[
  (
    (`datebegin` >= 1950 AND `datebegin` <= 1959)
    OR (`dateend` >= 1950 AND `dateend` <= 1959)
  )
  OR (`datebegin` <= 1950 AND `dateend` >= 1959)
])

Those "zero" dates are kind of annoying from a query perspective, they'd be better off as NULL so they wouldn't need to be excluded here. Since NULL is never greater than, less than, or equal to anything, it's invisible to these queries. Zero, on the other hand, is a valid number and has to be excluded.

As an example of how this works with ranges:

records = where(datebegin: (1950..1959)).or(dateend: (1950..1959))

Those are expressed with the BETWEEN operator. From an indexing perspective that's usually at least a little faster than a bounded range using AND.

tadman
  • 208,517
  • 23
  • 234
  • 262
  • Did you actually read my question? Look at my MySQL ideal versus what you have posted. I do not need a chain of `((AND OR AND OR))`. I need it as `((AND OR AND) OR)`. – Giacomo1968 Jun 05 '14 at 04:16
  • No need to get all testy. Your query is really hard to follow and I'm doing my best. I've just edited it. I'd hope it's pretty obvious you can put your SQL in there as-is and it'll work. – tadman Jun 05 '14 at 04:16
  • The nice thing about Sequel is it'll just do what you tell it to, but it has some convenience methods that make it easier to describe that. When those fail, do it the hard way. The [documentation on filtering](http://sequel.jeremyevans.net/rdoc/files/doc/dataset_filtering_rdoc.html) goes into more detail. – tadman Jun 05 '14 at 04:21
  • So here is a silly question, how would I use your code & change `1950` to `start_year` and `1959` to `end_year`? Simple string substitution by placement? Would that be by `#{start_year}` and `#{end_year}`? – Giacomo1968 Jun 05 '14 at 04:24
  • 1
    Using string interpolation is [asking for trouble](http://bobby-tables.com/). That's why there's support for data placeholders like `?` where they're not named, or `:name` where they are. Normally you can do this with binding like `where('datebegin > :start_year', start_year: 1950)` or something like that. I'm not sure if you can use the same placeholder more than once, but you can try. The first few examples in that linked documentation go into more detail. – tadman Jun 05 '14 at 04:27
  • One more question to just confirm, so if I change this, `where(:datebegin => start_year..end_year).or(:date end => start_year..end_year)` to this `where(:date begin => (start_year..end_year)).or(:date end => (start_year..end_year))` Sequel will use `BETWEEN`, correct? – Giacomo1968 Jun 05 '14 at 04:50
  • 1
    Obviously the best way to find out is to run it interactively. Remember you can always append `.sql` on the end to see what is intending to do. There's no mystery here unlike with ActiveRecord. – tadman Jun 05 '14 at 04:58