1

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).

user_purchases has only one other field that is of interest here, which is purchase_date.

I am trying to write a Sequel ORM statement that will return a dataset with the following columns:

  • user_id
  • date of the users SECOND purchase, if it exists

So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?

Please note I am looking for a dataset with ALL users returned who have >= 2 purchases

Thanks!

EDIT FOR CLARITY

Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):

   DB[:users].join(:user_purchases, :user_id => :id)
              .select{[:user_id, min(:purchase_date)]}
              .group(:user_id)
Jonah
  • 15,806
  • 22
  • 87
  • 161
  • 1
    Why don't you show us the code and models you're using to generate the query? Describing it doesn't help without the code itself. – the Tin Man Nov 02 '12 at 17:12
  • @theTinMan, Hey, you should have all the info you need to answer the question. I edited the OP to clarify what I'm asking for. Thanks. – Jonah Nov 02 '12 at 17:21

3 Answers3

2

You don't seem to be worried about the dates, just the counts so

DB[:user_purchases].group_and_count(:user_id).having(:count > 1).all

will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like

[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]

If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:

DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
    having(:count > 1).all.map{|row| row[:user_id]}).all

Edit:

I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226

This should do it without the subselect:

DB[:users].
  left_join(:user_purchases, :user_id=>:id).
  select_group(:id).
  select_more{count(:purchase_date).as(:purchase_count)}.
  having(:purchase_count > 1)

It generates this SQL

SELECT `id`, count(`purchase_date`) AS 'purchase_count' 
FROM `users` LEFT JOIN `user_purchases` 
ON (`user_purchases`.`user_id` = `users`.`id`) 
GROUP BY `id` HAVING (`purchase_count` > 1)"
Community
  • 1
  • 1
Sean Redmond
  • 3,974
  • 22
  • 28
0

Generally, this could be the SQL query that you need:

SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;

Try converting that to sequel, if you don't get any better answers.

Mladen Jablanović
  • 43,461
  • 10
  • 90
  • 113
0

The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.

To access that, do a limit(2) to constrain the query to two results then take the [-1] (or last) one. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:

DB[:user_purchases].where(:user_id => user_id).order_by(:user_purchases__purchase_date).limit(2)[-1]

Here's some output from Sequel's console:

DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"

Add the appropriate select clause:

.select(:user_id, :purchase_date)

and you should be done:

DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
the Tin Man
  • 158,662
  • 42
  • 215
  • 303
  • If i follow you, this would be a solution for finding a specific user's 2nd purchase date, given that user's user_id, correct? What I am looking for is the complete dataset of all user's who have made >= 2 purchases. – Jonah Nov 02 '12 at 17:25
  • Ah. That's a different question and isn't clear in the question at all. "...date of the users SECOND purchase, if it exists..." – the Tin Man Nov 02 '12 at 17:28
  • Sorry for the confusion, I edited the OP again to clarify that point – Jonah Nov 02 '12 at 17:31