8

I'm trying to display a list of member records, and I have a few tables I'm using to display what I need.

That's the easy part. The part I need help with is with a table that has many records to each member record: Login history

I want to display only the first row for each member record, that exists in the Login History table. Alternatively, I may want to flip flop and display the last record in the Login History table, as well.

here's what I've got so far:

SELECT m.memberid, m.membername, m.gender, mp.phone
FROM tbl_members m, 
     tbl_members_phones mp, 
     tbl_members_addresses ma
WHERE m.defaultphoneid = mp.phoneid
AND m.defaultaddressid = ma.addressid

So that returns what's expected.

The 2 columns from tbl_members_login_history I'd like to add to the returned result are: mh.loggedtime, mh.ipaddy

I know adding the tbl_members_login_history as a LEFT JOIN would return duplicates, so I'm thinking there must be a Subquery necessity here, in order to return just the 1st record for that memberid that exists in tbl_members_login_history.

What I'm worried about is if no record in the history table exists, I still want to display that member info, but leave the history columns as NULL.

Would this be a subquery incident? and if so, how does one add that type of LIMIT?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
coffeemonitor
  • 12,780
  • 34
  • 99
  • 149
  • Out of curiosity, are a phone and address required for a member? If not, I don't think your query will return the other member info if one of them is missing the way it is written. – Sam Jan 23 '12 at 20:03
  • In this case, yes. But you're right, the member would not be returned in the result without an existing record. – coffeemonitor Jan 23 '12 at 20:09

2 Answers2

20

This is the greatest-n-per-group problem, which is asked frequently on Stack Overflow.

Here's how I would solve it in your scenario:

SELECT m.memberid, m.membername, m.gender, mp.phone, mh.loggedtime, mh.ipaddy
FROM tbl_members m 
INNER JOIN tbl_members_phones mp ON m.defaultphoneid = mp.phoneid
INNER JOIN tbl_members_addresses ma ON m.defaultaddressid = ma.addressid
LEFT OUTER JOIN tbl_members_login_history mh ON m.memberid = mh.memberid
LEFT OUTER JOIN tbl_members_login_history mh2 ON m.memberid = mh2.memberid
    AND mh.pk < mh2.pk
WHERE mh2.pk IS NULL;

That is, we want mh to be the most recent row in tbl_member_login_history for the given memberid. So we search for another row mh2 that is even more recent. If none more recent than the mh row is found, then mh2.* will be NULL, so mh must be the most recent.

I'm assuming this table has a primary key column that contains increasing values. For this example, I assume the column name is pk.

Using LEFT OUTER JOIN for both references to the login history table means that the m row will be reported even if there is no matching row.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • I just tried your solution and compared it to a query that contains a subquery using `GROUP BY` with `MAX(...)`, joining `ON date = maxdate`. Your solution with the `<` operator took 2s (!), while the subquery solution took 0.01s (same result table of course). The `EXPLAIN` of your query looks much better than the one with the subquery. I don't know why I get this huge performance difference, but I guess it's because the `<` produces a huge intermediate result. – steffen Jul 23 '14 at 15:03
  • @steffen: Yes, since I answered this question, I've seen a lot of other cases, and concluded that either the solution above or a solution like you describe *can* be faster, depending on how many distinct groups and how many rows per group the data has. So it's best to test both methods with your data and then decide. – Bill Karwin Jul 23 '14 at 16:10
  • True. What confused me was the huge performance difference. I found your solution a couple of times here on SO. And the `EXPLAIN` output look great so I expected that query to be very fast. But instead it was so much slower. Weird. – steffen Jul 23 '14 at 16:14
  • @steffen, keep in mind MySQL does nested-loop joins. So you can get a rough estimate of the work it has to do to perform joins by multiplying the `rows` information from each table joined together. So even if it looks like it's using indexes and everything, if you see a few thousand rows in one table joined to a few thousand rows in another table, it's doing millions of row comparisons. – Bill Karwin Jul 23 '14 at 17:28
0

add like this

LEFT OUTER JOIN (SELECT member_id, MAX(LAST_LOGIN_DATE) from tbl_members_login_history) Last_Login ON Last_Login.memberid = m.memberid

PS. LAST_LOGIN_DATE is pseudo column, you can try your restictive column

kochobay
  • 392
  • 1
  • 7