1

I'm currently having trouble finding a solution for my problem and you guys are my last hope. Since two days I'm trying to solve this puzzle:

table item:

----------------------------
| id   | item   | customer |
----------------------------
| 1    | banana | custA    |
----------------------------
| 2    | apple  | custA    |
----------------------------
| 3    | orange | custB    |
----------------------------
| 4    | apple  | custB    |
----------------------------

table vendor_prices:

-------------------------------------------------------
| id  |   item    |   price   |   vendor |  timestamp |
-------------------------------------------------------
| 1   |  banana   |   0.23    | VendorA  |  564645564 |
-------------------------------------------------------
| 2   |  orange   |   0.21    | VendorA  |  564645564 |
-------------------------------------------------------
| 3   |  apple    |   0.19    | VendorB  |  564645564 |
-------------------------------------------------------
| 4   |  banana   |   0.22    | VendorB  |  564645565 |
-------------------------------------------------------
| 5   |  banana   |   0.21    | VendorB  |  564645567 |
-------------------------------------------------------

There are a few things to note:

  1. Not every vendor has every item
  2. Some vendors might update their prices more often than other resulting in a larger gap between them in timestamp

For example, I want to know which vendor sells bananas currently for the best price?

I think I would first need to get the most recent price from each vendor for every item, and then sort those by price, right? But how to do that in a MySQL compatible way?

I think the correct way for the first part is:

SELECT MAX(timestamp), vendor, item, MIN(price) FROM vendor_prices WHERE item="banana" GROUP BY vendor;

But how to connect this to all the other criterias?

EDIT: I had to change the first table a bit, since I forgot an important part in the question, sorry :(

Expected output: The most recent, best price of all vendors for all items for a specific customer in the first table (custA or custB)

Rick James
  • 135,179
  • 13
  • 127
  • 222
nim
  • 35
  • 4

1 Answers1

1

You could use nested groupwise maximum logic for your criteria

select a.*
from vendor_prices a
join (
  select item, min(price) price, max(timestamp) timestamp
  from (
    select d.*
    from vendor_prices d
    join (
      select item, vendor, max(timestamp) timestamp
      from vendor_prices
      group by item, vendor
    ) e using(item, vendor,timestamp )
  ) c
  group by item
) b using (item,price, timestamp)
where a.item = 'banana'

Demo

If you are using Mysql 8+ you can take benefit using window functions

Edit *get all items for a customer (best price and vendor for each item) *

select  i.*,a.*
from vendor_prices a
join (
  select  item, min(price) price, max(timestamp) timestamp
  from (
    select  d.*
    from vendor_prices d
    join (
      select  item, vendor, max(timestamp) timestamp
      from vendor_prices
      group by item, vendor
    ) e using(item, vendor,timestamp )
  ) c
  group by item
) b using (item,price,timestamp )
join item i using(item)
where i.customer = 'custA'
order by i.item

Demo

Using window function and common table expression which are available in Mysql 8 you could use following

with latest_price as(
    select *, 
    dense_rank() over (partition by item order by timestamp desc, price asc ) rnk
    from vendor_prices
    order by  item, rnk
)

select i.id itemid, i.customer,a.* 
from latest_price a
join item i using(item)
where i.customer = 'custA'
and a.rnk = 1

Demo

M Khalid Junaid
  • 63,861
  • 10
  • 90
  • 118
  • 1
    I'm not sure why this has been downvoted, because it produces the correct result. – fubar Jun 28 '18 at 21:14
  • This seems to work, but looks like it is heavy on the DB. I have currently 146k rows in vendor_prices, and the query already takes 0.15s for getting 1 item. But maybe this is normal. Unfortunately, I forgot an important part in my question. The first table has also a column for customers. My goal is, to get all items for an customer (best price and vendor for each item). Sorry for my mistake, my brain is just leaking... – nim Jun 28 '18 at 22:03
  • @nim Yes above query would too heavy for DB but in Mysql there is some limitation if you can use new release like Mysql 8 which has a support for window functions and using window functions which actually made for such problems can give your better performance, For your customer related issue i have updated my answer which just includes a join with item table, this way you will *all items for an customer (best price and vendor for each item)* – M Khalid Junaid Jun 29 '18 at 06:04
  • @fubar i know that user who downvoted this and almost on my most of the answers i guess there is some jealousy factor and he is too shy/scared to mentioned the valid reason for his downvotes, but i really don't care i just focus on contribution to this community as much as i can. – M Khalid Junaid Jun 29 '18 at 06:06
  • @M Khalid Junaid: Thank you for the examples. However, I tried the code on my system, and it returns a lot of duplicates. I think this is due to the fast that it is very possible, that multiple vendors have the same timestamp for an item. I added more data to your example, and the same is happening there: http://sqlfiddle.com/#!9/ea19a9/1 – nim Jun 30 '18 at 20:19
  • @nim I found out that there was one issue in query in upper select it misses the max of timestampt criteria , see updated queries and updated demos in answer – M Khalid Junaid Jun 30 '18 at 21:50
  • @M Khalid Junaid I just rechecked it with my life data, and it seems that sometimes, some items are not shown now. The first time it worked, but after new data has been added, there where just some items missing in the result. Checking now which ones, and if there is anything special with the data. – nim Jun 30 '18 at 23:40
  • @nim - If performance is an issue, check out http://mysql.rjweb.org/doc.php/groupwise_max – Rick James Nov 07 '18 at 19:52