0

I have a large table (millions of records) that looks something like this. I need to get the top 30% (or 70th percentile) on daily basis. So the results should look something like the section table. The table already has the Bread_Date sorted in ascending order and the consumption_per_eater values are already sorted in descending order.

I understand this is similar to the top-N problem where I may need to write this as part of the select statement:

if(@preBread_Date is null or @preBread_Date != Bread_Date, @row:=1, @row:=@row+1) as 'row', @preBread_Date := Bread_Date

But I am stuck. I don't know what to write because on a daily basis, there are also different number of records. So in the example below, 2012-07-10 has only 3 records, while 2012-07-11 has 6 different records and 2012-07 has 4 records. One of the things I would like to avoid is any join.

Would really appreciate if anyone can show me the query I need to write next. Thanks in advance.


Bread_Date | Consumption_Per_Eater
......................................................................
2012-07-10 | 300.1
2012-07-10 | 290.9
2012-07-10 | 230.5
2012-07-11 | 230.3
2012-07-11 | 340.1
2012-07-11 | 330.9
2012-07-11 | 230.5
2012-07-11 | 210.3
2012-07-11 | 209.4
2012-07-12 | 400.9
2012-07-12 | 300.5
2012-07-12 | 210.3
2012-07-12 | 200.4



Bread_Date | Top_30%_of_that_day
..............................................................................
2012-07-10 | XXXXXXX
2012-07-11 | XXXXXXX
2012-07-12 | XXXXXXX

  • Did you want the top 30% per day? Or the top 30% total each day? – Gohn67 Feb 04 '14 at 08:51
  • I need the top 30% value of consumption per eater per day. So for 2012-07-10, it should be getting the top 30% value from values corresponding to that day (300.1, 290.9, 230.5). And then this needs to be repeated for every single day. – user3201048 Feb 04 '14 at 16:46
  • @gohn67 See my second table in the original question post. – user3201048 Feb 04 '14 at 16:50
  • One question, is it not possible to calculate it for each day individually and store that data in another table? Seems like the data won't change since you only get new data. – Gohn67 Feb 04 '14 at 20:27
  • @Gohn67 correct, I'm not allow / it is not possible to calculate for each day individually. The query interface I have does not allow me to use the mySQL "create" command. – user3201048 Feb 04 '14 at 21:50

0 Answers0