Table "tags":
Source Target Weight
#003 blitzkrank 0.83
#003 deutsch 0.7
#003 brammen 0.57
#003 butzfrauen 0.55
#003 solaaaa 0.5
#003 moments 0.3
college scandal 1.15
college prosecutors 0.82
college students 0.41
college usc 0.33
college full house 0.17
college friends 0.08
college house 0.5
college friend 0.01
The table has 5.600.000 rows and ~91.000 unique entries in column "Source".
For every unique value in "Source" and "Target" I need the top x% rows (e.g. top 20%, top 30%, needs to be variable) by Weight (table is sorted by "Source" (ascending) and "Weight" (descending).
- If rows have the same "Weight" take rows in alphabetical order.
- If x% == 0 take at least one row.
Since there will be duplicates (e.g. "Source = "college" will yield at least one duplicate row as "Target" = "scandal") duplicated entries should be removed if possible. Otherwise not a big deal.
Calculation for "Source":
6 rows where Source = "#003", 6 * 0.2 = 1.2 = take 1 row
8 rows where Source = "college", 8 * 0.2 = 1.6 = take 2 rows
Needed result table for "Source":
Source Target Weight
#003 blitzkrank 0.83
college scandal 1.15
college prosecutors 0.82
How can I do that in SQL in an SQLite database?