
I have a list of lists like the one below.
Field 4 is the mean of fields 1, 2 and 3, and field 5 is their minimum.

[['name0', 24, 19, 25, 22.67, 19],
 ['name1', 25, 19, 25, 23.0, 19],
 ['name2', 25, 19, 25, 23.0, 19],
 ['name3', 24, 22, 23, 23.0, 22],
 ['name4', 27, 19, 25, 23.67, 19],
 ['name5', 27, 19, 25, 23.67, 19],
 ['name6', 28, 19, 26, 24.33, 19],
 ['name7', 28, 19, 26, 24.33, 19],
 ['name8', 28, 19, 26, 24.33, 19],
 ['name9', 26, 22, 27, 25.0, 22],
 ['name10', 27, 23, 25, 25.0, 23],
 ['name11', 30, 19, 27, 25.33, 19],
 ['name12', 24, 31, 28, 27.67, 24],
 ['name13', 28, 27, 28, 27.67, 27],
 ['name14', 27, 29, 27, 27.67, 27],
 ['name15', 29, 26, 29, 28.0, 26],
 ['name16', 29, 26, 30, 28.33, 26],
 ['name17', 30, 31, 26, 29.0, 26],
 ['name18', 33, 27, 30, 30.0, 27],
 ['name19', 29, 31, 30, 30.0, 29],
 ['name20', 30, 36, 31, 32.33, 30],
 ['name21', 36, 30, 32, 32.67, 30],
 ['name22', 38, 33, 36, 35.67, 33],
 ['name23', 30, 27, 99, 52.0, 27],
 ['name24', 99, 27, 32, 52.67, 27],
 ['name25', 37, 99, 36, 57.33, 36]]

It has been sorted by field 4, then by field 5.
I'd like to enumerate this list, creating a sort of "ranking" or "podium".

enumerate() alone doesn't work because, as you can see, some rows are tied on fields 4 and 5, so their "rank" should be the same.
As an example, the first rows should look like:

[['1', 'name0', 24, 19, 25, 22.67, 19],
 ['2', 'name1', 25, 19, 25, 23.0, 19],
 ['2', 'name2', 25, 19, 25, 23.0, 19],
 ['3', 'name3', 24, 22, 23, 23.0, 22],
 ['4', 'name4', 27, 19, 25, 23.67, 19],
 ...]

I couldn't figure out a clean way to approach this. Thanks for the help.

  • Try looking at `sorted` and `itemgetter`. They won't make the rank magically appear; you'll have to figure that out yourself. But enumerate only enumerates a list, nothing more ;) – The Pjot Mar 08 '19 at 15:58
  • Please share the *non-clean* code you tried before we give you the *clean* code. – Austin Mar 08 '19 at 16:00
  • Ranking should be based on which column? – Hello.World Mar 08 '19 at 16:04
  • The non-clean code is not finished, I gave up mid-way as I realised it was going to be unreadable. Also, the ranking should be based on the 4th and 5th column. – Guido Dipietro Mar 08 '19 at 19:33
  • To make a good example you don’t need to put all 100 sublists with all 6 elements. – Mykola Zotko Mar 08 '19 at 20:57

4 Answers


Assuming that the list is sorted, you can group the sub-lists by the elements at indices 4 and 5 (the mean and the min) using the aptly named groupby together with itemgetter. Use enumerate on the iterator returned by groupby:

from itertools import groupby
from operator import itemgetter

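# data = [['name0', ...
# Each group of rows tied on (mean, min) shares one rank, counted from 1.
ranked = [[str(rank)] + row
          for rank, (_, group) in enumerate(groupby(data, key=itemgetter(4, 5)), start=1)
          for row in group]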

Output:

[
    ['1', 'name0', 24, 19, 25, 22.67, 19],
    ['2', 'name1', 25, 19, 25, 23.0, 19],
    ['2', 'name2', 25, 19, 25, 23.0, 19],
    ['3', 'name3', 24, 22, 23, 23.0, 22],
    ['4', 'name4', 27, 19, 25, 23.67, 19],
    ['4', 'name5', 27, 19, 25, 23.67, 19],
    ['5', 'name6', 28, 19, 26, 24.33, 19],
    ['5', 'name7', 28, 19, 26, 24.33, 19],
    ['5', 'name8', 28, 19, 26, 24.33, 19],
    ['6', 'name9', 26, 22, 27, 25.0, 22],
    ['7', 'name10', 27, 23, 25, 25.0, 23],
    ['8', 'name11', 30, 19, 27, 25.33, 19],
    ['9', 'name12', 24, 31, 28, 27.67, 24],
    ['10', 'name13', 28, 27, 28, 27.67, 27],
    ['10', 'name14', 27, 29, 27, 27.67, 27],
    ['11', 'name15', 29, 26, 29, 28.0, 26],
    ['12', 'name16', 29, 26, 30, 28.33, 26],
    ['13', 'name17', 30, 31, 26, 29.0, 26],
    ['14', 'name18', 33, 27, 30, 30.0, 27],
    ['15', 'name19', 29, 31, 30, 30.0, 29],
    ['16', 'name20', 30, 36, 31, 32.33, 30],
    ['17', 'name21', 36, 30, 32, 32.67, 30],
    ['18', 'name22', 38, 33, 36, 35.67, 33],
    ['19', 'name23', 30, 27, 99, 52.0, 27],
    ['20', 'name24', 99, 27, 32, 52.67, 27],
    ['21', 'name25', 37, 99, 36, 57.33, 36]
]
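
Note that groupby only merges consecutive rows with equal keys, so the approach above depends on the input already being sorted. A minimal sketch for unsorted input, assuming the same key columns (indices 4 and 5); the helper name `dense_rank` and the shuffled sample rows are illustrative and not part of the original answer:

```python
from itertools import groupby
from operator import itemgetter


def dense_rank(rows, key=itemgetter(4, 5)):
    """Return new rows, each prefixed with a dense rank (as a string)."""
    # Sort first: groupby only groups adjacent rows with equal keys.
    ordered = sorted(rows, key=key)
    return [
        [str(rank)] + row
        for rank, (_, group) in enumerate(groupby(ordered, key=key), start=1)
        for row in group
    ]


# Deliberately shuffled subset of the question's data (illustrative).
sample = [
    ['name3', 24, 22, 23, 23.0, 22],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name0', 24, 19, 25, 22.67, 19],
    ['name2', 25, 19, 25, 23.0, 19],
]
ranked_sample = dense_rank(sample)
```

Because `sorted` returns a new list and each output row is built with list concatenation, the input rows are left unchanged.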
meowgoesthedog

Start with i = 1 and iterate through the rows, assigning i as the rank and incrementing it (i += 1) only when the next row differs from the current one in the key fields.
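
One way to read the description above as code, offered as a sketch rather than the answerer's own implementation. The function name `assign_ranks` and its `key` parameter are hypothetical, and the rows are assumed to be sorted already:

```python
def assign_ranks(rows, key=lambda row: (row[4], row[5])):
    """Prefix each row with a rank that only grows when the key changes."""
    ranked = []
    i = 1
    for index, row in enumerate(rows):
        # Bump the rank only when this row differs from the previous one.
        if index > 0 and key(row) != key(rows[index - 1]):
            i += 1
        ranked.append([str(i)] + row)
    return ranked


# First five rows of the question's data.
rows = [
    ['name0', 24, 19, 25, 22.67, 19],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
    ['name3', 24, 22, 23, 23.0, 22],
    ['name4', 27, 19, 25, 23.67, 19],
]
result = assign_ranks(rows)
```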

def_init_

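Using Pandas and dense rank. Note that the snippet below ranks on column '5' (the mean) only, so rows that tie on the mean but differ on the min, such as name3, receive the same rank as name1 and name2: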

import pandas as pd

df = pd.DataFrame(data = [['name0', 24, 19, 25, 22.67, 19],
 ['name1', 25, 19, 25, 23.0, 19],
 ['name2', 25, 19, 25, 23.0, 19],
 ['name3', 24, 22, 23, 23.0, 22],
 ['name4', 27, 19, 25, 23.67, 19],
 ['name5', 27, 19, 25, 23.67, 19],
 ['name6', 28, 19, 26, 24.33, 19],
 ['name7', 28, 19, 26, 24.33, 19],
 ['name8', 28, 19, 26, 24.33, 19],
 ['name9', 26, 22, 27, 25.0, 22],
 ['name10', 27, 23, 25, 25.0, 23],
 ['name11', 30, 19, 27, 25.33, 19],
 ['name12', 24, 31, 28, 27.67, 24],
 ['name13', 28, 27, 28, 27.67, 27],
 ['name14', 27, 29, 27, 27.67, 27],
 ['name15', 29, 26, 29, 28.0, 26],
 ['name16', 29, 26, 30, 28.33, 26],
 ['name17', 30, 31, 26, 29.0, 26],
 ['name18', 33, 27, 30, 30.0, 27],
 ['name19', 29, 31, 30, 30.0, 29],
 ['name20', 30, 36, 31, 32.33, 30],
 ['name21', 36, 30, 32, 32.67, 30],
 ['name22', 38, 33, 36, 35.67, 33],
 ['name23', 30, 27, 99, 52.0, 27],
 ['name24', 99, 27, 32, 52.67, 27],
 ['name25', 37, 99, 36, 57.33, 36]], columns= ['1', '2', '3', '4', '5', '6'])

df["rank"] = df['5'].rank(method = "dense")
df

Output:
    1   2   3   4   5   6   rank
0   name0   24  19  25  22.67   19  1.0
1   name1   25  19  25  23.00   19  2.0
2   name2   25  19  25  23.00   19  2.0
3   name3   24  22  23  23.00   22  2.0
4   name4   27  19  25  23.67   19  3.0
5   name5   27  19  25  23.67   19  3.0
6   name6   28  19  26  24.33   19  4.0
7   name7   28  19  26  24.33   19  4.0
8   name8   28  19  26  24.33   19  4.0
9   name9   26  22  27  25.00   22  5.0
10  name10  27  23  25  25.00   23  5.0
11  name11  30  19  27  25.33   19  6.0
12  name12  24  31  28  27.67   24  7.0
13  name13  28  27  28  27.67   27  7.0
14  name14  27  29  27  27.67   27  7.0
15  name15  29  26  29  28.00   26  8.0
16  name16  29  26  30  28.33   26  9.0
17  name17  30  31  26  29.00   26  10.0
18  name18  33  27  30  30.00   27  11.0
19  name19  29  31  30  30.00   29  11.0
20  name20  30  36  31  32.33   30  12.0
21  name21  36  30  32  32.67   30  13.0
22  name22  38  33  36  35.67   33  14.0
23  name23  30  27  99  52.00   27  15.0
24  name24  99  27  32  52.67   27  16.0
25  name25  37  99  36  57.33   36  17.0

If you want a list of lists:

df = df.set_index('rank').reset_index()
df.values.tolist()
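
To rank on both the mean and the min, as the question asks, one possible approach is to number the groups formed by the two columns. This is a sketch under assumptions: it relies on `GroupBy.ngroup` from a reasonably recent pandas, reuses the column labels '5' and '6' from the snippet above, and uses only the first five rows for brevity:

```python
import pandas as pd

rows = [
    ['name0', 24, 19, 25, 22.67, 19],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
    ['name3', 24, 22, 23, 23.0, 22],
    ['name4', 27, 19, 25, 23.67, 19],
]
df = pd.DataFrame(rows, columns=['1', '2', '3', '4', '5', '6'])

# groupby sorts the (mean, min) keys by default, so the 0-based group
# number plus one behaves like a dense rank over both columns.
df['rank'] = df.groupby(['5', '6']).ngroup() + 1
ranks = df['rank'].tolist()
```

Unlike `rank(method="dense")`, this yields integer ranks, and it does not require the rows to be sorted beforehand.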
Hello.World

You can pair each row with its predecessor by zipping the list with a copy of itself that is padded at the front with None values. Iterating over the zipped pairs, compare the key fields and, if they match, reuse the previous row's rank:

for i, ((*_, prev_mean, prev_min), (*_, mean, _min)) in enumerate(zip([(None, None)] + l, l)):
    l[i].insert(0, str(l[i - 1][0] if mean == prev_mean and _min == prev_min else i + 1))

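Assuming your list of lists is stored as variable l, the loop modifies l in place. Ranks are skipped after ties (1, 2, 2, 4, ...), i.e. standard competition ranking rather than the consecutive ranks shown in the question. l becomes: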

[['1', 'name0', 24, 19, 25, 22.67, 19],
 ['2', 'name1', 25, 19, 25, 23.0, 19],
 ['2', 'name2', 25, 19, 25, 23.0, 19],
 ['4', 'name3', 24, 22, 23, 23.0, 22],
 ['5', 'name4', 27, 19, 25, 23.67, 19],
 ['5', 'name5', 27, 19, 25, 23.67, 19],
 ['7', 'name6', 28, 19, 26, 24.33, 19],
 ['7', 'name7', 28, 19, 26, 24.33, 19],
 ['7', 'name8', 28, 19, 26, 24.33, 19],
 ['10', 'name9', 26, 22, 27, 25.0, 22],
 ['11', 'name10', 27, 23, 25, 25.0, 23],
 ['12', 'name11', 30, 19, 27, 25.33, 19],
 ['13', 'name12', 24, 31, 28, 27.67, 24],
 ['14', 'name13', 28, 27, 28, 27.67, 27],
 ['14', 'name14', 27, 29, 27, 27.67, 27],
 ['16', 'name15', 29, 26, 29, 28.0, 26],
 ['17', 'name16', 29, 26, 30, 28.33, 26],
 ['18', 'name17', 30, 31, 26, 29.0, 26],
 ['19', 'name18', 33, 27, 30, 30.0, 27],
 ['20', 'name19', 29, 31, 30, 30.0, 29],
 ['21', 'name20', 30, 36, 31, 32.33, 30],
 ['22', 'name21', 36, 30, 32, 32.67, 30],
 ['23', 'name22', 38, 33, 36, 35.67, 33],
 ['24', 'name23', 30, 27, 99, 52.0, 27],
 ['25', 'name24', 99, 27, 32, 52.67, 27],
 ['26', 'name25', 37, 99, 36, 57.33, 36]]
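
For comparison, a sketch of the same adjacent-pair idea that can produce either the competition ranks shown above or the consecutive (dense) ranks requested in the question. The function name `rank_rows` and the `dense` flag are hypothetical; the key fields are assumed to be the last two elements of each row, and the input is assumed to be sorted:

```python
def rank_rows(rows, dense=True):
    """Return new rows prefixed with a rank; the input is not modified."""
    ranked = []
    rank = 0
    # Pair every row with its predecessor (None for the first row).
    pairs = zip([None] + rows, rows)
    for position, (prev, cur) in enumerate(pairs, start=1):
        tied = prev is not None and prev[-2:] == cur[-2:]
        if not tied:
            # Dense: next consecutive number. Competition: 1-based position.
            rank = rank + 1 if dense else position
        ranked.append([str(rank)] + cur)
    return ranked


# First five rows of the question's data.
rows = [
    ['name0', 24, 19, 25, 22.67, 19],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
    ['name3', 24, 22, 23, 23.0, 22],
    ['name4', 27, 19, 25, 23.67, 19],
]
dense_ranks = [row[0] for row in rank_rows(rows, dense=True)]
competition_ranks = [row[0] for row in rank_rows(rows, dense=False)]
```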
blhsing