
I have this list:

MAIN = [
        ['ABC', '562', '112', '80', '231', '217', '433', '115', '10'],
        ['ABC', '562', '112', '80', '231', '322', '202', '432', '12'],
        ['ABC', '562', '112', '80', '231', '677', '133', '255', '64'], 
        ['DEF', '711', '87', '319', '433', '981', '400', '100', '09'],
        ['DEF', '711', '87', '319', '433', '113', '210', '321', '51'],
        ['DEF', '711', '87', '319', '433', '921', '711', '991', '44']
       ]

and I want to generate two lists from it.

1 - First, get list A containing the elements from index 0 to index 4 of each sublist in MAIN, resulting in this:

A = [ 
     ['ABC', '562', '112', '80', '231'],
     ['ABC', '562', '112', '80', '231'],
     ['ABC', '562', '112', '80', '231'],
     ['DEF', '711', '87', '319', '433'],
     ['DEF', '711', '87', '319', '433'],
     ['DEF', '711', '87', '319', '433']
    ]

and then remove duplicates to get this final list A:

A = [ 
     ['ABC', '562', '112', '80', '231'],
     ['DEF', '711', '87', '319', '433'],
    ]

2 - Get list B containing the element at index 0 plus the elements from index 5 to index 8 of each sublist in MAIN, resulting in this:

B = [
     ['ABC', '217', '433', '115', '10'],   
     ['ABC', '322', '202', '432', '12'],
     ['ABC', '677', '133', '255', '64'], 
     ['DEF', '981', '400', '100', '09'], 
     ['DEF', '113', '210', '321', '51'], 
     ['DEF', '921', '711', '991', '44']
    ]

My attempts so far are below.

To get list A

A = []
for z in MAIN:
    y = z[:5]
    if y not in A:
        A.append(y)

To get list B

B = []
for z in MAIN:
    B.append(list(set(z) - set(z[1:5])))

In the results below, list A looks correct, but in list B the elements within each sublist are out of order and the last sublist is missing an element.

A = [
     ['ABC', '562', '112', '80', '231'], 
     ['DEF', '711', '87', '319', '433']
    ]

B = [
     ['217', '433', 'ABC', '10', '115'], 
     ['322', '202', '432', 'ABC', '12'], 
     ['255', '64', '677', 'ABC', '133'], 
     ['09', '100', '400', '981', 'DEF'], 
     ['113', '51', '210', '321', 'DEF'], 
     ['DEF', '44', '991', '921']
    ]

What would be the best way to ensure the correct output for A and B? Thanks for any help.

Mad Physicist
Ger Cas

1 Answer


You can apply a slice to each element using a comprehension:

[x[:5] for x in MAIN]

The best way I know to remove duplicates is to use a set. However, lists are unhashable and so can't be added to a set, which means you'll have to convert each slice to a tuple first:

A = list(set(tuple(x[:5]) for x in MAIN))

If you want the elements to be lists instead of tuples, you'll have to convert them explicitly:

A = list(map(list, set(tuple(x[:5]) for x in MAIN)))
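One caveat: a set does not remember insertion order, so the rows of A may come back in a different order than they appear in MAIN. If the order matters, one alternative to the set is `dict.fromkeys`, sketched below. It relies on dict keys being unique and keeping insertion order, which is guaranteed from Python 3.7 on:

```python
MAIN = [
    ['ABC', '562', '112', '80', '231', '217', '433', '115', '10'],
    ['ABC', '562', '112', '80', '231', '322', '202', '432', '12'],
    ['ABC', '562', '112', '80', '231', '677', '133', '255', '64'],
    ['DEF', '711', '87', '319', '433', '981', '400', '100', '09'],
    ['DEF', '711', '87', '319', '433', '113', '210', '321', '51'],
    ['DEF', '711', '87', '319', '433', '921', '711', '991', '44'],
]

# Each slice becomes a tuple so it can serve as a dict key.
# dict.fromkeys keeps only the first occurrence of each key, in order.
unique_rows = dict.fromkeys(tuple(x[:5]) for x in MAIN)

# Convert the keys back to lists.
A = [list(t) for t in unique_rows]
print(A)
```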

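You can't rely on set to build B, because a set neither guarantees order nor preserves duplicates. That is why your last sublist lost an element: '711' appears at both index 1 and index 6, so subtracting set(z[1:5]) removes it entirely. Instead, just concatenate slices: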

B = [x[:1] + x[5:9] for x in MAIN]

Notice that the slice x[:1] (a.k.a. x[0:1]) creates a one-element list, while x[0] would return the element itself. Any non-negative index n can be rewritten as the slice n:n+1 this way (for the last element, use x[-1:], since x[-1:0] is empty).
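As a quick check, here is a small sketch that runs both approaches on the last sublist of MAIN, the one that lost an element in the question. The variable names `row`, `lossy` and `b_row` are only for illustration:

```python
row = ['DEF', '711', '87', '319', '433', '921', '711', '991', '44']

# Set-based attempt: '711' sits at index 1 and at index 6, so removing
# set(row[1:5]) drops it everywhere. The set also forgets the order.
lossy = set(row) - set(row[1:5])
print(sorted(lossy))  # ['44', '921', '991', 'DEF']

# Slice concatenation selects by position, so values and order survive.
b_row = row[:1] + row[5:9]
print(b_row)  # ['DEF', '921', '711', '991', '44']

# Indexing returns the element; slicing returns a one-element list,
# which is what list concatenation with + needs.
first_item = row[0]     # 'DEF'
first_slice = row[:1]   # ['DEF']
print(first_item, first_slice)
```

Writing `row[0] + row[5:9]` instead would raise a `TypeError`, because a str cannot be concatenated with a list.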

Mad Physicist
  • I can only say: excellent! Thanks so much. It seems I'll learn a lot from your solution, since it uses several concepts. I'm building dataframes from the lists; I've tried `A` with elements as lists and as tuples, and it seems to work both ways: `df_A = pd.DataFrame(A, columns=columns_A)` and `df_B = pd.DataFrame(B, columns=columns_B)`. Thanks for showing how to get `A` both as a list of lists and as a list of tuples. – Ger Cas Aug 07 '21 at 02:28
  • If you're using dataframes, you really shouldn't be treating them as lists at all. None of this applies well to dataframes. The whole point of pandas is to provide fast bulk processing without having to run explicit loops or comprehensions. – Mad Physicist Aug 07 '21 at 02:31
  • I'm not sure I understood. I'm parsing a file and storing the data in lists. Once the lists are complete, I convert them to dataframes so I can print them with pandas. Is that not a good way to do it? – Ger Cas Aug 07 '21 at 03:31
  • @GerCas. I'd use pandas for everything at that point – Mad Physicist Aug 07 '21 at 13:03