
I have a list:

my_list = ['a', 'b', 'c', 'a', 'b', 'c', 'a']

I use the following code to remove elements that don't meet the requirement:

[my_list.remove(element) for element in my_list if 'a' not in element]

but instead of the expected ['a', 'a', 'a'] I got ['a', 'c', 'a', 'c', 'a']. It seems that after removing 'b', Python doesn't check the following 'c' element...

Please advise how to resolve this issue and efficiently remove all unwanted elements from the list.

Andersson

5 Answers


Other answers solve the problem, but let me explain what's going on here.

>>> lst = ['a', 'b', 'c', 'a', 'b', 'c', 'a']
>>> for each in lst:
...     if 'a' not in each:
...         lst.remove(each)
>>> lst
['a', 'c', 'a', 'c', 'a']

Iteration 1:

#   V                                     - Current position of loop
# ['a', 'b', 'c', 'a', 'b', 'c', 'a']

if 'a' not in each: #Output False

Iteration 2:

#        V                                - Current position of loop
# ['a', 'b', 'c', 'a', 'b', 'c', 'a']

if 'a' not in each: #Output True
    lst.remove(each)  #Element from position 1 ('b') in list is removed

Iteration 3:

#             V                         |___ Supposed to be like this  
# ['a', 'b', 'c', 'a', 'b', 'c', 'a']   |

#             V                         |___ Updated list
# ['a', 'c', 'a', 'b', 'c', 'a']        |

if 'a' not in each: #Output False

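That's why your 'c' is skipped: after the removal it shifts into position 1, which the loop has already passed, so it is never checked and stays in the output list.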
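The walkthrough above can be reproduced with an explicit index. This is only a sketch of the behaviour, not how the interpreter is implemented; `trace_removal` is a hypothetical helper written for this illustration.

```python
def trace_removal(items):
    """Mimic `for each in items: items.remove(each)` using an explicit index."""
    items = list(items)  # work on a copy so the caller's list is untouched
    visited = []         # (index, element) pairs the loop actually looked at
    index = 0
    while index < len(items):
        each = items[index]
        visited.append((index, each))
        if 'a' not in each:
            items.remove(each)  # everything to the right shifts left by one
        index += 1              # the position advances regardless of the removal
    return visited, items


visited, result = trace_removal(['a', 'b', 'c', 'a', 'b', 'c', 'a'])
print(visited)  # [(0, 'a'), (1, 'b'), (2, 'a'), (3, 'b'), (4, 'a')]
print(result)   # ['a', 'c', 'a', 'c', 'a']
```

Neither 'c' ever appears in `visited`, which matches the result seen in the question.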

Now, to solve your problem: instead of deleting every non-'a' element, it's a better approach to create a new list containing only the 'a' elements. (Trengot's Answer)

Edit:

Since your my_list is a collection of single characters, it is better to use if 'a' != element, because 'a' not in element scans through each letter of element, so it will also keep every element that merely contains the letter 'a' (Check this to understand how in works in Python).

For example, if your my_list = ['a','abc','fd','b','c'], 'a' not in 'abc' will return False, so the element 'abc' won't be removed.
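A small sketch of the difference between the two tests, using the example list from this answer (the variable names are made up for the illustration):

```python
sample = ['a', 'abc', 'fd', 'b', 'c']

# Substring test: keeps anything that contains the letter 'a'.
kept_by_membership = [element for element in sample if 'a' in element]

# Equality test: keeps only elements that are exactly 'a'.
kept_by_equality = [element for element in sample if element == 'a']

print(kept_by_membership)  # ['a', 'abc']
print(kept_by_equality)    # ['a']
```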

Ashwani Agarwal

Filter the list into a new one selecting the elements you do want rather than removing the ones you don't. Then either use the new one or assign it to the old.

my_list = [element for element in my_list if 'a' in element]

As Peter Wood pointed out, this will assign a new object to my_list. If you want to keep the same list object (e.g. if it's referenced elsewhere as well), assign the new list to my_list[:].

my_list[:] = [element for element in my_list if 'a' in element]
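To see the difference between the two assignments, here is a minimal sketch with a second reference to each list. The names `original`, `alias`, `rebound` and `other` are made up for the illustration.

```python
original = ['a', 'b', 'c', 'a', 'b', 'c', 'a']
alias = original  # second name for the same list object

# Slice assignment replaces the contents of the existing object.
original[:] = [element for element in original if 'a' in element]
print(alias)              # ['a', 'a', 'a']
print(alias is original)  # True

rebound = ['a', 'b', 'c']
other = rebound

# Plain assignment binds the name to a brand-new list; `other` still sees the old one.
rebound = [element for element in rebound if 'a' in element]
print(other)             # ['a', 'b', 'c']
print(other is rebound)  # False
```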
Holloway

Since you want to modify (shrink) the existing list in-place, here's something that does so:

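def remove_all_on_predicate(predicate, list_):
    # Collect the matching elements first, so list_ is never mutated while being iterated.
    deserving_removal = [elem for elem in list_ if predicate(elem)]
    for elem in deserving_removal:
        list_.remove(elem)  # removes the first element equal to elem
    return None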

>>> remove_all_on_predicate(lambda x: "a" not in x, my_list)
>>> my_list
['a', 'a', 'a']
Cong Ma

As you have discovered, attempting to remove elements from a list that you are iterating over may not do what you expect. Ashwani Agarwal's answer illustrates why it fails, and the other answers show various techniques that can be used to perform the removals correctly. Another technique that can be useful when you have a very large list that you can't afford to copy is to iterate over it in reverse:

my_list = ['a', 'b', 'c', 'a', 'b', 'c', 'a']
for element in reversed(my_list):
    if 'a' not in element:
        my_list.remove(element)
        print(element, my_list)

print('Final:', my_list)

my_list = ['a', 'b', 'c', 'a', 'b', 'c', 'a']
for element in reversed(my_list):
    if 'a' in element:
        my_list.remove(element)
        print(my_list)

print('Final:', my_list)

output

c ['a', 'b', 'a', 'b', 'c', 'a']
c ['a', 'b', 'a', 'b', 'a']
b ['a', 'a', 'b', 'a']
b ['a', 'a', 'a']
Final: ['a', 'a', 'a']
['b', 'c', 'a', 'b', 'c', 'a']
['b', 'c', 'b', 'c', 'a']
['b', 'c', 'b', 'c']
Final: ['b', 'c', 'b', 'c']

This code uses the reversed() function, which returns an iterator over the iterable you pass to it; it doesn't copy the iterable.

I should mention that this technique is less efficient than the filtering approaches given in other answers. That's because each call of my_list.remove(element) has to scan through my_list until it finds a matching element, so it has complexity O(n**2) where n is the number of elements in the list; the filtering algorithms have a complexity of O(n). So as I said earlier, this approach is only useful when the list is so large that you can't afford the RAM to create a new list.
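If copying is ruled out but the quadratic cost matters, one alternative that none of the answers here use is a two-index compaction: kept elements are moved towards the front and the leftover tail is deleted once at the end. This is only a sketch; `keep_in_place` is a hypothetical helper, not a standard library function.

```python
def keep_in_place(predicate, items):
    """Keep only the elements for which predicate(element) is true, in place."""
    write = 0  # next slot to fill with a kept element
    for read in range(len(items)):
        element = items[read]
        if predicate(element):
            items[write] = element
            write += 1
    del items[write:]  # drop the leftover tail in a single step


my_list = ['a', 'b', 'c', 'a', 'b', 'c', 'a']
keep_in_place(lambda x: 'a' in x, my_list)
print(my_list)  # ['a', 'a', 'a']
```

Each element is read once and written at most once, so the pass is linear in the length of the list and needs no second list.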

Another thing I need to mention about the code in your question: you are using a list comprehension to loop over a list when you should be using a plain for loop. list.remove() returns None, so your code is needlessly creating a list full of Nones and then throwing that list away. The general rule is: don't use a list comprehension purely for the side effects of a function you call in it.
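A quick sketch of what that discarded list looks like. To keep the example free of the skipping problem, it iterates over a copy made with `list(demo)`; the names `demo` and `throwaway` are made up for the illustration.

```python
demo = ['a', 'b', 'c']

# The comprehension collects the return value of every remove() call.
throwaway = [demo.remove(element) for element in list(demo) if 'a' not in element]

print(throwaway)  # [None, None]
print(demo)       # ['a']
```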

PM 2Ring
  • Just awesome :) It would be even nicer if you showed an iteration flow like Agarwal did. – The6thSense Sep 03 '15 at 09:18
  • @VigneshKalai: Thanks! But I think my `print()` calls in the loops show the flow adequately. :) – PM 2Ring Sep 03 '15 at 09:20
  • It was just that I got confused at the beginning about how your method works. Seeing your print output, I found that `remove` removes the first occurrence of the element. I just thought it would be easy to understand for a new Python programmer. Last but not least, beautiful answer. – The6thSense Sep 03 '15 at 09:25

I'd use filter

my_list = list(filter(lambda x: 'a' in x, my_list))  # list() is needed on Python 3, where filter() returns an iterator
Arthur.V