1

I have two lists in Python, like these:

temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']

I need to create a third list from the first list in which each element appears as many times as it does in temp1 minus the number of times it appears in temp2. For the lists above, I need to get:

temp3 = ['A','A','C']

What is the best way of doing that? Using sets does not work as expected, so I would like to know whether there is a fast way to do it with standard Python functions, or whether I have to write my own function.

Hayra

4 Answers

4
temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']
# create a copy of your first list
temp3 = list(temp1)
# remove one occurrence from the copy for each item of the second list
for e in temp2:
    temp3.remove(e)

Output:

['A', 'A', 'C']
Alexander Kosik
  • Thanks, it works, but this is a brute-force solution and it means I would have to write my own method. What I was asking is whether one of the "standard" Python functions can do this. In any case, thanks for your kind recommendation. – Hayra Jun 21 '20 at 11:23
  • I don't think there is a built-in function that does this for you. – Alexander Kosik Jun 21 '20 at 11:34
  • For long lists this will be very time consuming. – kabanus Jun 21 '20 at 11:57
4

If the lists are guaranteed to be sorted, you can do much better in terms of time complexity than calling list.remove or counting on every iteration, by walking both lists in a single pass:

temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']

filtered = []
j = 0
for i, letter in enumerate(temp1):
    while j < len(temp2) and temp2[j] < letter:
        j += 1
    if j == len(temp2):
        break

    if temp2[j] > letter:
        filtered.append(letter)
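    else:
        # equal: cancel this letter against one occurrence in temp2
        j += 1
else:
    # the loop ended without break, so every item was already handled
    i = len(temp1)

# after a break, temp2 is exhausted and everything from index i on is kept
filtered.extend(temp1[i:])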

Another solution

A more interesting solution I thought of:

from collections import Counter
result = []
for letter, count in (Counter(temp1)-Counter(temp2)).items():
    result.extend([letter]*count)

This has the same big-O complexity as the above: linear in the combined length of the two lists.
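
The Counter loop above can probably be shortened with Counter.elements(), which repeats each key as many times as its count. This is only a sketch of that variant; it assumes Python 3.7+ if you want the keys to come out in the order they first appear in temp1.

```python
from collections import Counter

temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']

# Counter subtraction keeps only keys whose resulting count is positive;
# elements() then yields each remaining key `count` times.
result = list((Counter(temp1) - Counter(temp2)).elements())
print(result)
```

Like the loop version, this groups equal items together, so it does not preserve the original positions of items in an unsorted temp1.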

If lists are not sorted

If order is not important, these solutions are still much faster: sorting the lists costs O(n log n), which is cheaper than the O(n^2) solutions, and the second one doesn't even need sorting. If order is important, this still works; you just need to retain a mapping of element->index (which your temp1 already is) before sorting, though this might be out of scope for this question.
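
One way the "retain the index before sorting" idea might look is sketched below. The function name and the choice to cancel the earliest occurrences first (which matches what list.remove would do) are assumptions for illustration, and the items must be mutually orderable.

```python
def difference_via_sort(items, to_remove):
    """Hypothetical helper: sort-based multiset difference that restores
    the original order of `items` afterwards."""
    # Pair each value with its original index so the order can be restored.
    indexed = sorted((value, index) for index, value in enumerate(items))
    removal = sorted(to_remove)

    kept = []
    j = 0
    for value, index in indexed:
        # Skip removal entries that sort before the current value.
        while j < len(removal) and removal[j] < value:
            j += 1
        if j < len(removal) and removal[j] == value:
            j += 1  # cancel this occurrence against one removal entry
        else:
            kept.append((index, value))

    kept.sort()  # back to original positions
    return [value for _, value in kept]


unsorted1 = ['C', 'A', 'B', 'A', 'C', 'A', 'C']
unsorted2 = ['C', 'B', 'A', 'C']
print(difference_via_sort(unsorted1, unsorted2))
```

The two sorts dominate the cost, so this runs in O(n log n) rather than O(n^2).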

kabanus
  • Definitely worth using a counter, because this is O(n) -- assuming dictionary access is O(1). There's a straightforward O(n) solution based on counter, which preserves the ordering in list1, which I've added as a separate answer. – alani Jun 21 '20 at 15:21
3
from collections import Counter

temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']

result = []

counts = Counter(temp2)

for item in temp1:
    if item in counts and counts[item]:
        counts[item] -= 1
    else:
        result.append(item)

print(result)

Output:

['A', 'A', 'C']

Scales as O(n) and does not rely on sorted input.

This answer relies on the fact that Counter is just a subclass of dict, so we can use the instance as a mutable object in which to store the number of occurrences in temp2 that we still need to exclude from the result during the iteration over temp1. The documentation states explicitly that "Counter is a dict subclass" and that "Counter objects have a dictionary interface", which is a pretty good guarantee that item assignment will be supported, and that it is not necessary to treat it as a read-only object that must first be copied into a plain dict.
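
To illustrate the dictionary behaviour described above, here is a small sketch. The helper name subtract_occurrences is made up for this example. It also leans on the fact that a Counter returns 0 for a missing key instead of raising KeyError, so the explicit membership test is optional.

```python
from collections import Counter


def subtract_occurrences(items, to_remove):
    """Hypothetical helper: drop one item of `items` for each equal item
    in `to_remove`, keeping the order of `items`."""
    remaining = Counter(to_remove)
    kept = []
    for item in items:
        if remaining[item] > 0:  # missing keys read as 0, no KeyError
            remaining[item] -= 1
        else:
            kept.append(item)
    return kept


counts = Counter(['A', 'B', 'C', 'C'])
counts['C'] -= 1            # item assignment works as on a plain dict
print(counts['C'])          # 1
print(counts['Z'])          # 0
print('Z' in counts)        # False: reading a missing key does not insert it

print(subtract_occurrences(['A', 'A', 'A', 'B', 'C', 'C', 'C'],
                           ['A', 'B', 'C', 'C']))  # ['A', 'A', 'C']
```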

alani
2

You can try

temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']
temp3 = []

for i in temp1:
    if temp1.count(i) - temp2.count(i) > temp3.count(i):
        temp3.append(i)
print(temp3)

For each item, this code checks whether temp3 already contains all of the surplus occurrences of that item (its count in temp1 minus its count in temp2); if not, it appends the item to temp3.

Output

['A', 'A', 'C']
Leo Arad
  • This is pretty cool, though even less efficient than the `remove` solution. Gave me an idea though. – kabanus Jun 21 '20 at 11:39