-1

I want to check whether elements of list1 are in list2. If so, I want to add up the elements.

It works without an external .csv file:

list1 = [1,2,3,4,5]
list2 = [1,2,3]

c = sum(el in list1 for el in list2)
print(c)

But when I import the lists from .csv files, the result is incorrect.

Expected answer: 6

import pandas as pd 
list1 = pd.read_csv(r"list1.csv")
list2 = pd.read_csv(r"list2.csv")

cs = sum(el in list1 for el in list2)
print(cs)


I think it has something to do with the way I import the files? Or do I need to convert the objects? Your help is welcome :)

  • "the expected answer is incorrect" - so what answer do you get instead of the correct one? – ForceBru Jan 15 '21 at 17:02
  • Also, you say you "want to check if elements of list1 are in list2", but `el in list1` checks if elements of `list2` are in `list1`. – ForceBru Jan 15 '21 at 17:04

3 Answers

1

When you read a CSV with pandas, you get a DataFrame, whereas you want to work with a Series. Even though your DataFrames have only one column, you still need to specify the column (assumed here to be named 'A'):

c = list1['A'].isin(list2['A']).sum()

In your case, you can print

print([el for el in list2])

to see what you are looping over.
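
To make that concrete, here is a minimal sketch of what the loop sees. The CSV contents and the column header `A` are assumptions (the real files are not shown), and `io.StringIO` stands in for the files on disk. It also shows the difference between counting the matches and adding up the matching values.

```python
import io
import pandas as pd

# Hypothetical CSV contents; the header "A" and the values are assumptions.
csv1 = "A\n1\n2\n3\n4\n5\n"
csv2 = "A\n1\n2\n3\n"

list1 = pd.read_csv(io.StringIO(csv1))
list2 = pd.read_csv(io.StringIO(csv2))

# Iterating over a DataFrame yields its column labels, not its values.
looped = [el for el in list2]

# `in` on a DataFrame also tests column labels, so the original
# expression only counts how many labels of list2 exist in list1.
label_matches = sum(el in list1 for el in list2)

# Comparing the actual values requires selecting the column first.
mask = list2["A"].isin(list1["A"])
match_count = int(mask.sum())                   # number of matching values
match_total = int(list2.loc[mask, "A"].sum())   # sum of the matching values

print(looped, label_matches, match_count, match_total)
```

With this sample data the loop runs over the single label `'A'`, which is why the original expression cannot return the count of matching values.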

Quang Hoang
1

Your list1 is not really a list but a DataFrame.

The `in` operator does not work on a DataFrame (or a NumPy array) the way it works on a list: on a DataFrame it tests the column labels, not the values.

So if you want to stick with your working list code, you should convert the relevant column to a list with .tolist(), e.g. list1['A'].tolist() if the column is named 'A'.

Alternatively, use isin as suggested in the other answer.
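
A small sketch of the difference, assuming a single column named `A`. The DataFrames below are hypothetical stand-ins for what `pd.read_csv` would return.

```python
import pandas as pd

# Hypothetical stand-ins for the DataFrames read from the CSV files.
df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"A": [1, 2, 3]})

# `in` on a DataFrame checks column labels.
label_in_frame = "A" in df1
value_in_frame = 5 in df1

# `in` on a Series checks the index (0..4 here), not the values.
value_in_series = 5 in df1["A"]

# Plain Python lists behave as in the original working code.
values1 = df1["A"].tolist()
values2 = df2["A"].tolist()
value_in_list = 5 in values1
count = sum(el in values1 for el in values2)

print(label_in_frame, value_in_frame, value_in_series, value_in_list, count)
```

Note that selecting the column alone is not enough for `in`, because a Series tests its index. Converting to a list is what restores the list semantics.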

Lior Cohen
0

You only have to convert list1 and list2 into NumPy arrays, and it works just fine. Try:

import pandas as pd 
import numpy as np
list1 = np.array(pd.read_csv(r"list1.csv"))
list2 = np.array(pd.read_csv(r"list2.csv"))

cs = sum(el in list1 for el in list2)
print(cs)
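
For reference, a sketch of what this conversion produces, using assumed single-column data in place of the real files. The arrays come out two-dimensional, so each `el` in the loop is a one-element row. Flattening them and using `np.isin` is one alternative that avoids the Python-level loop.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical CSV contents; the header "A" and the values are assumptions.
csv1 = "A\n1\n2\n3\n4\n5\n"
csv2 = "A\n1\n2\n3\n"

arr1 = np.array(pd.read_csv(io.StringIO(csv1)))  # 2-D array, one column
arr2 = np.array(pd.read_csv(io.StringIO(csv2)))

# Each el is a one-element row; `el in arr1` is True if it equals any row.
loop_count = sum(el in arr1 for el in arr2)

# Vectorized alternative on the flattened values.
flat_count = int(np.isin(arr2.ravel(), arr1.ravel()).sum())

print(arr1.shape, loop_count, flat_count)
```

This relies on each file having exactly one column. With several columns, `el in arr1` compares whole rows element-wise and can report matches that are not intended.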
Dharman
Janska