
I'm a beginner learning Python, and I'm manipulating CSV data with pandas. I'm working with two CSV files: Extract.csv is the working file and Masterlist.csv is the dictionary. The keywords I need to match are strings that appear in the Description column of Extract.csv. Masterlist.csv has a column of keywords, and for each match I have to pull the corresponding values into columns named "Accounts", "Contact Name" and "Notes" in the working file.

Here's the code I've tried:

file2 = open('Masterlist.csv','r')
data2 = pd.read_csv(file2)
df2 = pd.DataFrame(data2)
content=()
for rows in range(len(content)):
          if df2['Keywords'].isin(df['Description']):
              df['Accounts'] = df2['Accounts']
              df['Contact Name'] = df2['Vendor Name']
              df['Notes'] = df2['Notes']
              print()

and

file2= open('Masterlist.csv','r')
data2= pd.read_csv(file2, usecols= ['Keyterms','Accounts','Vendor Name'])
df2= pd.DataFrame(data2)
content=()
for rows in range(len(content)):
          if df[Description'].str.contains(content[df2['Keywords']]):
              df['Accounts'] = content[(df2['Accounts'])]
              df['Contact Name'] = content[(df2['Vendor Name'])]
              df['Notes'] = content[(df2['Notes'])]
              print()

Both snippets run without errors, but the new columns come out blank.

ATD
  • 11
  • 2
  • `file2= open('Masterlist.csv','r') data2 = pd.read_csv(file2) df2 = pd.DataFrame(data2)` FYI, this can be a single line. `df2 = pd.read_csv('Masterlist.csv')`. pd.read_csv returns a DataFrame anyway and can take the path of the csv file as the argument. – dogekali Apr 13 '23 at 14:49

1 Answer


There are a couple of things wrong with your code.

content=()
for rows in range(len(content)):

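This loop never runs. `content` is an empty tuple, so `len(content)` is 0 and `range(0)` yields nothing; the loop body is skipped entirely. That's why your code runs without errors but the columns are never filled. Even if the loop did run, using a whole Series as an `if` condition raises `ValueError: The truth value of a Series is ambiguous`.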
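You can check this yourself. A minimal sketch (the `iterations` counter is only there for illustration, it is not part of your code):

```python
content = ()  # empty tuple, as in the question

iterations = 0  # hypothetical counter, just to show the body never executes
for rows in range(len(content)):  # len(content) == 0, so this is range(0)
    iterations += 1

print(len(content), iterations)
```

Both numbers come out as 0, so nothing inside the loop ever touches `df`.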

What you want is something like this (not tested):

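import numpy as np
import pandas as pd

# Sample dataframes
df1 = pd.DataFrame({'Description': ['This John', 'This Perry', 'This Tom']})
df2 = pd.DataFrame({
    'Keyword': ['John', 'Perry', 'Tom'],
    'Accounts': [1, 5, 10],
    'Notes': ['John is cool', 'Perry is also cool', 'Tom isnt cool'],
})

for col in df2.columns[1:]:
    # Map each keyword to its value in this column
    lookup = dict(zip(df2['Keyword'], df2[col]))
    # Start empty: NaN for numeric columns, None (object dtype) for text.
    # (pd.np was removed in pandas 2.0, so use numpy directly.)
    df1[col] = np.nan if pd.api.types.is_numeric_dtype(df2[col]) else None
    for keyword, value in lookup.items():
        # regex=False treats the keyword as literal text, not a pattern
        mask = df1['Description'].str.contains(keyword, regex=False)
        df1.loc[mask, col] = value
df1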

df1 would then look something like:

|Description|Accounts|Notes|
|-|-|-|
|This John|1.0|John is cool|
|This Perry|5.0|Perry is also cool|
|This Tom|10.0|Tom isnt cool|
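If the master list gets long, a version without the inner Python loop might be worth trying. This is only a sketch under assumptions: the names `extract`, `master` and `result` and the extra "Nobody here" row are made up for illustration, and it assumes each description should match at most one keyword (if several appear, only the first one found in the text is used).

```python
import re

import pandas as pd

# Hypothetical sample data standing in for Extract.csv and Masterlist.csv
extract = pd.DataFrame(
    {'Description': ['This John', 'This Perry', 'This Tom', 'Nobody here']}
)
master = pd.DataFrame({
    'Keyword': ['John', 'Perry', 'Tom'],
    'Accounts': [1, 5, 10],
    'Notes': ['John is cool', 'Perry is also cool', 'Tom isnt cool'],
})

# One pattern that matches any keyword literally (re.escape guards
# against keywords containing regex metacharacters)
pattern = '(' + '|'.join(re.escape(k) for k in master['Keyword']) + ')'

# First keyword found in each description; NaN where nothing matches
extract['Keyword'] = extract['Description'].str.extract(pattern, expand=False)

# Left join keeps every row of the working file, matched or not
result = extract.merge(master, on='Keyword', how='left')
print(result)
```

Rows with no keyword keep NaN in the looked-up columns, which makes unmatched descriptions easy to spot.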

dogekali
  • 91
  • 6