This may be a rudimentary problem but I am new to pandas.
I have a DataFrame loaded from a CSV file, and I want to iterate over each row and use a regex to extract all the string information in a specific column. (I am using regex because I eventually want to build a separate DataFrame from that column.)
I tried iterating with a for loop but got a ton of errors. So far, it looks like the for loop reads each row as a list or Series rather than a string (correct me if I'm wrong). The main functions I am using are iteritems() and findall(), but I have had no good results so far. How can I approach this problem?
My dataframe looks like this:
import re
import pandas as pd
df = pd.read_csv('foobar.csv')
df[['column1', 'column2', 'TEXT']]
My approach looks like this:
for Individual_row in df['TEXT'].iteritems():
    parsed = re.findall(r'(.*?)\:\s*?\[(.*?)\]', Individual_row)
    res = {g[0].strip() : g[1].strip() for g in parsed}
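For reference, a minimal sketch of one possible approach, written against made-up sample data rather than the real foobar.csv. It assumes each TEXT cell is a plain string shaped like `key: [value], key: [value]` (an assumption inferred from the regex above), and the helper name `parse_cell` is hypothetical. It relies on `Series.items()` yielding `(index, value)` pairs, so each pair is unpacked before the string reaches `re.findall`.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for foobar.csv.
# The "key: [value]" layout of TEXT is an assumption, not the real file.
df = pd.DataFrame({
    "column1": [1, 2],
    "TEXT": ["name: [Alice], city: [Paris]", "name: [Bob], city: [Rome]"],
})

# Same idea as the pattern in the question: a key, a colon, then a bracketed value.
PATTERN = re.compile(r"(.*?):\s*\[(.*?)\]")


def parse_cell(text):
    """Return a dict of key -> value pairs found in one TEXT cell."""
    pairs = PATTERN.findall(str(text))
    # Strip spaces and the separating comma that the lazy key group picks up.
    return {key.strip(" ,"): value.strip() for key, value in pairs}


rows = []
# items() yields (index, value) tuples, so unpack them in the loop header.
for index, text in df["TEXT"].items():
    rows.append(parse_cell(text))

# One row per original row, one column per extracted key.
parsed_df = pd.DataFrame(rows, index=df.index)
print(parsed_df)
```

If the real cells use a different layout, only the pattern and the characters passed to `strip` should need to change.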
Many thanks in advance