
I am trying to write the output of a user-defined function to a new column in a pandas DataFrame and export it to Excel. However, when I open the Excel file, the derived column contains only blank values.

An example and the code used are given below.

DataFrame name: data

Text1: I am the very model of a modern Major-General

Text2: I am the very model of a cartoon individual

import pandas as pd
import difflib
from difflib import SequenceMatcher

original = data['Text1'].values.tolist()
edited = data['Text2'].values.tolist()

df = pd.DataFrame({
    'text1': original,
    'text2': edited,
})

def compare_row(row):
    text1, text2 = row

    a = text1.split()
    b = text2.split()

    sm = SequenceMatcher(None, a, b)

    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        print('{:7}  a[{}:{}]   -->  b[{}:{}] {!r:>9}  --> {!r}'.format(
            tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))

df['Change'] = df.apply(compare_row, axis=1)

Output received when using the print command:

equal a[0:7] --> b[0:7] ['I', 'am', 'the', 'very', 'model', 'of', 'a'] --> ['I', 'am', 'the', 'very', 'model', 'of', 'a']
replace a[7:9] --> b[7:9] ['modern', 'Major-General'] --> ['cartoon', 'individual']
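A likely cause of the blank column, offered as a sketch rather than a confirmed diagnosis: compare_row only prints its result and has no return statement, so it returns None for every row, and df.apply stores that None in 'Change'. The version below collects the same opcode lines in a list and returns them as one string. The sample DataFrame is built inline from the two sentences above, and the output file name in the final comment is a hypothetical placeholder.

```python
from difflib import SequenceMatcher

import pandas as pd


def compare_row(row):
    """Return the word-level diff of one row as text instead of printing it."""
    a = row["text1"].split()
    b = row["text2"].split()
    sm = SequenceMatcher(None, a, b)
    lines = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        lines.append(
            "{:7} a[{}:{}] --> b[{}:{}] {!r} --> {!r}".format(
                tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]
            )
        )
    # Returning a value is what fills the new column; print alone yields None.
    return "\n".join(lines)


# Sample data taken from the question; the real 'data' frame is assumed similar.
df = pd.DataFrame({
    "text1": ["I am the very model of a modern Major-General"],
    "text2": ["I am the very model of a cartoon individual"],
})

df["Change"] = df.apply(compare_row, axis=1)

# Hypothetical file name; writing .xlsx needs an Excel engine such as openpyxl.
# df.to_excel("changes.xlsx", index=False)
```

With this change each cell in 'Change' holds the diff text for its row, one opcode per line, so the exported column should no longer be empty.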


Erfan
