I am trying to write the output of a function I defined to a new column in a pandas DataFrame and export it to Excel. However, when I open the Excel file, the derived column is blank.
An example and the code I used are given below.
DataFrame name: data
Text1: I am the very model of a modern Major-General
Text2: I am the very model of a cartoon individual
import pandas as pd
import difflib
from difflib import SequenceMatcher
original = data['Text1'].values.tolist()
edited = data['Text2'].values.tolist()
df = pd.DataFrame({
    'text1': original,
    'text2': edited,
})
def compare_row(row):
    text1, text2 = row
    a = text1.split()
    b = text2.split()
    sm = SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        print('{:7} a[{}:{}] --> b[{}:{}] {!r:>9} --> {!r}'.format(
            tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))
df['Change'] = df.apply(compare_row, axis=1)
Output printed to the console when the code runs:
equal   a[0:7] --> b[0:7] ['I', 'am', 'the', 'very', 'model', 'of', 'a'] --> ['I', 'am', 'the', 'very', 'model', 'of', 'a']
replace a[7:9] --> b[7:9] ['modern', 'Major-General'] --> ['cartoon', 'individual']
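
A function that only prints returns None, so `df.apply` fills the new column with None, which appears as an empty cell in Excel. Below is a minimal sketch of one possible fix, assuming the goal is to store the same opcode summary as text in the Change column. The function name `describe_changes`, the inline sample data, and the output file name are hypothetical, chosen only for illustration.

```python
import pandas as pd
from difflib import SequenceMatcher

# Sample data mirroring the example in the question (assumed single row)
df = pd.DataFrame({
    'text1': ['I am the very model of a modern Major-General'],
    'text2': ['I am the very model of a cartoon individual'],
})

def describe_changes(row):
    """Hypothetical helper: build the opcode summary and return it as a string."""
    a = row['text1'].split()
    b = row['text2'].split()
    sm = SequenceMatcher(None, a, b)
    lines = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        # Collect each line instead of printing it
        lines.append('{} a[{}:{}] --> b[{}:{}] {!r} --> {!r}'.format(
            tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))
    # The returned value is what ends up in the new column
    return '\n'.join(lines)

df['Change'] = df.apply(describe_changes, axis=1)

# Export step left commented out; 'output.xlsx' is a placeholder name and
# writing .xlsx files requires an Excel engine such as openpyxl.
# df.to_excel('output.xlsx', index=False)
```

Joining the lines with a newline keeps one cell per row. Returning a list, or splitting the tags into separate columns, would work just as well depending on how the result should look in Excel.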