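My question is about pandas. I have two Series of strings, and for each row I want the characters that the two strings have in common (NaN when there are none). To simplify, I've put the two Series and the expected result (the intersection column) into one DataFrame: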

import pandas as pd
import numpy as np
my_df = pd.DataFrame([['ab', 'bz', 'b'], ['cd', 'ct', 'c'], ['ef', 'ka', np.nan]], columns=['sr_1', 'sr_2', 'intersection'])

Any ideas for this?

Sang-il Ahn

1 Answer

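You can take the set intersection of the characters in each pair of strings, then keep one shared character (or NaN if there is none):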

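import pandas as pd
import numpy as np

df1 = pd.DataFrame({'sr1': ['ab', 'cd', 'ef'],
                    'sr2': ['bz', 'ct', 'ka']})

# Row by row, intersect the character sets of the two strings
df1['intersection'] = df1.apply(lambda x: set(x['sr1']) & set(x['sr2']), axis=1)

# Keep one shared character, or NaN when the set is empty
# (if several characters are shared, which one you get is arbitrary)
df1['intersection'] = df1['intersection'].apply(lambda x: list(x)[0] if len(x) > 0 else np.nan)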

The output:

  sr1 sr2 intersection
0  ab  bz            b
1  cd  ct            c
2  ef  ka          NaN
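If a pair of strings can share more than one character, taking the first element of a set gives an arbitrary one. As a possible variation (a sketch, not part of the answer above), the snippet below keeps every shared character, joined in sorted order, and walks the two columns with zip instead of a row-wise apply. The helper name common_chars is made up for illustration, and the code assumes both columns contain only strings (no missing values).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"sr1": ["ab", "cd", "ef"],
                   "sr2": ["bz", "ct", "ka"]})


def common_chars(a, b):
    """Return the characters shared by a and b as a sorted string, or NaN if none.

    Hypothetical helper for illustration; assumes a and b are both str.
    """
    shared = set(a) & set(b)
    return "".join(sorted(shared)) if shared else np.nan


# zip iterates over the two columns in parallel, one pair of strings per row
df["intersection"] = [common_chars(a, b) for a, b in zip(df["sr1"], df["sr2"])]
print(df)
```

For the sample data this produces the same column as the apply-based version; the two only differ when a row has several characters in common.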

quest