0

How do I merge the unique values in columns A and B from df into list?

id  A   B
1   x   k
2   y   x
3   x   l
4   x   k

This is what I am trying to achieve:

list = ['x', 'y', 'k', 'l']

Failed attempt:

list = df['A'].unique().tolist()
list.append(df['B'].unique())

The output:

['x', 'y', ['x', 'k', 'l']]
chuachu
  • 23
  • 5

2 Answers2

3

Use np.ravel and np.unique:

# Global unique
>>> np.unique(np.ravel(df[['A', 'B']])).tolist()
['k', 'l', 'x', 'y']

# Per column unique
>>> df['A'].unique().tolist() + df['B'].unique().tolist()
['x', 'y', 'k', 'x', 'l']
Corralien
  • 109,409
  • 8
  • 28
  • 52
2

Use pandas.unique with numpy.ravel for unique values in original ordering:

L = pd.unique(np.ravel(df[['A','B']], order='F')).tolist()

Another solutions with DataFrame.unstack or DataFrame.melt and Series.unique

L = df[['A','B']].unstack().unique().tolist()
L = df[['A','B']].melt()['value'].drop_duplicates().tolist()
L = df[['A','B']].melt()['value'].unique().tolist()
L = df[['A','B']].melt()['value'].drop_duplicates().tolist()
print (L)
['x', 'y', 'k', 'l']
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Thank you but the output gives the unique values in each column. Both columns have the save values and I do not want them to repeat in the list. The output is: ['x', 'y', 'x', 'k', 'l'] – chuachu Mar 06 '23 at 08:12
  • @maerynore - Added 3 solution with ordering of original data. – jezrael Mar 06 '23 at 08:20