
I have multiple dataframes (one for each city), each with a 'Name' column that holds the names of organizations in that city.

How can I visualize the names that are common to each pair of cities, and the names that are common to all cities, in a way that is easy to understand?

Example:

  df1            df2

  Name           Name       
'Apollo'        'Kims'
'MedWorks'      'AIMs'
'Cradle'        'Apollo'
'Kims'          'Bronte Co'
'Collins'       'Cradle'

There are more than 10 names in common for each city. I am not sure whether Venn diagrams work with string values, and even if they do, they will not fit all the data in a readable format.

I tried this as suggested, but I am getting:

TypeError: unsupported operand type(s) for -: 'str' and 'str'
k92
  • Possible duplicate of [plot actual set items in python, not the number of items](https://stackoverflow.com/questions/55717203/plot-actual-set-items-in-python-not-the-number-of-items) – Chris Jun 28 '19 at 04:36
  • I don't think this approach works for string values. – k92 Jun 28 '19 at 05:25
  • I am getting: `TypeError: unsupported operand type(s) for -: 'str' and 'str'` – k92 Jun 28 '19 at 05:32

1 Answer


Use matplotlib_venn:

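import pandas as pd
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# df1 and df2 are the per-city dataframes from the question.
# Convert each 'Name' column to a set: the - and & operators used below
# are set operations and raise a TypeError when applied to plain strings.
set1 = set(df1['Name'])
set2 = set(df2['Name'])

venn = venn2([set1, set2])
# Replace the default counts with the names themselves.
# Region ids: '100' = only in set1, '110' = in both, '010' = only in set2.
venn.get_label_by_id('100').set_text('\n'.join(map(str, set1 - set2)))
venn.get_label_by_id('110').set_text('\n'.join(map(str, set1 & set2)))
venn.get_label_by_id('010').set_text('\n'.join(map(str, set2 - set1)))
# The get_label_by_id approach is taken from https://stackoverflow.com/questions/55717203/plot-actual-set-items-in-python-not-the-number-of-items
plt.show()  # display the figure when running as a script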

Output:

[image: the resulting Venn diagram]
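
The code above handles two cities. The question also asks about every pair of cities and about names common to all of them. One possible way to get there, which is not part of the original answer, is to compute the overlaps with plain set operations and print them as text, which also sidesteps the concern about fitting many names into a diagram. This is only a minimal sketch: the city keys, the contents of the third city, and the helper names `pairwise_common` and `common_to_all` are made up for illustration. With real data, each value would be something like `set(df['Name'])`.

```python
from itertools import combinations

# Hypothetical input: one set of names per city. The first two sets reuse
# the example from the question; the third city is invented.
city_names = {
    "city1": {"Apollo", "MedWorks", "Cradle", "Kims", "Collins"},
    "city2": {"Kims", "AIMs", "Apollo", "Bronte Co", "Cradle"},
    "city3": {"Apollo", "Kims", "Collins", "AIMs"},
}


def pairwise_common(names_by_city):
    """Return {(city_a, city_b): sorted names present in both cities}."""
    return {
        (a, b): sorted(names_by_city[a] & names_by_city[b])
        for a, b in combinations(sorted(names_by_city), 2)
    }


def common_to_all(names_by_city):
    """Return the sorted names present in every city (needs >= 1 city)."""
    return sorted(set.intersection(*names_by_city.values()))


pairs = pairwise_common(city_names)
for (a, b), shared in pairs.items():
    print(f"{a} & {b}: {', '.join(shared)}")
print("all cities:", ", ".join(common_to_all(city_names)))
```

Each pairwise result could also be passed to `venn2` as shown above if a diagram per pair is preferred over a text listing.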

Pyd
Chris