I'm trying to convert the dataset below into the right format so that I can plot it as a chord diagram.

    a   b   c   d   e   f   g   h
0   1   0   0   0   0   1   0   0
1   1   0   0   0   0   0   0   0
2   1   0   1   1   1   1   1   1
3   1   0   1   1   0   1   1   1
4   1   0   0   0   0   0   0   0
5   0   1   0   0   1   1   1   1
6   1   1   0   0   1   1   1   1
7   1   1   1   1   1   1   1   1
8   1   1   0   0   1   1   0   0
9   1   1   1   0   1   0   1   0
10  1   1   1   0   1   1   0   0
11  1   0   0   0   0   1   0   0
12  1   1   1   1   1   1   1   1
13  1   1   1   1   1   1   1   1
14  0   1   1   1   1   1   1   0

The result would be a chord diagram showing all the possible combinations of the variables, with the width of each stream equal to the number of rows in which that combination occurs. For example, the count for a + b is 7 in the dataset above (the rows where both are 1).

Dan

1 Answer

I don't know which chord diagram library would be best, but I may be able to help a little.

First, we define the data as a pandas DataFrame:

import pandas as pd

data = [
    [1,   0,   0,   0,   0,   1,   0,   0],
    [1,   0,   0,   0,   0,   0,   0,   0],
    [1,   0,   1,   1,   1,   1,   1,   1],
    [1,   0,   1,   1,   0,   1,   1,   1],
    [1,   0,   0,   0,   0,   0,   0,   0],
    [0,   1,   0,   0,   1,   1,   1,   1],
    [1,   1,   0,   0,   1,   1,   1,   1],
    [1,   1,   1,   1,   1,   1,   1,   1],
    [1,   1,   0,   0,   1,   1,   0,   0],
    [1,   1,   1,   0,   1,   0,   1,   0],
    [1,   1,   1,   0,   1,   1,   0,   0],
    [1,   0,   0,   0,   0,   1,   0,   0],
    [1,   1,   1,   1,   1,   1,   1,   1],
    [1,   1,   1,   1,   1,   1,   1,   1],
    [0,   1,   1,   1,   1,   1,   1,   0]]

dataframe = pd.DataFrame(data, columns = ['a','b','c','d','e','f','g','h'])

Now we implement the counting function:

def relationship (columnsList, dataframe):
    # Count the rows in which every column in columnsList equals 1.
    # Comparing the columns only to each other would also count rows
    # where they are all 0 (for example, c and d are both 0 in 7 rows).
    allOnes = (dataframe[columnsList] == 1).all(axis=1)
    return int(allOnes.sum())

Some tests:

>>> relationship (['a','b','d'], dataframe) # a+b+d
3
>>> relationship (['a','b','h'], dataframe) # a+b+h
4
>>> relationship (['a','b'], dataframe) # a+b
7
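
The same count can be applied to every combination of two or more columns, not only the ones listed by hand. Below is a minimal sketch of that idea using itertools.combinations. The names count_all_ones and counts are hypothetical and chosen only for illustration, and the data is the 15-row dataset from the question.

```python
from itertools import combinations

import pandas as pd

# The 15 rows from the question, in their original order.
data = [
    [1, 0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 0],
]
df = pd.DataFrame(data, columns=list("abcdefgh"))


def count_all_ones(columns, frame):
    """Count the rows in which every listed column equals 1."""
    return int((frame[list(columns)] == 1).all(axis=1).sum())


# Map every combination of 2 or more columns to its count.
counts = {
    combo: count_all_ones(combo, df)
    for size in range(2, len(df.columns) + 1)
    for combo in combinations(df.columns, size)
}

print(counts[("a", "b")])       # a+b
print(counts[("a", "b", "d")])  # a+b+d
print(len(counts))              # number of combinations evaluated
```

With 8 columns this evaluates 247 combinations, which is cheap here, but the number doubles with every added column.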

The diagram is up to you, I hope you can find this helpful!
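
As a possible starting point for the diagram, some chord layouts (d3's chord layout, for instance) take a square matrix of pairwise values. The sketch below builds such a matrix for the question's data by multiplying the transposed 0/1 table by itself. The name pair_counts is hypothetical, and whether your plotting library accepts this shape is an assumption to check against its documentation. Note that a matrix like this only covers pairs, so combinations of three or more columns would need a different representation.

```python
import pandas as pd

# The 15 rows from the question, in their original order.
data = [
    [1, 0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 0],
]
df = pd.DataFrame(data, columns=list("abcdefgh"))

# For 0/1 data, the product of the transposed table with the table counts,
# for each pair of columns, the rows in which both columns are 1.
pair_counts = df.T.dot(df)

# The diagonal holds each column's own total; set it to 0 so that
# no variable is linked to itself.
for name in pair_counts.columns:
    pair_counts.loc[name, name] = 0

print(pair_counts)
```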

sudohumberto

Thank you! This works, although it just shows pairwise combinations, whereas I need to find all the combinations, e.g. a+b, a+b+c, a+d+e, and so on. – Dan, Jul 11 '19 at 10:26