
I want to find the frequency of pairs in a 2D array. A sample input is as follows:

list_of_items = [[12, 14, 18], [12, 19, 54, 89, 105], [14, 19], [54, 88, 105, 178]]

The expected output is as follows:

(12,14):1
(12,18):1
(12,19):1
(12,54):1
(12,88):0
.
.
.
(54,105):2
.
.

I have tried the following code, but I think it is not an optimal solution:

number_set = [12, 14, 18, 19, 54, 88, 89, 105, 178]

def get_frequency_of_pairs(list_of_items, number_set):
    x = 1
    combination_list = []
    result = {}
    # Build every pair (i, j) where i comes before j in number_set
    for i in number_set:
        for j in range(x, len(number_set)):
            combination_list = combination_list + [(i, number_set[j])]
        x = x + 1
    for t in combination_list:
        result[t] = 0
    # Count the sublists that contain both members of each pair
    for t in combination_list:
        for items in list_of_items:
            if set(t).issubset(items):
                result[t] = result[t] + 1
    return result
Eric O. Lebigot
Amit Jaiswal
  • Your code gives 0 for all the pairs… – Eric O. Lebigot Aug 25 '15 at 11:59
  • Is `number_set` always the set of numbers present in `list_of_items`? The best solution to the problem depends on the answer to this question. – Eric O. Lebigot Aug 25 '15 at 12:04
  • Yes @EOL number_set is union of all the numbers present in list_of_items – Amit Jaiswal Aug 25 '15 at 12:07
  • 1. Should both members of a pair be present in the same array? For example, is 19 from one sublist and 54 from another sublist a valid pair? 2. Also, how can pairs be formed from the number set? Is (14, 12) a valid pair? – Priyesh Aug 25 '15 at 12:09
  • @EOL I have corrected it. Now it will give the expected results. Thanks for pointing out the problem. – Amit Jaiswal Aug 25 '15 at 12:25
  • Your code also only produces counts for pairs *ordered* as in the original sublists (for example, (105, 54) does not appear there): I guess that this is what you want? – Eric O. Lebigot Aug 26 '15 at 00:34

1 Answer


You can use combinations from itertools and use a Counter from collections as follows:

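import collections
import itertools

counts = collections.Counter()
list_of_items = [[12,14,18], [12,19,54,89,105], [14,19], [54,88,105,178]]
for sublist in list_of_items:
    # Count every pair of elements that occurs within this sublist
    counts.update(itertools.combinations(sublist, 2))

print(counts)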
Counter({(54, 105): 2, (88, 105): 1, (54, 89): 1, (19, 105): 1, (12, 14): 1, (14, 19): 1, (14, 18): 1, (12, 89): 1, (12, 19): 1, (89, 105): 1, (12, 18): 1, (19, 89): 1, (19, 54): 1, (105, 178): 1, (88, 178): 1, (54, 178): 1, (12, 105): 1, (12, 54): 1, (54, 88): 1})

Each pair has to be enumerated in order to be counted, and this method enumerates each pair of each sublist exactly once, so it should have the best possible time complexity.
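
If you also need the pairs that never occur (such as (12,88):0 in the expected output), one possible way is to look up every pair of `number_set` in the counter, since a `Counter` returns 0 for missing keys. The sketch below is only an illustration: the helper name `pair_frequencies` and the variable `full_result` are made up here, `number_set` is rebuilt as the union of all sublists (as stated in the comments on the question), and each sublist is sorted and de-duplicated first, which is an assumption chosen so that pairs are treated as unordered, as with `set(t).issubset(items)` in the question.

```python
import collections
import itertools

list_of_items = [[12, 14, 18], [12, 19, 54, 89, 105], [14, 19], [54, 88, 105, 178]]

def pair_frequencies(list_of_items):
    """Count how many sublists contain each unordered pair (hypothetical helper)."""
    counts = collections.Counter()
    for sublist in list_of_items:
        # Assumption: sorting and de-duplicating gives every pair a canonical
        # (smaller, larger) form and counts a pair at most once per sublist.
        counts.update(itertools.combinations(sorted(set(sublist)), 2))
    return counts

counts = pair_frequencies(list_of_items)

# number_set is taken to be the union of all numbers in list_of_items.
number_set = sorted(set().union(*list_of_items))

# A Counter returns 0 for keys it has never seen, so absent pairs
# need no special handling.
full_result = {pair: counts[pair] for pair in itertools.combinations(number_set, 2)}

print(full_result[(12, 14)])   # 1
print(full_result[(12, 88)])   # 0
print(full_result[(54, 105)])  # 2
```

Note that filling in the zero counts means visiting every pair of `number_set`, so this step is quadratic in the number of distinct values even when most pairs never occur; if only the pairs that actually appear are needed, `counts` alone is enough.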

Eric O. Lebigot
Ben