
I'm using Python 3.5 and I'm wondering if there is a more efficient way to do this.

  • I have two lists (list1 and list2).
  • Each element in each list is a set of numbers.
  • In the example below, list1 is basically a 1x4 "matrix" and list2 is a 1x3 "matrix".
  • I want to make a 4x3 matrix whose entry (i, j) is the size of the intersection of element i of list1 with element j of list2.

Here is some sample code that works, but it's somewhat slow when the length of my lists is in the thousands.

Is there a faster or better way?

Thank you!

list1 = [{1,2,3}, {4,5,6}, {1,2,9}, {4,5,10}] # 1 x 4 "matrix"
list2 = [{1,3,9}, {4,2,8}, {1,0,10}] # 1 x 3 "matrix"

myoutputmatrix = []

for aset in list1:
    small_list = [len(aset & asecondset) for asecondset in list2]
    myoutputmatrix.append(small_list)

myoutputmatrix # [[2, 1, 1], [0, 1, 0], [2, 1, 1], [0, 1, 1]]
pdanese
  • What you have already looks like the most efficient way to do it synchronously. Consider using multiprocessing to parallelize, if you need it faster. – wim Jun 09 '17 at 16:11
  • Try using `numpy`; it has numerous functions for manipulating matrices – Rohin Kumar Jun 09 '17 at 16:11
  • micro-optimization: `myoutputmatrix = [[len(l1 & l2) for l2 in list2] for l1 in list1]`. Though you're expecting something more considerable in timings, I suppose – RomanPerekhrest Jun 09 '17 at 16:23
  • This is actually straight-up matrix multiplication, once you convert the inputs into actual matrices where a `1` indicates the presence of a particular number in a particular input set and a `0` indicates absence. – user2357112 Jun 09 '17 at 16:55
  • @user2357112 I hadn't thought about it that way, but you're right. It may be worth it to transform each set into a "full" vector. Maybe that's faster. Thanks. – pdanese Jun 09 '17 at 17:35
  • For the matrix multiplication, get [NumPy](http://www.numpy.org/). It'll be way faster than anything you could write by hand in pure Python. If the matrices are sparse, consider the sparse matrix types in [`scipy.sparse`](https://docs.scipy.org/doc/scipy/reference/sparse.html). – user2357112 Jun 09 '17 at 17:53
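
The matrix-multiplication idea from the comments could look roughly like the sketch below. It is only an illustration, not a benchmarked solution: the function name `intersection_sizes` and the helper `indicator` are made up for this example, and it assumes NumPy is installed and that the set elements are hashable. Each set becomes a row of 0s and 1s over the union of all values, so the product of the two indicator matrices counts the shared values for every pair.

```python
import numpy as np


def intersection_sizes(list1, list2):
    """Return a len(list1) x len(list2) array of intersection sizes.

    Hypothetical helper written for this example only.
    """
    # Assign every distinct value that appears in any set a column index.
    universe = set().union(*list1, *list2)
    column = {value: j for j, value in enumerate(universe)}

    def indicator(sets):
        # One row per set: 1 where the value is present, 0 where it is absent.
        m = np.zeros((len(sets), len(column)), dtype=np.int64)
        for i, s in enumerate(sets):
            cols = np.fromiter((column[v] for v in s), dtype=np.intp, count=len(s))
            m[i, cols] = 1
        return m

    # Entry (i, j) of the product is the number of columns where both
    # row i of the first matrix and row j of the second hold a 1,
    # i.e. len(list1[i] & list2[j]).
    return indicator(list1) @ indicator(list2).T


list1 = [{1, 2, 3}, {4, 5, 6}, {1, 2, 9}, {4, 5, 10}]
list2 = [{1, 3, 9}, {4, 2, 8}, {1, 0, 10}]

result = intersection_sizes(list1, list2)
print(result.tolist())  # [[2, 1, 1], [0, 1, 0], [2, 1, 1], [0, 1, 1]]
```

The dense indicator matrices need one column per distinct value, so memory grows with the number of sets times the size of the value universe. If that universe is large and each set is small, the sparse types in `scipy.sparse` mentioned above are probably the better fit; whether either variant beats the original loop depends on the data and would need to be timed.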
