I am implementing hierarchical clustering using Jaccard distance. The transactions for which I want to compute the Jaccard distance are represented as binary lists. For example:

t1=['0','1','1','0','1']

t2=['1','0','1','0','0'].

I looked at this SO question, which is very similar to what I want, but I am not getting the right answer.

Basically this is what I am looking for:
1. find intersection and union for the above 2 lists.

I have tried the below apart from looking at numerous other online resources:

1. s1=sets.Set(['0','1','1','0','1'])
   s2=sets.Set(['1','0','1','0','0'])  
2. s1.intersection(s2)  ---> Set(['1', '0'])  
   s1.union(s2)         ---> Set(['1', '0'])  
3. Set(s1) & Set(s2)      ---> TypeError: unsupported operand type(s) for /: 'Set' and 'Set'

   Set(s1) | Set(s2)

Please guide me.

Thanks.

Community
Sarvavyapi

1 Answer

As you said:

s1=sets.Set(['0','1','1','0','1'])

Let's check s1:

print s1
---->Set(['1', '0'])

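The sets module provides classes for constructing and manipulating unordered collections of unique elements. Duplicates are dropped and positions are not kept, so both of your lists collapse to Set(['1', '0']), and your s1 and s2 are actually the same set. That is why the intersection and the union come out identical.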
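To see the collapse directly, here is a minimal sketch. It assumes Python 3 and the built-in set type rather than the deprecated sets.Set used in the question; the variable names are taken from the question.

```python
# Assumption: Python 3 with the built-in set type instead of sets.Set.
t1 = ['0', '1', '1', '0', '1']
t2 = ['1', '0', '1', '0', '0']

# Building a set discards duplicates and ordering,
# so only the distinct values '0' and '1' remain.
s1 = set(t1)
s2 = set(t2)

print(len(s1))            # 2
print(s1 == s2)           # True
print((s1 & s2) == (s1 | s2))  # True: intersection and union coincide
```

Since both sets hold the same two values, any set operation on them gives the same result, whatever the original positions were.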

squid
  • But if you see the positions of the 0's and 1's, they are not the same. I want the intersection of the 2 sets (s1 and s2) to be 1/5 (because there is a "1" at position 3 for both) and the union to be 4/5 (there are 4 1's when we combine the 2 sets) – Sarvavyapi Nov 17 '12 at 15:40
  • I couldn't find out a way to solve this using sets, but I wrote a function that checks every field in the list. – Sarvavyapi Nov 18 '12 at 17:39
  • @Sarvavyapi, the elements differ only by their index, so maybe you can form a set from the indices of the '1's. – squid Nov 19 '12 at 08:30
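
The index-based idea from the last comment could look roughly like the sketch below. It assumes Python 3; the helper names one_positions and jaccard_distance are made up for illustration and do not come from the thread, and treating two all-zero transactions as distance 0 is also an assumption.

```python
def one_positions(bits):
    """Return the set of indices at which the transaction holds a '1'."""
    return {i for i, b in enumerate(bits) if b == '1'}


def jaccard_distance(a, b):
    """Jaccard distance between two binary transactions of '0'/'1' strings."""
    sa, sb = one_positions(a), one_positions(b)
    union = sa | sb
    if not union:
        # Assumption: two transactions with no '1' at all count as identical.
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


t1 = ['0', '1', '1', '0', '1']
t2 = ['1', '0', '1', '0', '0']

print(len(one_positions(t1) & one_positions(t2)))  # 1 shared position
print(len(one_positions(t1) | one_positions(t2)))  # 4 positions with a '1' in either
print(jaccard_distance(t1, t2))                    # 0.75
```

The 1/5 and 4/5 from the first comment have the same ratio as the raw counts 1 and 4, since the list length cancels, so the similarity is 1/4 and the distance 3/4.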