I have two arrays of different lengths. I want to compare them and put the items they have in common into a new, third array, which should not contain any duplicates. I'm really stuck on this; any help would be much appreciated.
3 Answers
Something like this?
NSMutableSet* set1 = [NSMutableSet setWithArray:array1];
NSMutableSet* set2 = [NSMutableSet setWithArray:array2];
[set1 intersectSet:set2]; //this will give you only the objects that are in both sets
NSArray* result = [set1 allObjects];
This avoids searching one array for every element of the other, which has O(N^2) complexity and may take a while if the arrays are large.
Edit: set2 doesn't have to be mutable, might as well use just
NSSet* set2 = [NSSet setWithArray:array2];
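One thing to keep in mind: `allObjects` returns the members in no particular order. As a minimal sketch, the snippet above could be wrapped in a function that also sorts the result so the output is stable. The name `CommonObjects` is made up for illustration, and the sketch assumes the elements respond to `compare:` (as `NSNumber` and `NSString` do).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the distinct objects
// that occur in both arrays, sorted ascending.
// Assumes the elements implement -hash, -isEqual: and -compare:.
static NSArray *CommonObjects(NSArray *array1, NSArray *array2)
{
    // Building a set from array1 already drops its duplicates.
    NSMutableSet *common = [NSMutableSet setWithArray:array1];
    // Keep only the members that are also in array2.
    [common intersectSet:[NSSet setWithArray:array2]];
    // -allObjects has no defined order, so sort for a predictable result.
    return [[common allObjects] sortedArrayUsingSelector:@selector(compare:)];
}
```

For example, `CommonObjects(@[@1, @2, @2, @3, @5], @[@2, @3, @3, @4])` gives an array holding 2 and 3. If the order does not matter, the sort can simply be dropped.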

-
FWIW, of course `intersectSet:` will have to do something similar (go over items and compare), but it is far more optimized for it, so it may, in the end, be faster to copy all these items around, to and from sets. OTOH, the simple loop proposed by Akshay may be faster, as it doesn't copy so much. One would have to profile both approaches for the situation in which it is required. – Rudy Velthuis Aug 30 '11 at 16:27
-
@Rudy Velthuis I think `intersectSet:` would have the same complexity in the worst case, but often perform better. And `setWithArray:` will retain, not copy, the objects. But sure, straightforward iteration through the arrays as Akshay suggests may work better for some cases. – SVD Aug 30 '11 at 16:35
-
@Rudy Velthuis: Any sane set implementation will be optimized for fast membership testing and can scale much better than linearly — testing for membership in a set should be logarithmic in the worst case. Also, neither sets nor arrays copy their members, so there shouldn't be much of a slowdown there. – Chuck Aug 30 '11 at 17:37
-
@Chuck: As I said, copying at least means that each item must be retained. This happens for both arrays and for the result. I know that sets are optimized, but they also need time to perform the comparisons. And initializing a set with an array means that duplicates will have to be rejected too (even if sets can do that pretty fast). Fact is that all that takes time, no matter how optimized a set is. And well, the items in the set will eventually have to be released too. **One can of course *guess* which is faster, but only proper profiling will tell if that is true**. – Rudy Velthuis Aug 30 '11 at 17:48
-
So I did some simple "manual" profiling (I'll post the code, if someone tells me where I can do that), and the solution with the sets is about 10 times as fast as the one with the simple loop using `containsObject:`. My approach takes approx. 1.5 times as long as the one using sets. – Rudy Velthuis Aug 30 '11 at 18:52
-
Note that my approach copies the arrays using `mutableCopy`, then sorts the copies and performs the test. If I don't make mutable copies, but sort the original mutable arrays directly, it is 1.5 times **faster** than the approach using sets! – Rudy Velthuis Aug 30 '11 at 18:58
A third approach (besides using sets or the simple loop checking each item with contains) would be to sort both arrays, and then use two indices:
// approach using sets:
NSArray *arrayUsingSets(NSMutableArray *arr1, NSMutableArray *arr2)
{
    NSMutableSet *set1 = [NSMutableSet setWithArray: arr1];
    NSSet *set2 = [NSSet setWithArray: arr2];
    [set1 intersectSet: set2];
    return [set1 allObjects];
}

// my approach:
NSArray *arrayUsingComp(NSMutableArray *arr1, NSMutableArray *arr2)
{
    NSMutableArray *results = [NSMutableArray arrayWithCapacity: arr1.count + arr2.count];

    // Assumes input arrays are sorted. If not, uncomment following two lines.
    // [arr1 sortUsingSelector: @selector(compare:)];
    // [arr2 sortUsingSelector: @selector(compare:)];

    NSUInteger i = 0;
    NSUInteger j = 0;
    while ((i < arr1.count) && (j < arr2.count))
    {
        switch ([[arr1 objectAtIndex: i] compare: [arr2 objectAtIndex: j]])
        {
            case NSOrderedSame:
                [results addObject: [arr1 objectAtIndex: i]];
                i++, j++;
                break;
            case NSOrderedAscending:
                i++;
                break;
            case NSOrderedDescending:
                j++;
                break;
        }
    }

    // NOTE: results are sorted too.
    // NOTE 2: loop must go "backward".
    for (NSInteger k = (NSInteger)results.count - 1; k > 0; k--)
        if ([[results objectAtIndex: k] isEqual: [results objectAtIndex: k-1]])
            [results removeObjectAtIndex: k];

    return results;
}
I did some simple profiling. If I make mutable copies of the arrays passed in and sort those, my approach is 1.5 times slower than the approach using sets. If I sort the arrays passed in directly, it seems to be 1.5 times faster than the approach using sets. If the arrays are guaranteed to be sorted already, it performs better still (almost 4 times as fast as the version using sets), since no sorting is required.
Update:
This did not eliminate duplicates, so I added the loop at the end of the routine. Now it is only 3 times as fast as the approach using sets, but still...

-
Interesting, but it doesn't ensure array3 won't have duplicates. Say array1 and array2 both have 3 elements: 1 1 1 (integers for simple illustration); then array3 will end up having the same 3 elements as well. – SVD Aug 30 '11 at 17:24
-
@SVD: worked on it. Now it is only 3 times as fast, if the original items are sorted. If they are not, it is only slightly faster than the approach using sets. – Rudy Velthuis Aug 30 '11 at 19:40
-
I had 93 in one and 81 in the other, all arc4random() % 300, plus some extra duplicates thrown in. I did the tests 10000 times for the timing, on the same arrays. I could try with larger arrays... I just tried with 1793 and 1681 NSNumbers respectively, arc4random() % 3000 and got similar results. – Rudy Velthuis Aug 31 '11 at 01:01
-
But for arrays with many duplicates (each element repeated more than 3 or 4 times), the set solution gets better. – Rudy Velthuis Aug 31 '11 at 01:29
-
I just tried 100000 and 103245 respectively and I still see my solution being 4 times as fast (if the arrays are sorted already). If the arrays must be sorted by my routine, it is still ~20% faster. Note that I set up the arrays such that if (arc4random() % 4 == 1), I'll add the last object to the array again. So that's quite a percentage of duplicates. – Rudy Velthuis Aug 31 '11 at 01:34
Iterate over array1 and search for each item in array2. If it is found, add it to array3, unless array3 already contains it.
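NSMutableArray *array3 = [NSMutableArray array];
for (MyObject *obj in array1)
{
    // Keep obj only if array2 also has it and it has not been added yet.
    if ([array2 containsObject:obj] && ![array3 containsObject:obj])
        [array3 addObject:obj];
}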
If your array1 does not have duplicate items, you don't need the 2nd check.
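If the arrays are large, both `containsObject:` calls on arrays become the bottleneck, since each one is a linear search. A possible variation, sketched here under the assumption that the elements implement `hash` and `isEqual:` consistently, keeps the same loop but does the membership tests against sets. Unlike the pure set intersection, it also keeps the items in the order they appear in array1. The function name `CommonObjectsInOrder` is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the distinct objects
// of array1 that also occur in array2, in array1's original order.
// Assumes the elements implement -hash and -isEqual: consistently.
static NSArray *CommonObjectsInOrder(NSArray *array1, NSArray *array2)
{
    NSSet *lookup = [NSSet setWithArray:array2];   // membership test for array2
    NSMutableSet *seen = [NSMutableSet set];       // objects already added
    NSMutableArray *array3 = [NSMutableArray array];

    for (id obj in array1)
    {
        if ([lookup containsObject:obj] && ![seen containsObject:obj])
        {
            [seen addObject:obj];
            [array3 addObject:obj];
        }
    }
    return array3;
}
```

For example, `CommonObjectsInOrder(@[@5, @3, @3, @1, @2], @[@2, @3, @5, @7])` gives an array holding 5, 3, 2 in that order. Whether this beats the plain loop for small arrays is something only profiling would show.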