Given an array of n random numbers, find an O(n*ln n) algorithm to check whether it contains repeated occurrences of some number, using only arrays (no other complex data structures).

I have the obvious O(N*N) approach: take each element and compare it with the rest to check for a match. You can also sort the array and compare adjacent elements in O(n*log n). I am looking for something other than that.
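For reference, the sort-and-compare baseline mentioned above can be sketched roughly as follows. This is only an illustration; the function name `has_duplicate_sorted` is made up for this example and does not come from the book.

```python
def has_duplicate_sorted(values):
    """Return True if any value occurs more than once (sort-based baseline)."""
    ordered = sorted(values)  # O(n log n); the input list is left untouched
    # After sorting, equal values sit next to each other,
    # so one linear pass over adjacent pairs is enough.
    for previous, current in zip(ordered, ordered[1:]):
        if previous == current:
            return True
    return False
```

Calling `has_duplicate_sorted([3, 1, 4, 1, 5])` should report a repeat because the two 1s end up adjacent after sorting.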

codepk
  • Is this homework? What have you tried? – mrip Oct 11 '13 at 23:26
  • I got the obvious O(N*N) when you take each element and compare with the rest to check for a match. You can also sort it and compare adjacent elements in n*log n. I am looking for something other than that. – codepk Oct 11 '13 at 23:29
  • Hint: What nlogn algorithms have you covered in this course? – mrip Oct 11 '13 at 23:30
  • Do you need to find repeated occurrences in an unsorted array, or just find out whether any number occurs several times in the array? – Iłya Bursov Oct 11 '13 at 23:38
  • @IlyaBursov I wrote the question as it is in 'Introduction to Algorithms' by Cormen. My understanding is to find the first element that repeats twice and return 'Yes'. If no element repeats, return 'No'. The reason I mentioned only arrays (no other DS) and no n*log n sorting is because those topics are not yet covered. – codepk Oct 11 '13 at 23:41
  • Are you allowed to use associative arrays/sets/hash tables/dictionaries? If you are you can do this in O(n) time. If you want exactly O(n log n) I think you would need to sort it unless there are bounds on values in the array. If there are bounds you can do it in linear time using just simple arrays. – Shashank Oct 11 '13 at 23:45
  • Do you have an upper bound on how large a number can be in the array? – Prateek Oct 11 '13 at 23:46
  • @Prateek Please explain - How does that matter? – codepk Oct 11 '13 at 23:48
  • @codepk then you can use counting sort, which will give you a complexity of O(N) – Iłya Bursov Oct 11 '13 at 23:48

2 Answers

OK, Let me have a take on this.

  1. Find the largest element (say max) in the array. This is an O(N) operation.
  2. Create a boolean array (say temp) of size max + 1, so that max itself is a valid index.
  3. Traverse the original array, using each current_value as an index into temp, and check whether that entry is already true, i.e. if (temp[current_value] == true).
  4. If it is true, you have found the repeating element; otherwise set temp[current_value] = true.

Obviously this algorithm is not space efficient: we don't know in advance how large the temp array will be, and most of its slots will never be visited. It also assumes the values are non-negative integers, so that they can be used as indices. But its time complexity is O(N).
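A minimal sketch of the steps above, assuming every value is a non-negative integer. The name `has_duplicate_bounded` is hypothetical, and the flag array is sized `max + 1` here so that the largest value is itself a valid index.

```python
def has_duplicate_bounded(values):
    """Return True if any value repeats. Assumes non-negative integers."""
    if not values:
        return False
    if min(values) < 0:
        # Assumption of this sketch: values are used directly as indices.
        raise ValueError("values must be non-negative integers")
    largest = max(values)            # step 1: one O(N) pass
    seen = [False] * (largest + 1)   # step 2: valid indices are 0..largest
    for value in values:             # steps 3 and 4
        if seen[value]:
            return True              # this value was marked earlier
        seen[value] = True
    return False
```

Note that building `seen` takes time and space proportional to the largest value, not to the length of the input, which is the trade-off discussed in the comments below.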

Prateek
  • For the usual int type, you could need an array of 2^32 entries, which is the whole memory of a 32-bit CPU – Iłya Bursov Oct 11 '13 at 23:58
  • This is probably the best solution without sorting. +1. And if they are 4 byte ints then you can just bound it by 2^32 as Ilya said and then you won't need to calculate the max. – Shashank Oct 11 '13 at 23:59
  • As I said, it depends on the `max` number in the array. We can delay creating the `temp` array until `max` has been found, and then decide whether or not to create a temp array of that size. – Prateek Oct 11 '13 at 23:59
  • this solution is called [counting sort](http://en.wikipedia.org/wiki/Counting_sort) – Iłya Bursov Oct 11 '13 at 23:59
  • Strange, I didn't know there was a name for this approach; I cooked it up myself. Thanks @IlyaBursov for pointing that out – Prateek Oct 12 '13 at 00:01
  • For larger ranges of integers it can be extended to [radix sort](http://en.wikipedia.org/wiki/Radix_sort). – Shashank Oct 12 '13 at 00:01
  • @codepk if it is what you were looking for please upvote and accept the solution – Prateek Oct 12 '13 at 00:03
  • Quite an obvious algorithm. You can also find min, so that you only need to create a smaller array, but in the worst case (values from 0 to 2^32-1) you will still have memory problems – Iłya Bursov Oct 12 '13 at 00:04
  • @IlyaBursov as I said earlier it is a trade off between time and space complexity. – Prateek Oct 12 '13 at 00:06
  • The assumption here is that it's O(1) to create an array of arbitrary size. That means, for example, you can't initialise it. That's possible, but tricksy (eg: http://recursed.blogspot.ch/2010/09/array-initialization-trick.html). – Paul Hankin Oct 12 '13 at 06:14

It's better to replace the array with a hash table. Then there is no need to find min/max: just start putting your numbers in the hash table as keys, checking before each "put" whether that key is already there. Note that the array approach cannot handle numbers in a large range, say min=-2^63, max=2^63, even if there are just a few of them. A hash table, on the other hand, can handle them easily.

However, I just noticed that you want to use only arrays. In that case you can simulate the hash table using an array. For details see here: http://algs4.cs.princeton.edu/34hash/ You can select a simple hash function and handle collisions in a simple way, for example by putting a collided value into the next available array slot.
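As a rough sketch of that idea, the function below keeps the keys in a plain array and resolves collisions by moving to the next slot (linear probing). The name `has_duplicate_probing`, the table size of `2 * n + 1`, and the use of Python's built-in `hash` as the "simple hash function" are all assumptions made for this illustration.

```python
def has_duplicate_probing(values):
    """Return True if any value repeats, using an array as a hash table."""
    if not values:
        return False
    # Assumption: about twice as many slots as values, so the table
    # never fills up and probe sequences stay short on average.
    capacity = 2 * len(values) + 1
    slots = [None] * capacity              # None marks an empty slot
    for value in values:
        index = hash(value) % capacity     # always in 0..capacity-1
        # Linear probing: walk forward until we find the value or a gap.
        while slots[index] is not None:
            if slots[index] == value:
                return True                # key already present
            index = (index + 1) % capacity # wrap around at the end
        slots[index] = value               # first time we see this value
    return False
```

Unlike the boolean-array approach, the table size here depends only on how many numbers there are, so a call such as `has_duplicate_probing([-2**63, 2**63, 7])` needs just seven slots. The expected running time is linear, though a poor hash function can degrade it.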

P. B. M.