Given an unsorted array of size n, it's obvious that finding whether an element exists in the array takes O(n) time.

If we let m = log n then it takes O(2^m) time.

Notice that if the array is sorted, a binary search takes only O(m) time (which is polynomial in m), but binary search cannot be applied to an unsorted array.

Is it possible to prove that the decision problem of finding an element in an array (answer yes or no) is NP-complete in terms of m? If so, what problem should I reduce from, and how?
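To make the two running times concrete, here is a minimal sketch that counts how many array entries each search reads. The function names and the probe counter are illustrative assumptions, not part of the original question.

```python
def linear_search(items, target):
    """Scan entries one by one; returns (found, number_of_entries_read)."""
    probes = 0
    for value in items:
        probes += 1
        if value == target:
            return True, probes
    return False, probes


def binary_search(sorted_items, target):
    """Halve the candidate range each step; only valid on sorted input."""
    probes = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return True, probes
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return False, probes
```

For an absent target in a sorted array of n = 1024 entries, `linear_search` reads all 1024 entries, while `binary_search` reads roughly log2(n) of them.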

Any idea would be appreciated.

EDIT:

My description above probably did not express clearly what I was trying to say.

Let's reword the problem in the following way.

  1. We have an oracle, which is a binary tree of height h whose nodes hold arbitrary values. That is, the tree does NOT have the property that all values in the left subtree of a node are smaller than the node's value and all values in the right subtree are greater. However, every node in the oracle tree is guaranteed to hold a value between 0 and 2^h-1.

  2. The input is a number to be searched. The input is guaranteed to have value between 0 and 2^h-1. (The input has h bits)

(Let's say we are searching through the same array every time and hence we have the same oracle every time so the tree is not a part of input.)

  3. The output is YES or NO, indicating whether the input value appears in some node of the tree.

Question: is this problem NP-complete in terms of h? The problem is in NP because, if a path to the YES node in the tree is given, it can be verified in O(h) time.

(Note that if the oracle tree has the property that the left subtree of a node holds smaller values and the right subtree holds greater values, then the problem is NOT NP-complete, because binary search can be applied.)

cr001
  • If we let m = cuberoot(n) then it takes O(m^3) time. Perhaps its running time is cubic? No. I don't understand what you're trying to demonstrate by representing the running time in terms of this arbitrary "m" value. – JLRishe Mar 05 '15 at 06:48
  • I am trying to prove that there is no log(n) algorithm for searching in an unsorted array. Since there exists a log(n) algorithm for a sorted one (i.e. binary search) there must be some difference caused by this sorting. If we let m represent the number of bits used to represent each element in an array I felt it might be possible to reduce for example 3SAT to the array searching problem – cr001 Mar 06 '15 at 06:13
  • This question is probably a better fit for the [CS Stack Exchange](http://cs.stackexchange.com/). – Spencer Wieczorek Mar 06 '15 at 07:07
  • @cr001 Your formulation still looks a bit strange to me. The input of your search algorithm should be the number to be searched **and** the binary tree. Your binary tree clearly depends on the input parameter `h` (for example, it is highly likely that you will be working on different trees for different input numbers, say between `h` and `1000 * h`). So the binary tree **should** really be an input to your search algorithm as well. – chiwangc Mar 06 '15 at 07:15
  • I think talking about NP-completeness in this case is rather misleading. It appears you are concerned with the *lower bound* of the search algorithm on an unsorted array (or binary tree); I think you are trying to argue that a search cannot be done in time `log(n)` on an unsorted array. If so, please refer to the comment on my answer for the corresponding argument. – chiwangc Mar 06 '15 at 07:16
  • Let's say we are working on the same tree each time for different numbers. If we consider the tree as input then the binary search would also take O(2^h) or O(n) time as well... how to distinguish the binary search then? – cr001 Mar 06 '15 at 07:26
  • For your later comment, are you talking about constructing the case where an element is at the entry not visited? Then I think I can use the same argument for sorted array as well to prove O(log n) binary search does not exist which is obviously wrong... The point is that for sorted array I can know some elements are not satisfying without even visiting them. But how do I prove there can be no way to do similar things like a binary search for unsorted array? I am feeling proving this is somehow similar to proving P!=NP, that's why I am trying to reduce some NP complete problem to this problem – cr001 Mar 06 '15 at 07:32
  • @cr001 No, binary search takes time `O(h) = O(log(n))` on a sorted array, since the ordering property of the nodes ensures that you don't need to visit *every* node of the tree but only a path from the root to a leaf. In contrast, for an unsorted array, you can argue that the lower bound is `Ω(n)`, since we must visit **all** the nodes to guarantee that the search algorithm works correctly for **all** possible binary trees (including sorted ones). The argument I mentioned earlier *does not* extend to the sorted binary tree case, since the way we construct the tree *does not* guarantee that it *is* sorted – chiwangc Mar 06 '15 at 08:15
  • This is the thing I don't understand. Why do we have to visit all the nodes to guarantee the algorithm works for all trees? We cannot just say there isn't a quick algorithm that goes through only one path for unsorted trees because we cannot come up with one. – cr001 Mar 09 '15 at 06:14
  • Cross-posted on CS.SE: http://cs.stackexchange.com/q/40219/755. Please don't cross-post on multiple SE sites; that is forbidden by site rules, and it is impolite to answerers as it fragments discussion. – D.W. Mar 09 '15 at 16:57
  • @SpencerWieczorek, if you're going to suggest another site, please make sure to remind posters not to cross-post: you can tell them about the option to click "flag" under their post to flag it for moderator attention and ask the moderator to migrate the question. – D.W. Mar 09 '15 at 16:57
  • I'm voting to close this question as off-topic because it is about computer science, not about programming. It has been [reposted on Computer Science](http://cs.stackexchange.com/questions/40219/array-search-np-completeness). – Gilles 'SO- stop being evil' Mar 09 '15 at 18:23
  • @cr001 As this post is likely to be closed for cross-posting, I am not going to explain in much more detail. I have already given a proof in my earlier comment (now added to my answer) to address your doubt. In our case here, the algorithm is sound only if it works for **all** (unsorted) trees; it is unacceptable if there is even one case in which the algorithm does not give the correct result. My proof states that if the algorithm does not read all the `n` inputs, you can **always** construct a case that violates the correctness of the algorithm. – chiwangc Mar 10 '15 at 01:54

1 Answer

Finding an element in an array is NOT NP-complete (assuming P ≠ NP), as it can be done in linear time.

In fact, the naive brute-force search algorithm you mentioned in your question is a linear time algorithm!

When we talk about the complexity of a computational problem, we always measure time with respect to the size of the input. You claimed that the input size of our algorithm is m = log(n), but in our case the size of the input is determined by the number of elements in the array, which is n.

For reference, testing whether a given number n is prime is an example of a computational problem whose input has size log(n). The input of the problem is n, and it has size log(n) because we need log(n) bits to represent n in binary form.
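As a rough illustration of measuring cost against input size, the sketch below computes the bit length of n and tests primality by trial division. Trial division is only a stand-in chosen for simplicity (the answer does not name a particular primality algorithm), and both function names are hypothetical.

```python
def input_size_in_bits(n: int) -> int:
    """Number of bits needed to write n in binary (at least 1)."""
    return max(1, n.bit_length())


def is_prime_trial_division(n: int) -> bool:
    """Naive primality test: tries every divisor d with d * d <= n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True
```

With m = `input_size_in_bits(n)`, trial division performs on the order of sqrt(n), i.e. about 2^(m/2), steps in the worst case, so it is exponential in the input size m even though it looks cheap in terms of n. The array search is different: its input already contains n entries, so O(n) is linear in the input size.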

Update

Any deterministic search algorithm requires Ω(n) time on an unsorted array.

Any such algorithm must read the entire input (i.e. all n entries of the array). We prove this by contradiction.

Suppose there is a correct search algorithm that does not read all n entries on some input that does not contain the search item. Then there is an entry that the algorithm never reads, and it answers NO. Now construct a second input that is identical except that the search item sits at that unread entry. Because the algorithm is deterministic and sees exactly the same values at the entries it does read, it again answers NO, which is wrong. This violates the correctness of the algorithm, so no such algorithm exists.
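The contradiction argument can be sketched as a small adversary program. Everything here is an illustrative assumption: `lazy_search` is a deliberately flawed, hypothetical algorithm that skips odd indices, and `find_counterexample` is a helper name invented for this sketch. The search accesses the array only through a `read(i)` callback so the adversary can record which entries were touched.

```python
def lazy_search(read, n, target):
    """Hypothetical flawed search: only ever reads the even indices."""
    return any(read(i) == target for i in range(0, n, 2))


def find_counterexample(search, n, target, filler):
    """Adversary: build an input on which `search` answers incorrectly.

    Assumes `search` is deterministic and `filler != target`.
    Returns (bad_array, answer_given) or None if every entry was read.
    """
    base = [filler] * n  # an array that does not contain the target
    probed = set()

    def read(i):
        probed.add(i)
        return base[i]

    answer = search(read, n, target)
    unread = [i for i in range(n) if i not in probed]
    if not unread:
        return None  # the algorithm read everything; the argument does not apply

    # Plant the target where the algorithm never looked.
    bad = list(base)
    bad[unread[0]] = target
    # A deterministic algorithm sees the same values at the same probes,
    # so it must repeat its earlier answer on the modified array.
    assert search(lambda i: bad[i], n, target) == answer
    return bad, answer
```

Calling `find_counterexample(lazy_search, 8, 7, 0)` yields an array that contains 7 at an odd index while `lazy_search` still reports that 7 is absent. A search that reads every index gives the adversary nothing to exploit, and the helper returns `None`.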

chiwangc
  • Thank you for the answer. If we cannot use NP-completeness as a measure using substitutions such as m = log(n), then is there a way to prove that performing a search in log(n) time for a not sorted array is not possible? – cr001 Mar 06 '15 at 06:08
  • In fact, you can argue something even stronger: any **deterministic** search algorithm requires `Ω(n)` time. Any search algorithm must read through the entire input (i.e. the `n` entries of the array). If not, suppose there is a search algorithm that does not read all `n` input entries. There is an entry that is not read by this algorithm; you can then construct a case in which the search item is at the entry that is not read by this hypothetical algorithm, which violates the correctness of the algorithm. Hence such an algorithm does not exist. – chiwangc Mar 06 '15 at 06:19
  • I have reworded the problem in a more formal way in the original post (in order to exclude the reading time, because even binary search takes O(n) time if we count reading in the array). If you can read the post again, I would really appreciate it. – cr001 Mar 06 '15 at 06:34