5

A question about 20 questions games was asked here:

However, if I'm understanding it correctly, the answers seem to assume that each question will go down a hierarchical branching tree. A binary tree should work if the game went like this:

  1. Is it an animal? Yes.
  2. Is it a mammal? Yes.
  3. Is it a feline? Yes.

Because feline is an example of a mammal and mammal is an example of an animal. But what if the questions go like this?

  1. Is it a mammal? Yes.
  2. Is it a predator? Yes.
  3. Does it have a long nose? No.

You can't branch down a tree with those kinds of questions, because there are plenty of predators that aren't mammals. So you can't have your program just narrow it down to mammal and have predators be a subset of mammals.

So is there a way to use a binary search tree that I'm not understanding or is there a different algorithm for this problem?

Just to clarify, I'm only using 20 questions as an example, so my question is about this kind of search problem in general, not other problems involved specifically in a 20 questions game.

Community
lala
  • It is even more tricky when you have to take into account the fact that people consistently answer incorrectly, for instance if a lot of people think that dolphins are fish... That's why you need a more interconnected approach, like an ANN or other machine learning. – Alexander Torstling May 14 '10 at 16:26
  • Thanks, but I'm just using 20 questions as an example for a situation where you need to find which object matches a bunch of properties. So for the sake of this question, I would be happy to assume that you always get the correct answer. I edited my question to try and make that clearer. – lala May 14 '10 at 16:28

4 Answers

2

It's likened to a binary search in that each question is yes/no, so every answer partitions your remaining set into two parts. However, the data set would likely not be stored in an actual binary tree because, as you've realized, that would only work if the questions were always asked in the same order as the tree's split dimensions.

Also, you could easily have more than 20 dimensions ('properties') on which to split things, and the same set of twenty answers could be shared by more than one object (so a leaf node of a 20-level binary tree wouldn't necessarily contain just one item).

Thus, the "binary search" is just a metaphor for what's actually going on, in that at each step you try to pick the property which best splits your remaining set into two equal halves. As far as actual data structures go, you'd have to use something else.
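A minimal Python sketch of that idea, assuming every object is stored as a flat record of yes/no properties. The data and the names `ANIMALS`, `PROPERTIES`, `filter_candidates`, `best_question` and `play` are made up for illustration. Filtering is a plain set restriction, so the order of the questions doesn't matter, and the next question is chosen greedily as the one whose "yes" count is closest to half of what remains.

```python
# Hypothetical data: each object is described by yes/no properties.
PROPERTIES = ["mammal", "predator", "long_nose"]
ANIMALS = {
    "cat":       {"mammal": True,  "predator": True,  "long_nose": False},
    "cow":       {"mammal": True,  "predator": False, "long_nose": False},
    "elephant":  {"mammal": True,  "predator": False, "long_nose": True},
    "shark":     {"mammal": False, "predator": True,  "long_nose": False},
    "swordfish": {"mammal": False, "predator": True,  "long_nose": True},
    "carp":      {"mammal": False, "predator": False, "long_nose": False},
}


def filter_candidates(candidates, prop, answer):
    """Keep only the candidates whose value for `prop` equals `answer`."""
    return {name: props for name, props in candidates.items()
            if props[prop] == answer}


def best_question(candidates, asked):
    """Pick the unasked property whose 'yes' count is closest to half."""
    half = len(candidates) / 2
    unasked = [p for p in PROPERTIES if p not in asked]
    return min(unasked,
               key=lambda p: abs(sum(c[p] for c in candidates.values()) - half))


def play(candidates, oracle):
    """Ask greedy questions until one candidate (or no question) is left.

    `oracle(prop)` returns the truthful yes/no answer for a property.
    """
    asked = set()
    while len(candidates) > 1 and len(asked) < len(PROPERTIES):
        prop = best_question(candidates, asked)
        asked.add(prop)
        candidates = filter_candidates(candidates, prop, oracle(prop))
    return sorted(candidates)


if __name__ == "__main__":
    # The player is thinking of a cat and always answers correctly.
    print(play(ANIMALS, lambda prop: ANIMALS["cat"][prop]))
```

Each step here is a linear scan over the remaining candidates, which is fine for small sets; a larger set would probably want an index per property.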

tzaman
0

If you needed to stick with a binary tree for the problem, there's nothing saying that you can't duplicate a branch or a node. Place the feline answer node at the end of more than one set of decisions. Or ask the predator question twice - once if the user said "yes" to mammal, and once if the user said "no".

Certainly there are optimization and efficiency concerns if you take this tack, but there are ways of addressing specific concerns as well. For example, if you're worried about storage space for the decision tree, make the branches or the nodes (or both) pointers to immutable objects/declarations.
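As a rough Python sketch of what duplicating a question while sharing nodes could look like (the `Leaf` and `Question` classes and the animals are invented for illustration): the "predator" question appears under both branches of "mammal", and the immutable leaf objects are created once and referenced from wherever they are needed.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """An answer node; immutable, so it can be shared between branches."""
    answer: str


@dataclass(frozen=True)
class Question:
    """An inner node: ask about `prop`, then follow `yes` or `no`."""
    prop: str
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]


def classify(node, answers):
    """Walk the tree using a dict of property -> bool answers."""
    while isinstance(node, Question):
        node = node.yes if answers[node.prop] else node.no
    return node.answer


# Leaves are created once and only referenced from the trees below.
CAT, COW, SHARK, CARP = Leaf("cat"), Leaf("cow"), Leaf("shark"), Leaf("carp")

# "predator" is asked twice: once under mammal=yes, once under mammal=no.
mammal_first = Question("mammal",
                        Question("predator", CAT, COW),
                        Question("predator", SHARK, CARP))

# A second tree for the other question order reuses the same leaf objects.
predator_first = Question("predator",
                          Question("mammal", CAT, SHARK),
                          Question("mammal", COW, CARP))

if __name__ == "__main__":
    answers = {"mammal": True, "predator": True}
    print(classify(mammal_first, answers), classify(predator_first, answers))
```

Sharing leaves keeps the answers from being copied, but the number of inner nodes still grows with every question order you want to support.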

G__
  • Won't the tree grow exponentially that way and get huge really fast? I'm just a beginner, but it seems like it would be easier to just iterate over every single possible answer and check them one at a time than do that. – lala May 14 '10 at 16:44
0

I believe what you're looking for is more commonly referred to as a Decision Tree, specifically for classification. You can then use algorithms like C4.5 to learn how to order your questions to classify efficiently.
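To give a feel for how such a tree is learned, here is a small ID3-style sketch in Python. It is not C4.5 itself, which adds things like gain ratio, pruning and handling of continuous attributes; the data and function names are made up. At each node it picks the property with the highest information gain and recurses on the two halves.

```python
import math
from collections import Counter

# Hypothetical training data: (properties, label) pairs.
ROWS = [
    ({"mammal": True,  "predator": True,  "long_nose": False}, "cat"),
    ({"mammal": True,  "predator": False, "long_nose": False}, "cow"),
    ({"mammal": True,  "predator": False, "long_nose": True},  "elephant"),
    ({"mammal": False, "predator": True,  "long_nose": False}, "shark"),
    ({"mammal": False, "predator": True,  "long_nose": True},  "swordfish"),
    ({"mammal": False, "predator": False, "long_nose": False}, "carp"),
]
PROPS = ["mammal", "predator", "long_nose"]


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def split(rows, prop):
    """Partition rows into (yes, no) according to `prop`."""
    yes = [row for row in rows if row[0][prop]]
    no = [row for row in rows if not row[0][prop]]
    return yes, no


def information_gain(rows, prop):
    """Entropy reduction from splitting on `prop` (0 if it doesn't split)."""
    yes, no = split(rows, prop)
    if not yes or not no:
        return 0.0
    before = entropy([label for _, label in rows])
    after = sum(len(part) / len(rows) * entropy([label for _, label in part])
                for part in (yes, no))
    return before - after


def build_tree(rows, props):
    """Return a label (leaf) or a (prop, yes_subtree, no_subtree) tuple."""
    labels = [label for _, label in rows]
    best = max(props, key=lambda p: information_gain(rows, p), default=None)
    if (len(set(labels)) == 1 or best is None
            or information_gain(rows, best) <= 1e-12):
        return Counter(labels).most_common(1)[0][0]  # majority label
    yes, no = split(rows, best)
    rest = [p for p in props if p != best]
    return (best, build_tree(yes, rest), build_tree(no, rest))


def classify(tree, answers):
    """Follow the learned questions until a leaf label is reached."""
    while isinstance(tree, tuple):
        prop, yes, no = tree
        tree = yes if answers[prop] else no
    return tree


if __name__ == "__main__":
    tree = build_tree(ROWS, PROPS)
    print(classify(tree, {"mammal": True, "predator": True, "long_nose": False}))
```

The learned tree fixes one question order per branch, so it fits the case where the program asks the questions, not the case where the user volunteers properties in any order.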

Brad
-1

If you are looking for an exact match - just hash on all the properties and do a lookup.
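A minimal sketch of the hashing approach in Python, assuming every property is known for both the stored objects and the query. The data and the names `make_key`, `build_index` and `lookup` are just for illustration. A dict keyed on the tuple of property values turns the lookup into a single hash probe.

```python
# Hypothetical data; the order of PROPERTIES defines the key layout.
PROPERTIES = ("mammal", "predator", "long_nose")
ANIMALS = {
    "cat":      {"mammal": True,  "predator": True,  "long_nose": False},
    "elephant": {"mammal": True,  "predator": False, "long_nose": True},
    "shark":    {"mammal": False, "predator": True,  "long_nose": False},
}


def make_key(props):
    """Turn a property dict into a hashable tuple in a fixed order."""
    return tuple(props[p] for p in PROPERTIES)


def build_index(objects):
    """Map each property tuple to the list of objects that have it."""
    index = {}
    for name, props in objects.items():
        index.setdefault(make_key(props), []).append(name)
    return index


def lookup(index, props):
    """Return all objects that match every property exactly."""
    return index.get(make_key(props), [])


if __name__ == "__main__":
    index = build_index(ANIMALS)
    print(lookup(index, {"mammal": True, "predator": True, "long_nose": False}))
```

The values are lists because several objects can share the same property combination. This only works once all the properties have been answered; a partial set of answers can't be hashed this way.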

If you want to do pattern recognition to find similar items you can use a method with a quite 'linear' mapping - like k-nearest neighbour. You can for instance use a kd-tree to represent the search space.
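For a handful of yes/no properties, a brute-force nearest-neighbour search is enough to show the idea. The sketch below uses Hamming distance (the number of differing answers) and a linear scan instead of a kd-tree; the feature set, the data and the function names are invented for the example.

```python
import heapq

# Hypothetical feature order:
# (mammal, predator, lives_in_water, has_gills, warm_blooded)
ITEMS = {
    "cat":      (1, 1, 0, 0, 1),
    "elephant": (1, 0, 0, 0, 1),
    "shark":    (0, 1, 1, 1, 0),
    "dolphin":  (1, 1, 1, 0, 1),
}


def hamming(a, b):
    """Number of positions where two answer vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(query, items, k=1):
    """Return the names of the k items closest to `query` (linear scan)."""
    return heapq.nsmallest(k, items, key=lambda name: hamming(query, items[name]))


if __name__ == "__main__":
    # A player who thinks of a dolphin but answers "not a mammal".
    print(nearest((0, 1, 1, 0, 1), ITEMS))
```

With one wrong answer the closest match is still the intended object, which an exact hash lookup would miss. A kd-tree or similar index would replace the linear scan once the data set gets large.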

Alexander Torstling