Finding the minimum set of properties that describe a referent in a set of entities

Question

I was wondering if someone could help me get pointers to solve this problem. A link to algorithms would be great, but pointers to papers/info is also good.

The problem is as follows. Suppose I have a set E of entities E={car1, car2, bicycle} and a set of properties P ={red, blue, small}. I also have a knowledge base such that red(bicycle), blue(car1), blue(car2), small(car2). Suppose I also have a referent r which belongs to E.

The problem consists of finding the minimum set of properties P' \subseteq P such that it unequivocally picks out r from E. Thus, if r is car2, then P'={small}.

Any ideas? I guess something like the set covering problem or functional dependencies (as in DB theory) might provide some insight, but I thought I'd ask before going into that literature. Yet another possibility is building graphs and find algorithms for subgraph isomorphisms... maybe.

Thanks.

What is `P'` for `bicycle`? I have two variants: `{blue}` or `{red}`. If we see something `red`, we unambiguously determine it's a `bicycle`. But it's also apparent that if we see something `not blue`, then we can also reason it's a `bicycle`. Is it the case? — P Shved, Oct 22 '09 at 12:35

score 1 · Accepted Answer · answered Oct 22 '09 at 13:10

First find the set of all properties that r has. Call it S. For each property p in S, find e(p), all the entities that have the property p. It is clear for each p in S that e(p) contains r. Now take the intersections of e(p) for each p in S. If the intersection contains more than one entity, there is no solution, and we end the algorithm.

So we have a set S of properties that uniquely determine the entity r. Now we need to find a minimal subset of S that uniquely determines r. We can remove any property p from S for which there exists a property q in S so that e(p) is a superset of e(q). If you do that exhaustively you will eventually end up with a reduced set of properties T so that the intersection of e(p) for all the p in T will still be {r}, but no further property in T can be removed. This set T is then minimal.

I haven't thought of anything to make finding a property you can remove any more efficient than just trying all combinations, but it seems to me that there should be some way.

Hi, Joren. Yes, this was, details aside, my first attempt. As you say, however, it seems there must be a more efficient way to do this. That is what I am after. Thanks, upvoted! :) — Dervin Thunk, Oct 22 '09 at 15:01
The only improvement I've thought of is that you can precalculate/cache the results of subset-comparisons between properties, if you're going to run the algorithm several times for each property-entity system in stead of just once. It's not a massive improvement, and a pretty obvious one, so I suppose you had thought of it. — Joren, Oct 22 '09 at 18:29

Rafał Dowgird · Answer 2 · 2009-10-23T11:02:18.980

You are looking for a minimum set cover of the set E\{r} with negations (complements) of those properties that r belongs to (properties can be treated as subsets of E).

Since these sets can be any sets, this is NP-hard.

More precisely:

Having a set cover instance (U,S) where U is the set that you need to cover and S={s1,s2, ..., sn} is the family of covering sets you can construct an instance of your problem so that its solution gives a set cover in the original problem:

E = U \union {r}, where r is the referent and r does not belong to U. The set of properties P={p1,p2, ..., pn} is constructed from S so that for each e in U and each i such that 1<=i<=n we have pi(e) iff e is not in si. Moreover each property is true for r. In other words properties are complements of the original sets when restricted to U, and r has all properties.

Now it is clear that each set of properties that selects r is a set cover in the original problem - if r is selected by a set S* of properties, then all other elements (that means all those in U) fail at least one property in S*, so every element of U belongs to at least one original set (from construction of properties as complements of the sets). That means that U is union of those sets from which properties in S* were constructed.

The converse is also true - a set cover in U translates to an r-selecting set in E.

The above reduction is obviously polynomial, so the problem is NP-hard.

Finding the minimum set of properties that describe a referent in a set of entities

2 Answers2