
Let's say I have these three predicates:

Predicate<int> pred1 = x => x > 0;
Predicate<int> pred2 = x => x > 0 && true;
Predicate<int> pred3 = x => false;

From a human point of view, it's trivial to say that pred1 and pred2 are equivalent while pred3 is not. By equivalent, I mean that for every possible input value, pred1 and pred2 return the same value.

I would like to compute a hash for a given predicate such that two equivalent predicates have the same hash (like pred1 and pred2) and two non-equivalent predicates do not (like pred1 and pred3).

Has this been done before (again, in a .NET language)? I know side effects are basically the bane of such analysis; but if we "forbid" side effects, can it be done in .NET (swiftly)?

What would be the best approach to this requirement?

Max
  • Welcome to the wonderful world of the halting problem. Good luck. – SLaks Jun 13 '14 at 13:11
  • On a more serious (ie, more restrictive) note, that's exactly what an optimizer does. You can write an ExpressionVisitor that simplifies known redundancies like that one. – SLaks Jun 13 '14 at 13:12
  • As SLaks has alluded to, this is provably impossible in the general case. The best you can hope for is an approximation that has some false positives/false negatives. – Servy Jun 13 '14 at 13:14
  • @SLaks - on the contrary, the halting problem doesn't apply here (assuming pure predicates, as posed in the problem statement) - you can just run each predicate on all `int` values and check if the answers are always the same, so it's decidable. Of course, the fact that there are 2^(2^32) possible predicates means that you won't find a hash that always separates different predicates. – kvb Jun 13 '14 at 14:30
  • @SLaks - more interestingly, assuming pure functions there's even a decision procedure for the equality of predicates on infinite streams of bits - see http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/. – kvb Jun 13 '14 at 14:39
  • @kvb No, you can't run it on all possible ints. What if one predicate doesn't complete? And of course, just because it's taking a really really long time, how will you know if it will ever complete? You've now put yourself in the position of needing to answer the halting problem. And of course it's easy enough to write predicates that don't complete, or that take a very, very, very long time to complete (long enough that you cannot, by executing the program, determine if it ever will complete). – Servy Jun 13 '14 at 15:31
  • Depends whether you treat non-termination as a side-effect :-). If there are no side-effects _including_ no non-termination, then it is certainly (theoretically) doable. – Tomas Petricek Jun 13 '14 at 15:40
  • @TomasPetricek Not terminating is not a side effect though, and no side effects need to be used to create non-terminating code. – Servy Jun 13 '14 at 15:47
  • @Servy This really depends on your definitions. In many cases, it is reasonable to treat non-termination as a side-effect. – Tomas Petricek Jun 13 '14 at 16:02
  • @TomasPetricek And yet choosing to not terminate affects no state external to the method. On top of that, there's nothing in the question that even limits the methods to not having side effects in the first place. – Servy Jun 13 '14 at 16:06
  • @Servy - perhaps I should have said "total" instead of "pure" to avoid controversy, but many people do treat non-termination as a form of impurity. When viewed through the lens of Curry-Howard, partiality results in a logic where anything can be proved. – kvb Jun 13 '14 at 16:16

1 Answer


As already mentioned in the comments, this is theoretically impossible in the general case. When the predicates can run code that may not terminate (e.g. a recursive call), it is provable that no program can decide equivalence correctly on all inputs.
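
For contrast, here is a minimal sketch of the restricted case raised in the comments: if both predicates are assumed to be total (they always terminate) and free of side effects, they can simply be compared input by input. The helper name agreeOn and the sample range are made up for this illustration. Agreement on a sample proves nothing, while a single disagreement disproves equivalence; a real proof would need the whole int range, which is 2^32 evaluations per predicate.

```fsharp
// Assumption: both predicates are total (always terminate) and side-effect free.
// `agreeOn` is a hypothetical helper name, not a library function.
let agreeOn (inputs: seq<int>) (p: int -> bool) (q: int -> bool) : bool =
    inputs |> Seq.forall (fun x -> p x = q x)

// The three predicates from the question, written as plain F# functions.
let pred1 (x: int) = x > 0
let pred2 (x: int) = x > 0 && true
let pred3 (_: int) = false

// A small sample range keeps the sketch fast; it is only a refutation test.
let sample = seq { -1000 .. 1000 }

printfn "pred1 vs pred2 agree on sample: %b" (agreeOn sample pred1 pred2)
printfn "pred1 vs pred3 agree on sample: %b" (agreeOn sample pred1 pred3)
```

This also shows why such a check does not give you a hash: it compares two given predicates, it does not assign a value to a single one.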

In practice, it really depends on what you want to do. If you want to apply a couple of simple rules to simplify the predicates, then you can do that. It won't handle all situations, but it may well handle the cases that actually matter to you.

Since F# descends from the ML family of languages (which were pretty much designed for solving these kinds of problems), I'm going to write a simple example in F#. In C#, you could do the same using a visitor over expression trees, but it would probably be 10x longer.

So, using F# quotations, you can write two predicates as:

let pred1 = <@ fun x -> x > 0 @>
let pred2 = <@ fun x -> x > 0 && true @>

Now, we want to walk over the expression tree and perform some simple reductions like:

if true then e1 else e2   ~> e1
if false then e1 else e2  ~> e2
if e then true else false ~> e
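
The rules above are phrased in terms of if/then/else even though the predicates use &&. That is because, inside a quotation, e1 && e2 is represented as if e1 then e2 else false, so the third rule is the one that removes the redundant && true. A small sketch to check this (bodyIsConditional is a hypothetical helper name, and the quotation is written in untyped form so it has type Expr directly):

```fsharp
open Microsoft.FSharp.Quotations

// Untyped quotation (<@@ ... @@>) so the value has type Expr directly.
let quoted : Expr = <@@ fun (x: int) -> x > 0 && true @@>

// Hypothetical helper: reports whether the lambda body is an if/then/else node.
let bodyIsConditional (e: Expr) : bool =
    match e with
    | Patterns.Lambda(_, Patterns.IfThenElse(_, _, _)) -> true
    | _ -> false

// Print the raw tree and the result of the shape check.
printfn "%A" quoted
printfn "body is a conditional: %b" (bodyIsConditional quoted)
```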

To do that in F#, you can iterate over the expression recursively:

open Microsoft.FSharp.Quotations

// Function that implements the reduction logic
let rec simplify expr =
  match expr with
  // Pattern match on 'if then else' to handle the three rules;
  // the Simplify pattern also reduces the sub-expression that is returned
  | Patterns.IfThenElse(Simplify(True), Simplify t, _) -> t
  | Patterns.IfThenElse(Simplify(False), _, Simplify f) -> f
  | Patterns.IfThenElse(Simplify cond, Simplify(True), Simplify(False)) -> cond

  // For any other expression, we simply apply rules recursively
  | ExprShape.ShapeCombination(shape, exprs) ->
      ExprShape.RebuildShapeCombination(shape, List.map simplify exprs)
  | ExprShape.ShapeVar(v) -> Expr.Var(v)
  | ExprShape.ShapeLambda(v, body) -> Expr.Lambda(v, simplify body)

// Helper functions and "active patterns" that simplify writing the rules    
and isValue value expr = 
  match expr with
  | Patterns.Value(v, _) when v = value -> Some()
  | _ -> None

and (|Simplify|) expr = simplify expr
and (|True|_|) = isValue true
and (|False|_|) = isValue false

When you now call simplify pred1 and simplify pred2, both reduce to the same expression. Obviously, I cannot fit a complete description into a single answer, but hopefully you get the idea (and why F# is really the best tool here).
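
To get from a simplified expression to the hash the question asks for, one option is to hash its printed form. The following is only a sketch under a few assumptions: predicateKey is a hypothetical helper, SHA-256 is an arbitrary choice of stable hash, and the printed form depends on bound variable names, so fun x -> x > 0 and fun y -> y > 0 would get different keys unless variables are renamed first. The printed form is used instead of = on the expressions because, as far as I know, quotation equality treats variables bound by two separately written lambdas as different. The key is meant to be applied to the output of simplify; the sketch uses already-reduced quotations so that it stands on its own.

```fsharp
open System
open System.Security.Cryptography
open System.Text
open Microsoft.FSharp.Quotations

// Hypothetical helper: hashes the printed form of an (already simplified) quotation.
// Assumption: structurally identical quotations print identically.
let predicateKey (expr: Expr) : string =
    use sha = SHA256.Create()
    let bytes = Encoding.UTF8.GetBytes(expr.ToString())
    let digest = sha.ComputeHash(bytes)
    BitConverter.ToString(digest)

// Two separately written but identical predicates, and a different one.
let a : Expr = <@@ fun (x: int) -> x > 0 @@>
let b : Expr = <@@ fun (x: int) -> x > 0 @@>
let c : Expr = <@@ fun (x: int) -> false @@>

printfn "a and b share a key: %b" (predicateKey a = predicateKey b)
printfn "a and c share a key: %b" (predicateKey a = predicateKey c)
```

Equal keys then mean that the rules reduced both predicates to the same form. Different keys do not prove that two predicates behave differently; they only show that the rules could not bring them together.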

ildjarn
Tomas Petricek