Custom heuristic in ECLiPSe CLP

Question

Consider the following puzzle:

A cell is either marked or unmarked. Numbers along the right and bottom side of the puzzle denote the total sum for a certain row or column. Cells contribute (if marked) to the sum in its row and column: a cell in position (i,j) contributes i to the column sum and j to the row sum. For example, in the first row in the picture above, the 1st, 2nd and 5th cell are marked. These contribute 1 + 2 + 5 to the row sum (thus totalling 8), and 1 each to their column sum.

I have a solver in ECLiPSe CLP for this puzzle and I am tyring to write a custom heuristic for it.

The easiest cells to start with, I think, are those for which the column and row hint are as low as possible. In general, the lower N is, the fewer possibilities exist to write N as a sum of natural numbers between 1 and N. In the context of this puzzle it means the cell with the lowest column hint + row hint has lowest odds of being wrong, so less backtracking.

In the implementation I have a NxN array that represents the board, and two lists of size N that represent the hints. (The numbers to the side and on the bottom.)

I see two options:

Write a custom selection predicate for search/6. However, if I understand correctly, I can only give it 2 parameters. There's no way to calculate the row + column sum for a given variable then, because I need to be able to pass it to the predicate. I need 4 parameters.
Ignore search/6 and write an own labelling method. That's how I have it right now, see the code below.

It takes the board (the NxN array containing all decision variables), both lists of hints and returns a list containing all variables, now sorted according to their row + column sum. However, this possibly cannot get any more cumbersome, as you can see. To be able to sort, I need to attach the sum to each variable, but in order to do that, I first need to convert it to a term that also contains the coordinates of said variable, so that I convert back to the variable as soon as sorting is done...

lowest_hints_first(Board,RowArr,ColArr,Out) :-
    dim(Board,[N,N]),
    dim(OutBoard,[N,N]),
    ( multifor([I,J],[1,1],[N,N]), foreach(Term,Terms), param(RowArr,ColArr) do
        RowHint is ColArr[I],
        ColHint is RowArr[J],
        TotalSum is RowHint + ColHint,
        Term = field(I,J,TotalSum)
    ),
    sort(3,<,Terms,SortedTerms), % Sort based on TotalSum
    terms_to_vars(SortedTerms,Board,Out), % Convert fields back to vars...
    ( foreach(Var,Out) do
         indomain(Var,max)
    ).

terms_to_vars([],_,[]).
terms_to_vars([field(I,J,TotalSum)|RestTerms],Vars,[Out|RestOut]) :-
    terms_to_vars(RestTerms,Vars,RestOut),
    Out is Vars[I,J].

In the end this heuristic is barely faster than input_order. I suspect its due to the awful way it's implemented. Any ideas on how to do it better? Or is my feeling that this heuristic should be a huge improvement incorrect?

To evaluate your heuristics, the first thing to do is to compare the number of backtracks. When using `search/6`, you can use the `backtrack(Count)` option to obtain this number. With your own search strategy, see http://www.eclipseclp.org/doc/tutorial/tutorial088.html#toc82 for how to compute the equivalent count. — jschimpf, May 24 '16 at 10:20
That's a good idea. I am going to implement it for my own search, though it would be much easier to use `search/6` and pass it my own search predicate instead. Do you have any ideas to make this work when you to pass input to said predicate? I can't get it to work with only 2 param predicates that `search/6` requires. — svdc, May 24 '16 at 11:18
You can just call `search/6` in place of the `foreach(Var,Out)` loop in your code. — jschimpf, May 24 '16 at 14:12
BTW, your sort should look like `sort(3,=<,Terms,SortedTerms)`, otherwise you will lose list elements that have a non-unique sorting key (`TotalSum`). — jschimpf, May 24 '16 at 21:33
Thank you so much for your help. I implemented the tutorial you linked, and it kept giving 0 backtracks for puzzles up to 250x250, which I thought was extremely weird. Turns out the other variable selection heuristics also do it with 0 backtracks. That's with search/6, so it can't be the backtrack implementation. It surprises me constraint propagation is so strong in this case. — svdc, May 25 '16 at 10:30

score 2 · Answer 1 · answered Jul 17 '16 at 09:12

2

I see you are already happy with the improvement suggested by Joachim; however, as you ask for further improvements of your heuristic, consider that there is only one way to get 0 as a sum, as well as there is only one way to get 15. There is only one way to get 1 and 14, 2 and 13; two ways to get 3 and 12. In general, if you have K ways to get sum N, you also have K ways to get 15-N.

So the difficult sums are not the large ones, they are the middle ones.

answered Jul 17 '16 at 09:12

Archangel

83
7

You're absolutely right., I forgot to take into account that the number of columns limits the choices. Thanks! – svdc Jul 18 '16 at 10:58

Custom heuristic in ECLiPSe CLP

1 Answers1