We have to solve a difficult problem: we need to check a large number of complex rules from multiple sources against a system in order to decide whether the system satisfies those rules, or how it should be changed to satisfy them.

We initially started using Constraint Satisfaction Problem (CSP) algorithms (using Choco) to try to solve it, but since the number of rules and variables will be smaller than anticipated, we are now looking at building a list of all possible configurations in a database and running multiple queries based on the rules to filter this list and find the solutions that way.

Are there limitations or disadvantages to doing a systematic search compared to using a CSP solver for a reasonable number of rules and variables? Will it impact performance significantly? Will it restrict the kinds of constraints we can implement?

As an example:

Imagine the following with many more variables, much larger (but always discrete) domains, and more rules (some of them much more complex). Instead of describing the problem as:

x in (1,6,9)
y in (2,7)
z in (1,6)
y = x + 1
z = x if x < 5 OR z = y if x > 5

and giving it to a solver, we would build a table:

X | Y | Z
1 | 2 | 1
6 | 2 | 1
9 | 2 | 1
1 | 7 | 1
6 | 7 | 1
9 | 7 | 1
1 | 2 | 6
6 | 2 | 6
9 | 2 | 6
1 | 7 | 6
6 | 7 | 6
9 | 7 | 6

and use queries like the following (this is just an example to aid understanding; we would actually use SPARQL against a semantic database):

SELECT X, Y, Z WHERE Y = X + 1 
INTERSECT 
SELECT X, Y, Z WHERE (Z = X AND X < 5) OR (Z = Y AND X > 5)
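For comparison, the "build the table, then filter it" idea can be sketched in a few lines of Python. This is only an illustration under the toy domains and rules above; the names `domains`, `rules`, and `table` are made up for this sketch, and it uses an in-memory list rather than SPARQL or a database.

```python
from itertools import product

# Domains from the toy example above.
domains = {"x": (1, 6, 9), "y": (2, 7), "z": (1, 6)}

# Each rule is a predicate over one complete configuration (x, y, z).
rules = [
    lambda x, y, z: y == x + 1,
    lambda x, y, z: (z == x and x < 5) or (z == y and x > 5),
]

# Build the full table once: every combination of values (3 * 2 * 2 rows).
table = list(product(domains["x"], domains["y"], domains["z"]))

# Filtering by every rule is equivalent to intersecting the per-rule queries.
solutions = [row for row in table if all(rule(*row) for rule in rules)]

print(len(table), solutions)
```

With these domains only the row (1, 2, 1) survives both filters, since x = 6 would require z = 7, which is outside z's domain.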
  • [On topic](http://stackoverflow.com/help/on-topic), [how to ask](http://stackoverflow.com/help/how-to-ask), and [... the perfect question](https://codeblog.jonskeet.uk/2010/08/29/writing-the-perfect-question/) apply here. Your problem is not specified nearly well enough for someone else to evaluate alternatives. – Prune Jan 08 '19 at 18:04
  • Perhaps you could clarify this with a toy example that illustrates the algorithmic problems. We can mentally extrapolate a four-factor problem to a much larger space. Show how each of the solutions would apply to your small problem. – Prune Jan 08 '19 at 18:05
  • Reformulated and I added examples to help understand the question. – Nyamiou The Galeanthrope Jan 09 '19 at 17:47
  • Much better; thanks. Now, someone with better familiarity in the area can pick up the question. – Prune Jan 09 '19 at 18:10
  • Why would you write all the configurations into a database and query instead of just generating all the configurations and checking them against the constraints? – Matt Timmermans Jan 10 '19 at 13:22
  • We would generate all the configurations from the data once a day (since it will only change once a day) and reuse them multiple times (we need to test a large number of systems each day, and the rules change between cases). – Nyamiou The Galeanthrope Jan 10 '19 at 15:26
  • The example looks broken: `y = x + 1` vs. `SELECT X, Y, Z WHERE X = Y + 1`. What is the significance of each table row? The only one complying with constraints above is the first?! – greybeard Jan 10 '19 at 16:41
  • @greybeard Whoops sorry, I fixed the typo. If you imagine a simple doghouse made of parts that you assemble yourself, you can choose the type of opening (square, rounded on top, ...), the type of roof (flat, inverted V, ...), and the type of window (no window, simple square window, round window, ...). Those choices would be the columns, and the rows would be every possible type of doghouse you can make with them. The constraints would then check that the doghouse is possible to build (the parts are compatible) and that it matches regulations and the dog's preferences. – Nyamiou The Galeanthrope Jan 10 '19 at 17:58
  • The constraints I provided in the examples are nothing like the ones we'll use, but I didn't want the examples to be too complex or people wouldn't understand the concept. – Nyamiou The Galeanthrope Jan 10 '19 at 18:00

1 Answer

CSP allows you to combine deterministic generation of values (through the rules) with heuristic search. The beauty happens when you customize both of those for your problem. The rules are just one part; equally important is the choice of the search algorithm/generator, which lets you cull a lot of the search space.
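To make the culling concrete, here is a minimal sketch of plain backtracking on the toy problem from the question. It is not how Choco works internally (a real solver adds constraint propagation and search heuristics on top); the names `order`, `constraints`, `consistent`, and `search` are invented for this illustration. The point is only that a rule is checked as soon as all of its variables are assigned, so whole branches are dropped before any complete configuration is built.

```python
domains = {"x": (1, 6, 9), "y": (2, 7), "z": (1, 6)}
order = ["x", "y", "z"]  # fixed assignment order (a real solver would choose heuristically)

# Each constraint: (variables it needs, predicate over the partial assignment).
constraints = [
    (("x", "y"), lambda a: a["y"] == a["x"] + 1),
    (("x", "y", "z"), lambda a: (a["z"] == a["x"] and a["x"] < 5)
                                 or (a["z"] == a["y"] and a["x"] > 5)),
]

def consistent(assignment):
    # Check only the rules whose variables are all assigned so far.
    return all(pred(assignment)
               for scope, pred in constraints
               if all(v in assignment for v in scope))

def search(assignment, depth, stats):
    if depth == len(order):
        yield dict(assignment)
        return
    var = order[depth]
    for value in domains[var]:
        assignment[var] = value
        if depth == len(order) - 1:
            stats["complete"] += 1  # a full configuration is being tested
        if consistent(assignment):
            yield from search(assignment, depth + 1, stats)
        del assignment[var]

stats = {"complete": 0}
solutions = list(search({}, 0, stats))
print(solutions, stats["complete"])
```

On this toy input the search tests 4 complete configurations instead of all 12, because every branch where y is not x + 1 is abandoned before z is tried.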

While I cannot make predictions about the performance of your SQL solution, I must say that it strikes me as somewhat roundabout. One specific problem will arise if your rules change: you may find that you have to start over from scratch. Also, the RDBMS will fully generate the result of every subquery, and those intermediate results may explode in size.
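The size concern also applies to the precomputed table itself: the number of rows is the product of the domain sizes. A rough back-of-the-envelope check, where the "larger" problem of 20 variables with 10 values each is a purely hypothetical size chosen for illustration:

```python
from math import prod

def table_rows(domain_sizes):
    # Number of rows in the full configuration table (Cartesian product).
    return prod(domain_sizes)

toy = table_rows([3, 2, 2])     # the x, y, z example from the question
larger = table_rows([10] * 20)  # hypothetical: 20 variables, 10 values each

print(toy, larger)
```

The toy table has 12 rows, while the hypothetical one would have 10^20, so whether the table approach is feasible depends heavily on the actual number of variables and domain sizes.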

I'd suggest implementing a working prototype with CSP, and one with SQL, for a simple subset of your requirements. You will then get a good feeling for what works and what does not. Be sure to think about long-term maintenance as well.

Full disclaimer: my last contact with CSP was decades ago at university as part of my master's (I implemented a CSP search engine not unlike Choco, though of course a bit more rudimentary, and did some research on the topic). The field will certainly have evolved since then.

AnoE