I understand the intuition behind constraint programming, but I have never actually programmed with a constraint solver. I also think that what we need here, reaching what we would define as consistent data, is a different situation.
Context:
We have a set of rules to implement on an ETL server. Each rule is one of the following kinds:
- acting on a single row;
- acting across rows, in one table or in several tables;
- acting the same way across runs (it should maintain the same constraint on all data, or only on the last n runs).
The third kind differs from the second in scope: it is a rule of the second kind that must hold over a well-defined number of runs. It might apply to a single run (one file), to the current run together with the previous 1 to n runs, or to all files.
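To make the three kinds concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the column names are borrowed from the rule example in this question, a "run" is modelled as the list of rows from one file, and the uniqueness check is a hypothetical third-kind rule, not one of our real rules.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]
Run = List[Row]  # one run = the rows loaded from one file


def positive_column1(row: Row) -> bool:
    """Kind 1: depends on a single row only."""
    return row["column1"] > 0


def a_above_b(table_a: List[Row], table_b: List[Row]) -> List[Row]:
    """Kind 2: relates rows of two tables; returns the rows of table_a that match."""
    b_by_id = {row["ID"]: row for row in table_b}
    return [
        a for a in table_a
        if a["ID"] in b_by_id and a["column1"] > b_by_id[a["ID"]]["column2"]
    ]


def ids_unique_over_runs(runs: List[Run], last_n: Optional[int] = None) -> bool:
    """Kind 3 (hypothetical rule): an inter-row check over a window of runs.

    last_n=None means all runs; last_n=1 means the current run only.
    """
    if last_n is not None and last_n < 1:
        raise ValueError("last_n must be at least 1")
    window = runs if last_n is None else runs[-last_n:]
    ids = [row["ID"] for run in window for row in run]
    return len(ids) == len(set(ids))
```

The only thing separating the third kind from the second here is the `last_n` window, which is why the third kind needs some record of earlier runs.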
Technically, as we designed the ETL, it has no memory between two runs, that is, between two files (but this is to be rethought).
To apply the third kind of rule, the ETL needs memory (I think we would end up backing up data in the ETL). The alternative is a job that keeps re-checking the whole database after some time window, in which case data that ends up in the database does not necessarily satisfy the third kind of rule at the time it arrives.
Example:
While data flows in continuously, we apply constraints so that the whole database is constrained. The next day we receive a backup or a correction covering, say, one month. For that time window we would like the constraints to be satisfied for this run only, without worrying about the whole database. For future runs, all data should be constrained as before, without worrying about past data. You can imagine other rules that would fit temporal logic.
For now, only the first kind of rule is implemented. My idea is to have a small side database (of any kind: MySQL, PostgreSQL, MongoDB ...) that backs up all the data (only the constrained columns, probably as hashed values), with flags recording consistency with respect to the earlier kinds of rules.
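A minimal sketch of that side store, using Python's built-in `sqlite3` and `hashlib` in place of the databases named above. The table layout, the column names (`run_id`, `row_key`, `rule1_ok`) and the helper functions are all assumptions for illustration. Note that once values are hashed, only equality-style checks (uniqueness, matching keys) remain possible; a comparison such as `column1 > column2` would need the raw values.

```python
import hashlib
import sqlite3


def hash_value(value) -> str:
    """Store a digest instead of the raw constrained value."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) side table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS constrained (
               run_id      INTEGER NOT NULL,
               table_name  TEXT    NOT NULL,
               row_key     TEXT    NOT NULL,
               column_name TEXT    NOT NULL,
               value_hash  TEXT    NOT NULL,
               rule1_ok    INTEGER           -- flag from an earlier rule, NULL = unchecked
           )"""
    )
    return conn


def record(conn, run_id, table_name, row_key, column_name, value, rule1_ok=None):
    """Back up one constrained value of one row for the given run."""
    flag = None if rule1_ok is None else int(rule1_ok)
    conn.execute(
        "INSERT INTO constrained VALUES (?, ?, ?, ?, ?, ?)",
        (run_id, table_name, row_key, column_name, hash_value(value), flag),
    )


def duplicated_values(conn, table_name, column_name, run_ids):
    """Return the hashes that occur more than once within the given runs only."""
    if not run_ids:
        return []
    placeholders = ",".join("?" for _ in run_ids)
    sql = f"""SELECT value_hash FROM constrained
              WHERE table_name = ? AND column_name = ?
                AND run_id IN ({placeholders})
              GROUP BY value_hash HAVING COUNT(*) > 1"""
    return [r[0] for r in conn.execute(sql, (table_name, column_name, *run_ids))]
```

Passing a single run id scopes the check to one file (the correction case in the example above), while passing several ids, or all of them, widens it to the last n runs or the whole history.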
Question: Are there any solutions or design alternatives that would ease this process?
To illustrate in a cooked-up programming language, here is an example of a set of rules and the actions that follow:
run1 : WHEN tableA.ID == tableB.ID AND tableA.column1 > tableB.column2
BACK-UP
FLAG tableA.rule1
AFTER run1 : LOG ('WARN')
run2 : WHEN tableA.column1 > 0
DO NOT BACK-UP
FLAG tableA.rule2
AFTER run2 : LOG ('ERROR')
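For readers who prefer something executable, the two rules above could be approximated in Python roughly as follows. This is only one possible reading of the made-up syntax: I assume that WHEN selects rows of tableA (joined to tableB on ID where needed), that FLAG attaches the rule name to each selected row, that BACK-UP marks the row for storage, and that AFTER logs once per rule at the given level. The `Rule` class, `apply_rules`, and the sample tables are hypothetical.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]
logger = logging.getLogger("etl")


@dataclass
class Rule:
    name: str
    when: Callable[[Row, Optional[Row]], bool]  # (tableA row, matching tableB row or None)
    backup: bool                                # BACK-UP / DO NOT BACK-UP
    log_level: int                              # level used by AFTER ... LOG


RULES = [
    Rule("rule1", lambda a, b: b is not None and a["column1"] > b["column2"],
         backup=True, log_level=logging.WARNING),
    Rule("rule2", lambda a, b: a["column1"] > 0,
         backup=False, log_level=logging.ERROR),
]


def apply_rules(table_a: List[Row], table_b: List[Row],
                rules: List[Rule]) -> Tuple[Dict[Any, List[str]], List[Any]]:
    """Return (flags per tableA ID, IDs of rows to back up)."""
    b_by_id = {row["ID"]: row for row in table_b}
    flags: Dict[Any, List[str]] = {}
    backups: List[Any] = []
    for rule in rules:
        hits = 0
        for a in table_a:
            if rule.when(a, b_by_id.get(a["ID"])):
                flags.setdefault(a["ID"], []).append(rule.name)
                hits += 1
                if rule.backup and a["ID"] not in backups:
                    backups.append(a["ID"])
        # AFTER <run> : LOG(level)
        logger.log(rule.log_level, "%s flagged %d row(s)", rule.name, hits)
    return flags, backups


# Sample data, invented for the example.
TABLE_A = [{"ID": 1, "column1": 5}, {"ID": 2, "column1": -1}, {"ID": 3, "column1": 2}]
TABLE_B = [{"ID": 1, "column2": 3}, {"ID": 2, "column2": 0}]
```

Calling `apply_rules(TABLE_A, TABLE_B, RULES)` flags row 1 with both rules, row 3 with `rule2` only, and marks only row 1 for back-up.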
Note: While constraint programming is in theory a paradigm for solving combinatorial problems, and in practice can speed up problem development and execution, I think this is different from a constraint-solving problem. The primary purpose is not to optimize constraints before resolution, and probably not even to restrict data domains. Its main concern is to apply rules when data is received and to execute some basic actions (reject a line, accept a line, log ...).
I really hope this question is not too broad and that this is the right place for it.