
I am working on a service that transforms, translates and normalizes records received as semi-structured JSON. The requirements are as follows:

  • The incoming JSON entities of the same type (e.g. person, address) may not have the same attributes.
  • Some attributes may not be present in every entity of a given type.
  • Attributes can be renamed.
  • The incoming JSON entities are initially untyped. Their type can be determined by analyzing the available fields, so I imagine that rules are needed to reclassify entities to their Drools/Java class.
  • It may not be possible to guarantee that the data in a given attribute is always of the same type (though everything can default to string).

Of course, these requirements all run counter to Java's static typing, and comments in other posts (though from several years ago) have pointed out that it is difficult to process JSON with Drools.

Is there a way to apply Drools harmoniously in the above scenario, or are there minimal restrictions (aside from the obvious solution of imposing a strong data model) that would correct the situation?

Toaster
  • I don't know why you pose this question. Apparently you know very well that the data you describe is (or can be) very messy. Also, you have a clear understanding of Drools being based on Java classes. You must have inferred that only hard programming work will provide a solution. If you can formulate rules for normalizing this JSON data, you can code it, if not, you can't. – laune Nov 12 '14 at 12:35
  • I pose the question because maybe there is a way. Considering JSON's popularity there may well be someone who has solved the problem by now. – Toaster Nov 12 '14 at 12:42
  • It's possible that a similarly messy setup has been handled before, but how would that help with your mess? Asking for a generic, configurable solution that can be applied to any mess is hoping for very much (and asking in the wrong forum). – laune Nov 12 '14 at 12:49
  • I respectfully disagree. – Toaster Nov 12 '14 at 13:15
  • I have added a vote to close as this clearly shows that stackoverflow is not intended for running searches for software. Disagreeing with it won't help. – laune Nov 12 '14 at 13:19
  • Well, this is not a software search. This is a request for a solution to a programming problem. You seem threatened by the question. If Drools isn't up to the task then just answer as such and maybe that will be the best answer. – Toaster Nov 12 '14 at 15:02
  • Programming problems are solved by algorithms based on specifications (or, indeed, by existing software meeting those specs). You have vaguely described a "situation", but this isn't a specification. If you have one, and it clearly states all the rules for what to do with a certain JSON data item, then this task can be coded. If the nature of these rules makes it advisable to use a (any) rule-based system, then I'm sure that Drools is well equipped to handle it. No RBS will have a "feature" for effortlessly solving your problem. – laune Nov 12 '14 at 15:51
  • There are rule-based systems that play well with JSON. I would rather use Drools if it can be made to do so as well. I don't know how this would be done, but the world of Java and Drools is large and creativity abounds. (btw - a list of requirements is a partial specification, and sufficient for this question.) – Toaster Nov 12 '14 at 18:35
  • "play well with JSON" I can believe, but is this what you need? You don't have a clean set of JSON objs, you have a heap of JSON-formatted data items. – laune Nov 13 '14 at 17:37
  • "may not ... attributes" (bullet #1) - What makes #2 different from #1? - How would you know about diff. attr names referring to the same attr (#3)? - How would you know which type to infer from which attrs (#4), what is necessary (#2) and what happens if there are contradictions? Is it necessary to investigate all JSON entities mapped to a type before the type of a field can be deduced from parsing the value text (#5)? What about arrays with mixed typed values? I don't see how any RBS (or any other program) can be applied without answers to these questions (and quite a few more, I'd wager). – laune Nov 13 '14 at 17:37

1 Answer


I can think of a few approaches that might work for you:

  1. Parse your JSON into an in-memory structure (with something like Gson or Jackson), and insert those structures as facts into Drools. Then it should be possible to write rules whose LHS (left-hand side) matches the parsed facts. It would also be possible to update the facts through the Gson/Jackson API.
  2. It's possible to write Drools rules directly in Java by creating instances of (if I recall) the RuleImpl class. You can then provide an arbitrary LHS that could parse/match arbitrary JSON however you'd like.
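
To make approach 1 a bit more concrete, here is a minimal sketch of the "classify by available fields" step in plain Java. It assumes the JSON has already been parsed into a `Map<String, Object>` (Jackson's `ObjectMapper.readValue(json, Map.class)` or Gson's `fromJson(json, Map.class)` should give you something like that). The class name, the field names and the `_type` key are all made up for illustration; none of them come from Drools, Gson or Jackson.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
class EntityClassifier {

    // Guess an entity type from the keys that are present.
    // The field names checked here are invented examples.
    static String classify(Map<String, Object> entity) {
        if (entity.containsKey("firstName") || entity.containsKey("surname")) {
            return "person";
        }
        if (entity.containsKey("street") || entity.containsKey("postcode")) {
            return "address";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Stand-in for the parsed form of {"surname": "Lovelace", "age": "36"}
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("surname", "Lovelace");
        record.put("age", "36");

        // Tag the record so later processing can branch on the inferred type.
        record.put("_type", classify(record));
        System.out.println(record); // {surname=Lovelace, age=36, _type=person}
    }
}
```

The same checks could presumably live in rules instead: once the map is inserted as a fact, a rule's LHS can test for the presence of keys and the RHS can set the type tag. Keeping everything as a map with string values also sidesteps the problem of attributes whose type is not stable.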
ungood
  • Is it possible to use Drools to rename attributes? It seems one would need a way to tell the engine which attributes to export after the new attributes had been populated. – Toaster Jan 26 '15 at 14:18
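
One way the renaming and exporting could work, sketched in plain Java under the same assumption that entities are held as maps: keep a rename table from old attribute name to new attribute name, and let that table double as the export list, so only attributes mentioned in it appear in the output. `AttributeRenamer`, `export` and the attribute names are hypothetical; in a Drools setup the table could be populated by rules and applied once they have fired.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Drools.
class AttributeRenamer {

    // Copy only the attributes listed in `renames`, under their new names.
    // Attributes missing from the entity are skipped rather than set to null.
    static Map<String, Object> export(Map<String, Object> entity,
                                      Map<String, String> renames) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> rename : renames.entrySet()) {
            if (entity.containsKey(rename.getKey())) {
                out.put(rename.getValue(), entity.get(rename.getKey()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> entity = new LinkedHashMap<>();
        entity.put("surname", "Lovelace");
        entity.put("internalId", "x-17"); // not in the table, so not exported

        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("surname", "lastName");
        renames.put("zip", "postcode");   // absent from this entity, skipped

        System.out.println(export(entity, renames)); // {lastName=Lovelace}
    }
}
```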