1

Part of my config-file is as follows:

<factFile name="Apps.xml">
  <directory>/home/<account>/Werk/Divers/Prolog/XMLdata/</directory>
  <field>apl_id</field>
  <field>dns_id</field>
  <field>apl_naam_kort</field>
</factFile>

<factFile name="Dienst.xml">
  <directory>/home/<account>/Werk/Divers/Prolog/XMLdata/</directory>
  <field>dns_id</field>
  <field>dns_afkorting</field>
  <field>dns_naam</field>
</factFile>

Each factFile is created by MySQL (mysql -u username -p -X -e 'use schema; select-statement' > Apps.xml)

The number of factFiles can change, as do the number of fields. What I want is to convert the content (values) from each datafile to facts. So

<row>
  <field name="apl_id">1</field>
  <field name="dns_id">7</field>
  <field name="apl_naam_kort">Risk</field>
</row>

should be converted to

assertz(apps(1, 7, Risk)).

What is the best approach to realize this?

Ben Engbers
  • 433
  • 3
  • 12

1 Answers1

2

Since the number of fields may change per row and you are using SWI-Prolog, I believe that using library record (Wielemaker & O'Keefe) is a good approach here. It allows a subset of predicate arguments to be specified and performs type checking, catching some potential errors early on.

Since I do not know your XML Schema here, I have specified 3 sample field arguments:

  1. apl_id of type integer.
  2. dns_id with default value 0 and of type non-negative integer (i.e., nonneg).
  3. apl_naam_kort of type atom.

It is easy to extend the record/1 declaration with additional field names. The arity of the dynamic/1 declaration would have to be upped accordingly.

Since SWI-Prolog comes with very good Web standards support (wise choice to use SWI for this!) it is straightforward to load XML DOM from file(s) (i.e., load_xml/3) and match rows and fields using XPath-like statements (i.e., xpath/3).

:- module(fact_file, [load_fact_file/1]).

:- use_module(library(record)).
:- use_module(library(sgml)).
:- use_module(library(xpath)).

:- record(apps(apl_id:integer, dns_id:nonneg=0, apl_naam_kort)).
:- dynamic(apps/3).

load_fact_file(File):-
  load_xml(File, Dom, []),
  forall(
    xpath(Dom, //row, Row),
    (
      findall(
        NVPair,
        (
          xpath(Row, //field(@name=Name,text), Value1),
          value_conversion(Value1, Value2),
          NVPair =.. [Name,Value2]
        ),
        NVPairs
      ),
      make_apps(NVPairs, Apps),
      assertz(Apps)
    )
  ).

value_conversion(Atom, Number):-
  atom_number(Atom, Number), !.
value_conversion(Atom, Atom).

Example use:

?- load_fact_file(<FILE-PATH>\test.xml').
true.
?- listing(apps).
:- dynamic fact_file:apps/3.
fact_file:apps(1, 7, 'Risk').
fact_file:apps(_, 0, 'Low Risk').
fact_file:apps(1, 7, _).

Contents of file test.xml:

<table>
  <row>
    <field name="apl_id">1</field>
    <field name="dns_id">7</field>
    <field name="apl_naam_kort">Risk</field>
  </row>
  <row>
    <field name="apl_naam_kort">Low Risk</field>
  </row>
  <row>
    <field name="apl_id">1</field>
    <field name="dns_id">7</field>
  </row>
</table>

Notice that missing arguments for which we did not specify a default value now appear as unnamed variables. This is because Prolog has no null value.

Possible improvements w.r.t. the above code:

  1. Integrate value conversion into library record.
  2. Allow fields in library record to be specified in pair notation (i.e., Name-Value or Name=Value) in addition to predicate notation (i.e., Name(Value)). This allows us to leave out code line NVPair =.. [Name,Value2].
  3. It is possible to update the record/1 declaration dynamically. This may be needed in case the set of field names is very large, not known in advance, and/or changing over time.
  4. If an XML Schema is given that uses XML Schema Datatypes (XSD) the value conversions can be automatically derived, e.g., xsd:nonNegativeInteger -> nonneg.
Wouter Beek
  • 3,307
  • 16
  • 29
  • Thanks for your answer. I have another question on how to use the 'content'argument in the xpath-predicate. I tried to pose this question on you site but the contact-form returns with an error (no storage defined) – Ben Engbers Oct 20 '14 at 11:58