
I am considering writing a piece of code (a script, if possible) to convert a human-readable specification (DICOM) into machine-parsable validation rules.

The DICOM standard uses DocBook (XML) to define the relationships between its elements and attributes. For example, the DocBook XML for Scanning Sequence is:

          <tr valign="top">
            <td align="left" colspan="1" rowspan="1">
              <para>Scanning Sequence</para>
            </td>
            <td align="center" colspan="1" rowspan="1">
              <para>(0018,0020)</para>
            </td>
            <td align="center" colspan="1" rowspan="1">
              <para>1</para>
            </td>
            <td align="left" colspan="1" rowspan="1">
              <para>Description of the type of data taken.</para>
              <variablelist spacing="compact">
                <title>Enumerated Values:</title>
                <varlistentry>
                  <term>SE</term>
                  <listitem>
                    <para>Spin Echo</para>
                  </listitem>
                </varlistentry>
                <varlistentry>
                  <term>IR</term>
                  <listitem>
                    <para>Inversion Recovery</para>
                  </listitem>
                </varlistentry>
                <varlistentry>
                  <term>GR</term>
                  <listitem>
                    <para>Gradient Recalled</para>
                  </listitem>
                </varlistentry>
                <varlistentry>
                  <term>EP</term>
                  <listitem>
                    <para>Echo Planar</para>
                  </listitem>
                </varlistentry>
                <varlistentry>
                  <term>RM</term>
                  <listitem>
                    <para>Research Mode</para>
                  </listitem>
                </varlistentry>
              </variablelist>
              <note>
                <para>Multi-valued, but not all combinations are valid (e.g., SE/GR, etc.).</para>
              </note>
            </td>
          </tr>

So I would need to parse this XML Infoset and generate Schematron rules from this set of DICOM keywords. What kind of language can I use that is both efficient and accurate? The language should make it easy to parse XML input and to generate Schematron rules.

malat

1 Answer


As far as I can see, this should be pretty straightforward, so I'd say use any language you like. XML parsers are available everywhere, and generating the Schematron is easy enough with sprintf() or any other string formatting or templating facility. Overall, it might be easiest to do the transform in XSLT: it reads XML natively, and Schematron is itself XML, so you get both parsing and generation for free. But if you don't know XSLT, a language you already know will likely be quicker for you.

-s

PS: If you happen to use Python, be careful about which parser library you use. I have not found all of them reliable. But your data looks clean enough that it won't stress them much, so you should be OK.
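To make the parse-then-generate idea concrete, here is a minimal Python sketch using only the standard library's `xml.etree.ElementTree`. Several things in it are assumptions rather than anything stated in the question: the row is a shortened, namespace-free copy of the excerpt (the real DocBook files may declare a namespace, in which case the lookups would need it), the function names `parse_row` and `to_schematron` are made up for illustration, and the rule context assumes the instance being validated is XML with `DicomAttribute` elements carrying a `tag` attribute and `Value` children. Adjust the context to whatever your instance documents actually look like.

```python
import xml.etree.ElementTree as ET

# ISO Schematron namespace; "sch" is just the conventional prefix.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"
ET.register_namespace("sch", SCH_NS)

# Shortened copy of the table row from the question (assumed namespace-free).
ROW = """<tr valign="top">
  <td><para>Scanning Sequence</para></td>
  <td><para>(0018,0020)</para></td>
  <td><para>1</para></td>
  <td>
    <para>Description of the type of data taken.</para>
    <variablelist spacing="compact">
      <title>Enumerated Values:</title>
      <varlistentry><term>SE</term><listitem><para>Spin Echo</para></listitem></varlistentry>
      <varlistentry><term>IR</term><listitem><para>Inversion Recovery</para></listitem></varlistentry>
      <varlistentry><term>GR</term><listitem><para>Gradient Recalled</para></listitem></varlistentry>
      <varlistentry><term>EP</term><listitem><para>Echo Planar</para></listitem></varlistentry>
      <varlistentry><term>RM</term><listitem><para>Research Mode</para></listitem></varlistentry>
    </variablelist>
  </td>
</tr>"""


def parse_row(row_xml):
    """Return (name, tag, type, enumerated terms) for one attribute row."""
    row = ET.fromstring(row_xml)
    cells = row.findall("td")
    name = cells[0].findtext("para").strip()
    tag = cells[1].findtext("para").strip()
    attr_type = cells[2].findtext("para").strip()
    # Every <term> below the description cell is one enumerated value.
    terms = [t.text.strip() for t in cells[3].iter("term")]
    return name, tag, attr_type, terms


def to_schematron(name, tag, terms):
    """Build one Schematron pattern restricting each value to the given terms."""
    tag_id = tag.strip("()").replace(",", "")  # "(0018,0020)" -> "00180020"
    pattern = ET.Element(f"{{{SCH_NS}}}pattern")
    # Hypothetical instance layout: <DicomAttribute tag="..."><Value>..</Value>
    rule = ET.SubElement(
        pattern, f"{{{SCH_NS}}}rule",
        context=f"DicomAttribute[@tag='{tag_id}']/Value",
    )
    check = ET.SubElement(
        rule, f"{{{SCH_NS}}}assert",
        test=" or ".join(f". = '{t}'" for t in terms),
    )
    check.text = f"{name} {tag} must be one of: {', '.join(terms)}"
    return ET.tostring(pattern, encoding="unicode")


name, tag, attr_type, terms = parse_row(ROW)
print(to_schematron(name, tag, terms))
```

Building the output as an element tree rather than with string formatting means attribute escaping is handled by the library. Because the rule fires once per `Value`, it checks every value of a multi-valued attribute, but it does not capture the "not all combinations are valid" note, which would need hand-written rules.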

TextGeek