16

I started off using an XML file and a parser as a convenient way to store my data.

I want to use a DTD to check the structure of the XML files when they arrive.

Here is my DTD file:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT document (level*)>
<!ELEMENT level (file,fileName?,fileNumber?)>
<!ELEMENT file (#PCDATA)>
<!ELEMENT fileName (#PCDATA)>
<!ELEMENT fileNumber (#PCDATA)>

(note that fileName and fileNumber are actually purely optional)

and

<document>
 <level>
  <file>group1file01</file>
 </level>
 <level>
  <file>group1file02</file>
  <fileName>file 2</fileName>
  <fileNumber>0</fileNumber>
 </level>
...

As it stands, all of this works fine (I use Eclipse's "Validate" option to test it for now).

However, while testing I got what I think is a weird error.

If I do

 <level>
  <fileName>Level 2</fileName>
  <fileNumber>0</fileNumber>
  <file>group1level02</file>
 </level>

changing the order of the lines, Eclipse refuses to validate it.

I was wondering if this is a problem with Eclipse or if the order is actually important.

If the order is important, how can I change the DTD to make it work no matter the ordering of the elements?

I can't really change the XML because I already have all the XML files and the parser written (I know I did it the wrong way round lol).

Rob Kielty
Jason Rogers

5 Answers

9

As Roger said, DTD content models are ordered lists only, but you can use the choice operator | to list every accepted combination:

<!ELEMENT level ((file,fileName?,fileNumber?)|(fileName?,fileNumber?,file))>

Look here, there is an example in the section Choices

Gaim
  • Hmm... I would have thought it would be more flexible. I'll have to go with | because I had to make some assumptions about the order in which the XML is read, lol. Thanks for the solution – Jason Rogers Jan 20 '11 at 08:11
  • 5
    This is not a valid DTD because it is not deterministic. Even if it were valid, it wouldn't allow the child elements in any possible order. – jasso Oct 20 '11 at 08:49
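The second point in the comment above can be checked with a rough sketch. It treats the content model as a regular expression over element names, using one-letter codes that are purely an assumption for illustration (f = file, n = fileName, u = fileNumber). It only tests which child sequences would be accepted; it says nothing about the determinism rule, which a regex engine does not enforce.

```python
import re
from itertools import permutations

# Hypothetical one-letter codes: f = file, n = fileName, u = fileNumber.
# Regex equivalent of (file,fileName?,fileNumber?)|(fileName?,fileNumber?,file).
TWO_BRANCH_MODEL = re.compile(r"(?:fn?u?|n?u?f)")

def wanted_sequences():
    """All orderings with exactly one file and each optional element at most once."""
    result = set()
    for optional in ("", "n", "u", "nu"):
        for perm in permutations("f" + optional):
            result.add("".join(perm))
    return sorted(result)

# Orderings the asker wants but the two-branch model would not accept.
rejected = [s for s in wanted_sequences() if not TWO_BRANCH_MODEL.fullmatch(s)]
print(rejected)  # ['fun', 'nfu', 'ufn', 'unf']
```

So four of the eleven wanted orderings fall through, for example fileName, file, fileNumber.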
8

Declaring unordered lists with occurrence constraints in a DTD often results in long or complicated-looking declarations. One big reason for this is that DTD content models must be deterministic, so even switching to XML Schema doesn't necessarily help.

Here is a DTD declaration for element <level> that contains:

  • exactly 1 <file> element
  • 0-1 <fileName> elements
  • 0-1 <fileNumber> elements
  • in any possible order

code:

<!ELEMENT level ( (file, ((fileName, fileNumber?) | (fileNumber, fileName?))?)
                 |(fileName, ((file, fileNumber?) | (fileNumber, file)))
                 |(fileNumber, ((file, fileName?) | (fileName, file))) )>
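One way to convince yourself that this declaration covers every ordering is to translate it into a regular expression and compare it against the stated requirement by brute force. This is only a sketch: the one-letter codes (f = file, n = fileName, u = fileNumber) and the helper names are assumptions made for illustration, and the check covers acceptance only, not determinism.

```python
import re
from itertools import product

# Hypothetical one-letter codes: f = file, n = fileName, u = fileNumber.
# The regex mirrors the three branches of the <!ELEMENT level ...> declaration.
LEVEL_MODEL = re.compile(
    r"(?:f(?:nu?|un?)?"   # file first, then optionally the other two in either order
    r"|n(?:fu?|uf)"       # fileName first, file must still appear
    r"|u(?:fn?|nf))"      # fileNumber first, file must still appear
)

def is_valid_level(children: str) -> bool:
    """Return True if the child sequence matches the content model."""
    return LEVEL_MODEL.fullmatch(children) is not None

def meets_requirement(children: str) -> bool:
    """Exactly one file, at most one fileName, at most one fileNumber."""
    return (children.count("f") == 1
            and children.count("n") <= 1
            and children.count("u") <= 1)

# Compare model and requirement on every sequence of 0 to 4 children.
mismatches = [
    "".join(seq)
    for length in range(5)
    for seq in product("fnu", repeat=length)
    if is_valid_level("".join(seq)) != meets_requirement("".join(seq))
]
print(mismatches)  # an empty list means the two agree on all tested sequences
```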
jasso
6

You can use the ANY keyword if you don't care too much about strict validation:

<!ELEMENT level ANY>

I faced a similar problem, where these two cases may appear:

<Instructors>
  <Lecturer>
  </Lecturer>
  <Professor>
  </Professor>
</Instructors>

<Instructors>
  <Professor>
  </Professor>
  <Lecturer>
  </Lecturer>
</Instructors>

The only solution I found was this:

<!ELEMENT Instructors ANY>

Maybe there is a better solution, but it works fine for my particular problem.

rendon
  • 7
    It would be better to use: <!ELEMENT Instructors (Lecturer | Professor)*> – Alexander Gryanko Jan 11 '14 at 00:12
  • So it seems. A year ago I couldn't find a better idea. – rendon Jan 11 '14 at 01:37
  • 1
    It seems you were doing the XML Exercise of the Stanford Database MOOC by Jennifer Widom...I got here because of the same problem :-) With <!ELEMENT Instructors (Lecturer | Professor)*> I get no error anymore from xmllint. – Suzana Jan 14 '14 at 13:24
  • Thanks @Suzana_K & Alexander_Gryanko. I was including the operators after the attributes inside the parenthesis... – chaps Sep 24 '14 at 15:53
4

With a DTD the child nodes have to appear in the order listed in the element definition. There is no way to allow for alternative orderings, unless you want to upgrade to an XSD schema.

Addendum: Per @Gaim, you can offer alternative orders using the (a,b,c...)|(b,a,c...) syntax, but this is not really practical for more than, say, 3 nested elements, since an arbitrary order allows for a factorial number of orderings -- 6 for 3 elements, 24 for 4 elements, 120 for 5 elements -- and clever use of ? operators is sure to result in false validation for strange cases.
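To get a feel for that growth, here is a small sketch that prints the number of orderings and builds a factored content model for children that are all required exactly once. The function name any_order_model and the element names a, b, c are hypothetical, and optional children (the ? case) are deliberately not handled. Each choice branch starts with a different element, which is the usual way to keep such a model deterministic.

```python
from math import factorial

def any_order_model(names):
    """Build a factored DTD content model for required children in any order."""
    if len(names) == 1:
        return names[0]
    branches = []
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        # Pick one element to come first, then recurse on the remaining ones.
        branches.append(f"({first}, {any_order_model(rest)})")
    return "(" + " | ".join(branches) + ")"

for n in (3, 4, 5):
    print(n, "elements:", factorial(n), "orderings")

print(any_order_model(["a", "b"]))  # ((a, b) | (b, a))
print(any_order_model(["a", "b", "c"]))
```

The result can be pasted into a declaration such as <!ELEMENT parent ...>, but the string gets long quickly as the number of children grows.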

Roger Halliburton
  • It's for a really small system, so I don't think it would be worth changing to XSD, but I'll look into it. Thanks – Jason Rogers Jan 20 '11 at 08:09
  • 1
    Not strictly true. You can allow for alternate orderings, you just have to explicitly list all the ones you want to allow. There isn't a great deal of difference between the ordering rules allowed in XML schema and a DTD, it's simply less painful to express them in a schema. – Nic Gibson Jan 20 '11 at 11:19
0

If you can guess a sensible upper bound for the number of children of your element, then there is an extremely dirty way to work around the problem. Here is an example for 0-3 children:

<!ELEMENT myUnorderedElement ( (option1 | option2 | option3)?, (option1 | option2 | option3)?, (option1 | option2 | option3)? )>

This allows the element "myUnorderedElement" to have 0-3 children, each of type option1, option2 or option3.
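A quick sketch of what this pattern would accept, again modelled as a regular expression with hypothetical one-letter codes (a = option1, b = option2, c = option3). Note that it also lets the same child repeat and lets the element be empty, so it is looser than "each child at most once". It may also run into the determinism rule mentioned in another answer, since a validator cannot tell which optional slot a given child belongs to; the regex below does not model that.

```python
import re

# Hypothetical one-letter codes: a = option1, b = option2, c = option3.
# Regex equivalent of three optional (option1 | option2 | option3) slots.
SLOTS_MODEL = re.compile(r"[abc]?[abc]?[abc]?")

def accepts(children: str) -> bool:
    """Return True if the child sequence fits into the three optional slots."""
    return SLOTS_MODEL.fullmatch(children) is not None

for sample in ("", "a", "cab", "aaa", "abca"):
    print(repr(sample), accepts(sample))
```

Only the four-child sample is rejected; the empty and the repeated-child samples pass.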

Kefik