I have been banging my head against the wall with this for a while now, and I'm no closer to an answer than I was at the start.

I'm trying to create an XML Schema file to allow for more accurate validation of AIML files, based on the AIML 1.0.1 specification, and I've run into a snag. According to the specification, the <pattern> and (input-side) <that> tags can only allow a couple of child elements (<bot> and/or <eval>) -or- CDATA that can only consist of:

  • alphanumeric characters (a-z, A-Z, or 0-9)
  • spaces
  • one (or both) of two 'wildcard' characters ( * or _ )

Examples of both valid and invalid <pattern> tags might look like this:

<!-- valid PATTERN -->
<pattern>HELLO</pattern>
<pattern>HELLO *</pattern>
<pattern>_ IS FOR SALE</pattern>

<!-- invalid PATTERN -->
<pattern>HOW ARE YOU TODAY?</pattern> <!-- note the question mark -->
<pattern>50%</pattern> <!-- note the percent sign -->

By the way, the current state of the XSD for the <pattern> tag is below, and works to restrict the CDATA to the desired list of characters:

<xs:element name="pattern">
  <xs:complexType mixed="true">
    <xs:simpleContent>
      <xs:extension base="aiml:InputPatternType">
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<xs:simpleType name="InputPatternType">
  <xs:restriction base="xs:string">
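    <!-- Inside a character class "|" is a literal pipe, not alternation, and XSD's \w
         also matches non-ASCII letters and symbols, so the allowed set is listed explicitly. -->
    <xs:pattern value="[a-zA-Z0-9 _\*]*"/>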
  </xs:restriction>
</xs:simpleType>

However, this does not allow the use of the necessary child elements, the code for which is here:

<xs:choice minOccurs="0" maxOccurs="unbounded">
  <xs:element ref="aiml:bot"/>
  <xs:element ref="aiml:eval"/>
</xs:choice>

When I try to incorporate this into the declaration for the pattern element, I get all sorts of errors depending on where I put it, from "unexpected child element" to complaints about "if <complexType> alternative is chosen...", etc. I've read several articles (the most helpful being this one), but as I mentioned, I'm no closer to solving this than when I started last week. I know I'm just missing something simple, but I can't see it, and none of the many SO questions related to this have given me any usable results, as they mostly deal only with child elements and not with how to restrict CDATA to certain characters.

I'm well aware of the fact that the above code uses simpleType and simpleContent instead of complexType and complexContent, but my efforts to use those have met with no success at all, so I posted what is currently (partially) working. Any help with this would be humbly appreciated. Thanks.

Dave Morton

1 Answer

The design you describe cannot be translated without loss into XSD content models. To use the pattern facet to restrict the set of legal characters, your element must have simple content (i.e. a simple type, or a complex type with simple content, which you need only if the element carries attributes). To allow the child elements bot and eval, the element must have complex content. A single element declaration cannot have both.

Among your options:

  1. Declare a complex type with mixed content which allows bot and eval, and use XSD 1.1 assertions to constrain the characters found in the character content.

  2. Define an XML representation which is not exactly what you describe, but which can be validated accurately and which maps 1:1 to what you describe: instead of defining a single pattern element, declare two elements, named input-pattern and bot-eval-pattern (or whatever you like). Define input-pattern as having your InputPatternType, and bot-eval-pattern as having a complex type with the optional repeating choice of aiml:bot and aiml:eval.

    If you wish, you can define an abstract element named pattern and make it the substitution-group head for the two concrete elements input-pattern and bot-eval-pattern; this allows the other content models which refer to pattern to mention just pattern instead of naming the two concrete elements.
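
As a rough sketch of option 1 (not a tested schema for the full AIML vocabulary): the mixed complex type below permits bot and eval children, and an XSD 1.1 assertion checks every text node of pattern against the allowed character set. The namespace URI and the placeholder declarations of bot and eval are assumptions made only so the example stands alone; substitute the real ones from your schema. This needs an XSD 1.1 processor, since a validator that only implements XSD 1.0 will reject xs:assert.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The namespace URI is an assumption; use the one your schema already declares. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:aiml="http://alicebot.org/2001/AIML-1.0.1"
           targetNamespace="http://alicebot.org/2001/AIML-1.0.1"
           elementFormDefault="qualified">

  <!-- Placeholder declarations (hypothetical) so the references below resolve. -->
  <xs:element name="bot">
    <xs:complexType>
      <xs:attribute name="name" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="eval"/>

  <xs:element name="pattern">
    <!-- mixed="true" allows character data between the child elements -->
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="aiml:bot"/>
        <xs:element ref="aiml:eval"/>
      </xs:choice>
      <!-- XSD 1.1 only: every text node that is a direct child of pattern may contain
           only ASCII letters, digits, spaces, "_" and "*". XPath regexes need explicit
           ^ and $ anchors, unlike the xs:pattern facet. -->
      <xs:assert test="every $t in text() satisfies matches($t, '^[A-Za-z0-9 _*]*$')"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that the assertion as written also rejects line breaks and tabs, so a pattern element that is pretty-printed with its children on separate lines would fail; widen the character class if you need to allow that.

Option 2 might look roughly like the following, under the same assumptions about the namespace and the placeholder bot and eval declarations. It works in XSD 1.0, but it changes the element names that appear in instance documents.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The namespace URI and the bot/eval placeholders are assumptions. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:aiml="http://alicebot.org/2001/AIML-1.0.1"
           targetNamespace="http://alicebot.org/2001/AIML-1.0.1"
           elementFormDefault="qualified">

  <xs:element name="bot">
    <xs:complexType>
      <xs:attribute name="name" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="eval"/>

  <xs:simpleType name="InputPatternType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9 _*]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Abstract head: never appears in documents itself. Other content models can
       use ref="aiml:pattern" and will accept either concrete element below. -->
  <xs:element name="pattern" abstract="true"/>

  <!-- Text-only form, restricted by the pattern facet -->
  <xs:element name="input-pattern" type="aiml:InputPatternType"
              substitutionGroup="aiml:pattern"/>

  <!-- Element-only form: any number of bot and eval children, no text -->
  <xs:element name="bot-eval-pattern" substitutionGroup="aiml:pattern">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="aiml:bot"/>
        <xs:element ref="aiml:eval"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
```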

C. M. Sperberg-McQueen
  • Thanks for your answer. I've upvoted it because it gave me a direction to research, but as it didn't actually help solve the problem I can't accept it as an answer. I'm still working on this, though, so I'm hopeful. – Dave Morton Jun 17 '16 at 21:15
  • I should have included an explanation in my comment. Sorry. Sadly, option 2 is not viable for this project, since I'm working within an already existing specification that only allows for a `<pattern>` element, and nothing else. Option 1 may be possible, but it's beyond my current experience, so I'll have to learn a bit more before I can try to implement it. – Dave Morton Jun 17 '16 at 21:23