
Hello, I have a problem restricting the occurrence of an element according to its position in the document. "Position" might not be the proper term, but I couldn't think of a better way to summarise the problem. Let me explain the business logic. The schema is being designed for a play script. In a play, one actor can play multiple roles. A play contains many scenes; in a scene there are several stage directions (Enter/Exit), and between two stage directions the actors give speeches. A sample XML instance looks like this:

<PLAY>
    <CAST>
        <ROLE>
            <ACTOR id="1">Alex</ACTOR>
            <PERSONA>Superman</PERSONA>
        </ROLE>
        <ROLE>
            <ACTOR id="1">Alex</ACTOR>
            <PERSONA>Batman</PERSONA>
        </ROLE>
        <ROLE>
            <ACTOR id="2">John</ACTOR>
            <PERSONA>Hulk</PERSONA>
        </ROLE>
    </CAST>
    <TITLE>Lego Movie</TITLE>
    <SCENE>
        <TITLE>SCENE I</TITLE>
        <STAGEDIR>Enter Superman, Hulk</STAGEDIR>
        <SPEECH>
            <SPEAKER actorID="1">Superman</SPEAKER>
            <LINE>Hahaha</LINE>
        </SPEECH>
        <SPEECH>
            <SPEAKER actorID="2">Hulk</SPEAKER>
            <LINE>Hahaha</LINE>
        </SPEECH>
        <STAGEDIR>Exit Superman, Enter Batman</STAGEDIR>
        <SPEECH>
            <SPEAKER actorID="1">Batman</SPEAKER>
            <LINE>Yo</LINE>
        </SPEECH>
        <SPEECH>
            <SPEAKER actorID="2">Hulk</SPEAKER>
            <LINE>Yo</LINE>
        </SPEECH>
    </SCENE>
</PLAY>

The required restriction is that if an actor plays multiple roles, two of that actor's personas cannot be on stage (and talking to each other) at the same time. For example, Alex plays both Superman and Batman, so they cannot appear on stage together, since they are played by the same person. Currently the schema looks like this:

<xs:complexType name="PLAYTYPE">
    <xs:sequence>
        <xs:element ref="CAST"/>
        <xs:element ref="TITLE"/>
        <xs:element ref="SCENE"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="CASTTYPE">
    <xs:sequence>
        <xs:element ref="ROLE" maxOccurs="unbounded"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="ROLETYPE">
    <xs:sequence>
        <xs:element ref="ACTOR"/>
        <xs:element ref="PERSONA"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="SCENETYPE">
    <xs:sequence>
        <xs:element ref="TITLE"/>
        <xs:element ref="STAGEDIR"/>
        <xs:element ref="SPEECH" maxOccurs="unbounded"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="ACTORTYPE">
    <xs:sequence>
        <xs:element ref="NAME"/>
    </xs:sequence>
    <xs:attributeGroup ref="attlist.ACTOR"/>
</xs:complexType>

<xs:complexType name="STAGEDIRTYPE" mixed="true">
    <xs:sequence>
        <xs:element minOccurs="1" maxOccurs="unbounded" ref="PERSONA"/>
    </xs:sequence>
    <xs:attributeGroup ref="attlist.SPEAKER"/>
</xs:complexType>

<xs:complexType name="SPEECHTYPE">
    <xs:sequence>
        <xs:element ref="SPEAKER"/>
        <xs:element ref="LINE"/>
    </xs:sequence>
    <xs:attributeGroup ref="attlist.ACTOR"/>
</xs:complexType>

<xs:complexType name="SPEAKERTYPE" mixed="true">
    <xs:attributeGroup ref="attlist.SPEAKER"/>
</xs:complexType>

<xs:element name="PLAY" type="PLAYTYPE"/>
<xs:element name="CAST" type="CASTTYPE"/>
<xs:element name="ROLE" type="ROLETYPE"/>
<xs:element name="ACTOR" type="ACTORTYPE"/>
<xs:element name="SCENE" type="SCENETYPE"/>
<xs:element name="STAGEDIR" type="STAGEDIRTYPE"/>
<xs:element name="SPEAKER" type="SPEAKERTYPE"/>
<xs:element name="TITLE" type="xs:string"/>
<xs:element name="PERSONA" type="xs:string"/>
<xs:element name="NAME" type="xs:string"/>
<xs:element name="LINE" type="xs:string"/>

<xs:attributeGroup name="attlist.ACTOR">
    <xs:attribute name="id" use="required"/>
</xs:attributeGroup>

<xs:attributeGroup name="attlist.SPEAKER">
    <xs:attribute name="actorID" use="required"/>
</xs:attributeGroup>

So how can this kind of restriction be expressed in the schema?

helderdarocha
  • I think the schema definition is the wrong layer of logic to define this kind of business logic. It is not a constraint on what kind of data can be represented, but about what data makes sense when applied to the real world. – IMSoP May 24 '14 at 14:15
  • Do you mean that within one `<SCENE>` there cannot be more than one `<SPEAKER>` element with the same `id` and different contents? That could be achieved with an XPath assertion, but you would have to upgrade to XSD 1.1. – helderdarocha May 24 '14 at 14:45
  • @helderdarocha Thanks for the reply. Multiple `<SPEAKER>` elements are allowed in one `<SCENE>`; what is not allowed is multiple `<SPEAKER>` elements with the same "actorId" between two `<STAGEDIR>` elements. For example, in the first stage direction, actors enter the stage. Alex is playing both Superman and Batman, so Superman and Batman cannot appear in the same stage direction. However, in the second stage direction, Superman exits the stage, so now Batman can enter. I was checking out assertions in XSD 1.1 last night, and I will try to achieve this with them. Thanks. – Yingdong Zhang May 25 '14 at 03:05
  • @IMSoP I would think so as well. But it's an actual problem I end up facing at the moment. I read your solution and some other examples which use Assertion and XPath to achieve similar task, I will try and see if this can be done using them. Thanks. – Yingdong Zhang May 25 '14 at 03:20
  • I see. I noted that the `<STAGEDIR>` actually accepts `<PERSONA>` objects. If those `<PERSONA>` objects are the characters that participate in a scene, and if they could have an `actorID` attribute, you could do the consistency check inside the `<STAGEDIR>` and save the trouble of having to check `preceding::` nodes for the speakers in a scene. – helderdarocha May 25 '14 at 03:23
  • I made some comments about that, and some others at the end of the example I posted. – helderdarocha May 25 '14 at 03:24
  • @helderdarocha Hey, sorry for the late reply. I came up with a solution based on your hints. I added an ID to PERSONA as well, and I also added the personaID to the SPEAKER. Then I put all the speeches between two stage directions into a DIALOG element and added an assertion to the DIALOG element: for a given actorID, select the distinct values of personaID from all the SPEAKERs that have that actorID and count them; the result should equal 1. – Yingdong Zhang May 29 '14 at 15:22
  • @helderdarocha ` ` Thanks again for your help! – Yingdong Zhang May 29 '14 at 15:22
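
The DIALOG approach described in the comments above could be sketched roughly as follows. This is only an illustration under assumptions, not the asker's actual code (which did not survive in the comment): the type name `DIALOGTYPE` is hypothetical, and `<SPEAKER>` is assumed to carry a `personaID` attribute in addition to `actorID`.

```xml
<!-- Hypothetical DIALOGTYPE: groups the speeches between two stage directions.
     Assumes SPEAKER has both actorID and personaID attributes. -->
<xs:complexType name="DIALOGTYPE">
    <xs:sequence>
        <xs:element ref="SPEECH" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- Within one dialog, every actorID must map to exactly one personaID. -->
    <xs:assert test="every $actor in distinct-values(SPEECH/SPEAKER/@actorID)
                       satisfies
                         count(distinct-values(
                           SPEECH/SPEAKER[@actorID = $actor]/@personaID)) = 1"/>
</xs:complexType>
```

With this layout the check is scoped to a single dialog, so the same actor may still speak as a different persona in a later dialog of the same scene.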

1 Answer


To enforce the rule that, within any one <SCENE>, all <SPEAKER> elements with the same actorID attribute must contain the same text, you can declare your SCENETYPE with an XPath 2.0 assertion, as shown below:

<xs:complexType name="SCENETYPE">
    <xs:sequence>
        <xs:element ref="TITLE"/>
        <xs:element ref="STAGEDIR"/>
        <xs:element ref="SPEECH" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:assert test="every $speakerId in SPEECH/SPEAKER/@actorID
                       satisfies
                        (every $speaker in SPEECH/SPEAKER[@actorID=$speakerId]
                          satisfies 
                            (SPEECH/SPEAKER[@actorID=$speakerId])[1] = $speaker)"/>
</xs:complexType>

This will work in XSD 1.1, which supports <xs:assert>. You might also be able to do something like this using Schematron extensions for XSD 1.0.
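
For the Schematron route, the same rule might look something like the sketch below. It is written as a standalone ISO Schematron schema rather than as an extension embedded in the XSD, and it assumes a Schematron processor that supports the `xslt2` query binding (needed for `every ... satisfies` and `distinct-values`). The message text is made up for illustration.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
    <sch:pattern>
        <!-- Checked once per SCENE element in the instance document. -->
        <sch:rule context="SCENE">
            <!-- Each actorID must be used with only one speaker name. -->
            <sch:assert test="every $id in distinct-values(SPEECH/SPEAKER/@actorID)
                                satisfies
                                  count(distinct-values(
                                    SPEECH/SPEAKER[@actorID = $id])) = 1">
                An actor speaks as more than one persona in the same scene.
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

The document would then be validated twice: once against the XSD 1.0 schema for structure, and once against the Schematron rules for this constraint.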

With this restriction, this block will validate, since there is a consistent mapping: 1 = Superman, 2 = Hulk:

<SCENE>
    ...
    <SPEECH id="1">
        <SPEAKER actorID="1">Superman</SPEAKER>
        <LINE>Hahaha</LINE>
    </SPEECH>
    <SPEECH id="2">
        <SPEAKER actorID="2">Hulk</SPEAKER>
        <LINE>Hahaha</LINE>
    </SPEECH>
    <SPEECH id="3">
        <SPEAKER actorID="1">Superman</SPEAKER>
        <LINE>Yo</LINE>
    </SPEECH>
    <SPEECH id="4">
        <SPEAKER actorID="2">Hulk</SPEAKER>
        <LINE>Yo</LINE>
    </SPEECH>
</SCENE>

But this will fail validation, since actorID 1 is used for both Superman and Batman:

<SCENE>
    <TITLE>SCENE I</TITLE>
    <STAGEDIR actorID="0"><PERSONA>Superman</PERSONA><PERSONA>Hulk</PERSONA><PERSONA>Batman</PERSONA></STAGEDIR>
    <SPEECH id="1">
        <SPEAKER actorID="1">Superman</SPEAKER>
        <LINE>Hahaha</LINE>
    </SPEECH>
    <SPEECH id="2">
        <SPEAKER actorID="2">Hulk</SPEAKER>
        <LINE>Hahaha</LINE>
    </SPEECH>
    <SPEECH id="3">
        <SPEAKER actorID="1">Batman</SPEAKER>
        <LINE>Yo</LINE>
    </SPEECH>
    <SPEECH id="4">
        <SPEAKER actorID="2">Hulk</SPEAKER>
        <LINE>Yo</LINE>
    </SPEECH>
</SCENE>

If you are still designing this schema, a better option would be to check this consistency in your <STAGEDIR> element, because the same actor could legitimately play two characters in one scene by exiting as one and returning later as the other. Since a <PERSONA> can be played by different actors, you might want some way to associate a currentActorId with it, for example; then, when you list the <PERSONA> elements in a <STAGEDIR>, you can check for consistency there instead of checking the entire scene.
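
A minimal sketch of that idea, assuming each <PERSONA> inside a <STAGEDIR> gets a required `currentActorId` attribute (the attribute and the local PERSONA declaration are hypothetical and not part of the original schema; the `actorID` attribute group on <STAGEDIR> is left out for brevity):

```xml
<!-- Hypothetical variant of STAGEDIRTYPE for XSD 1.1. -->
<xs:complexType name="STAGEDIRTYPE" mixed="true">
    <xs:sequence>
        <!-- Local PERSONA declaration: text content plus the actor reference. -->
        <xs:element name="PERSONA" maxOccurs="unbounded">
            <xs:complexType>
                <xs:simpleContent>
                    <xs:extension base="xs:string">
                        <xs:attribute name="currentActorId" type="xs:string"
                                      use="required"/>
                    </xs:extension>
                </xs:simpleContent>
            </xs:complexType>
        </xs:element>
    </xs:sequence>
    <!-- No two personas listed in the same stage direction share an actor. -->
    <xs:assert test="count(PERSONA) =
                     count(distinct-values(PERSONA/@currentActorId))"/>
</xs:complexType>
```

This only checks the personas listed within a single stage direction. Tracking who is still on stage across several Enter/Exit directions would need extra markup (for example, separate entering and exiting lists) and is not covered here.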

helderdarocha