
I am trying to write a JAPE grammar to link an Anatomy annotation (tagged with a gazetteer list) with a Numeric value (tagged with an earlier phase) and a set of Units (tagged with a gazetteer list). I want to block certain patterns using pure negation on specific other annotations.

Here is my JAPE grammar:

Phase: Test
Input: Anatomy Numeric Units Lookup
Options: control=Appelt negationGrouping=false


Rule: test_1
(

    ( {Anatomy.minorType=="sinus_of_valsalva"} ):context

//Block matching here, allow up to 2 annotations to appear between context and value that are not a Lookup annotation of type 'body_surface_area' OR a Units annotation of type 'cm'.
    ( {!Lookup.minorType=="body_surface_area",
       !Units.minorType=="cm"} )[0,2]

    ( {Numeric} ):value 

    ( {Units.minorType=="cm"} ):unit

):test
--> 
:test
{
    gate.AnnotationSet matchedVar    =(gate.AnnotationSet) bindings.get("value");
    gate.AnnotationSet matchedcontext=(gate.AnnotationSet) bindings.get("context");
    gate.AnnotationSet matchedunit   =(gate.AnnotationSet) bindings.get("unit");
    gate.AnnotationSet matchedAnns   =(gate.AnnotationSet) bindings.get("test");    
    gate.FeatureMap newFeatures      = Factory.newFeatureMap();
    newFeatures.put("vartype","test");
    newFeatures.put("varValue", stringFor(doc, matchedVar));
    newFeatures.put("context", stringFor(doc, matchedcontext));
    if(matchedunit != null) {
        newFeatures.put("unit", stringFor(doc, matchedunit));
    }else {
        newFeatures.put("unit", null);
    }
    outputAS.add(matchedAnns.firstNode(),matchedAnns.lastNode(),"test", newFeatures);
}

I can't get this to work with these very simple examples:

Text: SoV 75 cm

expected annotation --> {vartype=test, context=SoV, value=75, unit=cm}, works

Text: SoV other unimportant text 75 cm

expected annotation --> {vartype=test, context=SoV, value=75, unit=cm}, works

Text: SoV body surface area 75 cm

expected annotation --> blocked (no match), works

Text: SoV is 75 cm

expected annotation --> {vartype=test, context=SoV, value=75, unit=cm}, but I get no match. Why doesn't this match?


My understanding is that pure negation, {!AnnotationX}, is equivalent to matching any annotation type listed in Input, subject to the negated constraints, as explained in this quote:

For reference the rule you have now which has only {!Lookup} constraints probably doesn't do what you think it does. Any {...} element that has only negative constraints is roughly equivalent to: ({A, !Lookup.....} | {B, !Lookup.....} | ....)
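
Applied to the test_1 rule above, the quoted equivalence would presumably expand the negation-only element into one alternative per type on the Input line. The following is only a sketch of that reading, assuming `Input: Anatomy Numeric Units Lookup` and `negationGrouping=false`; it is not confirmed transducer behaviour.

```
// Assumed expansion of
//   {!Lookup.minorType=="body_surface_area", !Units.minorType=="cm"}
// with one alternative for each annotation type listed in Input.
(
  {Anatomy, !Lookup.minorType=="body_surface_area", !Units.minorType=="cm"} |
  {Numeric, !Lookup.minorType=="body_surface_area", !Units.minorType=="cm"} |
  {Units,   !Lookup.minorType=="body_surface_area", !Units.minorType=="cm"} |
  {Lookup,  !Lookup.minorType=="body_surface_area", !Units.minorType=="cm"}
)[0,2]
```

Under this reading, "is" in the reproducible example further down (a Units annotation with minorType "positive_assertion") would satisfy the third alternative, which is why a match is expected for "SoV is 75 cm".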

I suspect I am missing something fundamental here, can anyone enlighten me?

Full reproducible example all in one JAPE, without the need for any gazetteers:

Phase: AnatomyPhase
Input: Token
Options: control=Appelt  negationGrouping=false

Rule: anatomy
(
    {Token.string == "SoV"} 
    
):anatomy
-->
:anatomy.Anatomy = {majorType = "aorta", minorType="sinus_of_valsalva", language="en"}


Phase: UnitsPhase
Input: Token
Options: control=Appelt  negationGrouping=false

Rule: units
(
    {Token.string == "cm"} 
    
):units
-->
:units.Units = {majorType = "length", minorType="cm", language="en"}

Rule: units_2
(
    {Token.string == "is"} 
    
):units
-->
:units.Units = {majorType = "assertion", minorType="positive_assertion", language="en"}


Phase: LookupPhase
Input: Token
Options: control=Appelt  negationGrouping=false

Rule: lookup
Priority: 100
(
    {Token.string == "body"}{Token.string == "surface"}{Token.string == "area"}
    
):lookup
-->
:lookup.Lookup = {majorType = "body_surface_area", minorType="body_surface_area", language="en"}


Phase: TagNumeric
Input: Token SpaceToken Split Lookup
Options: control=Appelt  negationGrouping=false

Rule: double_tagger
Priority: 100
(
    {Token.kind == "number", Token notWithin Lookup} 
    ({SpaceToken, !Split})[0,1] 
    {Token.string ==~ "[.,]"} 
    {Token.kind == "number", Token notWithin Lookup}
    
):double_tagger
-->
:double_tagger.Numeric = {type = "double"}

Rule: int_tagger
Priority: 99
(
    {Token.kind == "number", Token notWithin Lookup}
):int_tagger
-->
:int_tagger.Numeric = {type = "integer"}



Phase: Test
Input: Lookup Numeric Units Anatomy
Options: control=Appelt negationGrouping=false


Rule: test_1
(
    ( {Anatomy.minorType=="sinus_of_valsalva"} ):context
    ( {!Lookup.minorType=="body_surface_area",
       !Units.minorType=="cm"} )[0,2]
    ( {Numeric} ):value 
    ( {Units.minorType=="cm"} ):unit

):test_var 
--> 
:test_var
{
    gate.AnnotationSet matchedVar    =(gate.AnnotationSet) bindings.get("value");
    gate.AnnotationSet matchedcontext=(gate.AnnotationSet) bindings.get("context");
    gate.AnnotationSet matchedunit   =(gate.AnnotationSet) bindings.get("unit");
    gate.AnnotationSet matchedAnns   =(gate.AnnotationSet) bindings.get("test_var");    
    gate.FeatureMap newFeatures      = Factory.newFeatureMap();
    newFeatures.put("varType","test_var");
    newFeatures.put("varValue", stringFor(doc, matchedVar));
    newFeatures.put("context", stringFor(doc, matchedcontext));
    if(matchedunit != null) {
        newFeatures.put("unit", stringFor(doc, matchedunit));
    }else {
        newFeatures.put("unit", null);
    }
    outputAS.add(matchedAnns.firstNode(),matchedAnns.lastNode(),"test", newFeatures);
}

  • Sorry, but I cannot reproduce your example. For text "SoV is 75 cm", I obtained annotation `test {context=SoV, unit=cm, varType=test_var, varValue=75}`. – dedek Jun 30 '22 at 06:44
  • https://imgur.com/a/ITN5WVq – dedek Jun 30 '22 at 06:50

1 Answer


It looks like the problem I was having has something to do with the JAPE Plus Transducer I was using, as the regular JAPE Transducer gives the expected result, as dedek kindly showed using my example code.

Instead of using only negative conditions, I settled on this approach, which does the same thing and works with JAPE Plus, although it is a bit more cumbersome:

(
 {Lookup, 
    !Lookup.minorType=="body_surface_area"} |
 {Units, 
    !Units.minorType=="cm"}
)[0,2]

Leaving out an input annotation from this block has the effect of automatically blocking that type.

Or you can explicitly block all annotations of a particular type with something like the following (where the value 'BLOCK_ALL' never exists, so the negated != test should never succeed):

(
 {Lookup, 
   !Lookup.minorType=="body_surface_area"} |
 {Units, 
   !Units.minorType=="cm"} |
 {Numeric, 
   !Numeric.type != "BLOCK_ALL"} |
 {Anatomy, 
   !Anatomy.majorType != "BLOCK_ALL"} 
)[0,2]

Or explicitly allow any annotation of certain types:

(
 {Lookup, 
   !Lookup.minorType=="body_surface_area"} |
 {Units, 
   !Units.minorType=="cm"} |
 {Numeric} |
 {Anatomy} 
)[0,2]

I'd be interested to know the reason behind this.

  • Negation is one area in which there are behaviour differences between regular JAPE and JAPE_Plus; for `{A, !B}` JAPE_Plus requires _both_ `A` and `B` to be in the `Input` line, whereas regular JAPE only requires `A` (but will still check for "not B"). But I don't know how JAPE_Plus handles pattern elements that _only_ have a negation, sorry. – Ian Roberts Jul 04 '22 at 09:37
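
The Input-line difference described in the comment above can be illustrated with a minimal sketch. The phase name, rule name, and output annotation type here are hypothetical, chosen only for illustration; the point is that both types used in `{Token, !Lookup}` are listed on the Input line, which the comment says JAPE_Plus requires and which should also be accepted by regular JAPE.

```
// Hypothetical phase: names are illustrative only.
// Both Token (positive) and Lookup (negated) appear in Input.
Phase: NegationExample
Input: Token Lookup
Options: control=Appelt

Rule: token_without_lookup
(
  {Token, !Lookup}
):tok
-->
:tok.PlainToken = {rule = "token_without_lookup"}
```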