
I have a PMML document generated by SAS Miner that I can't get properly evaluated using JPMML 1.1.4. JPMML 1.1.4 claims to support PMML 4.2, and the document declares itself to be PMML version 4.2.

Is the FMTWIDTH in the below function "SAS-EM-String-Normalize" proper PMML syntax?

Any ideas why I can't evaluate this function using JPMML?

The function is defined in my TransformationDictionary like this:

<TransformationDictionary>
    <DefineFunction name="SAS-EM-String-Normalize" optype="categorical" dataType="string">
        <ParameterField name="FMTWIDTH" optype="continuous"/>
        <ParameterField name="AnyCInput" optype="categorical"/>
        <Apply function="trimBlanks">
          <Apply function="uppercase">
            <Apply function="substring">
              <FieldRef field="AnyCInput"/>
              <Constant>1</Constant>
              <Constant>FMTWIDTH</Constant>
            </Apply>
          </Apply>
        </Apply>   
    </DefineFunction>
</TransformationDictionary>

And I get the following exception:

Exception in thread "main" org.jpmml.evaluator.TypeCheckException: Expected INTEGER, but got STRING (FMTWIDTH)
    at org.jpmml.evaluator.FieldValue.asInteger(FieldValue.java:125)
    at org.jpmml.evaluator.FunctionRegistry$36.evaluate(FunctionRegistry.java:463)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:38)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:203)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:76)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:43)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:203)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:188)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:58)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:45)
    at org.jpmml.evaluator.ExpressionUtil.evaluateMapValues(ExpressionUtil.java:169)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:87)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:58)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:45)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluateRegressionTable(RegressionModelEvaluator.java:150)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluateClassification(RegressionModelEvaluator.java:107)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluate(RegressionModelEvaluator.java:57)
    at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:65)
    at ValidPMMLTesterRandomScores.randomEvaluation(ValidPMMLTesterRandomScores.java:116)
    at ValidPMMLTesterRandomScores.printModelInformation(ValidPMMLTesterRandomScores.java:94)
    at ValidPMMLTesterRandomScores.readModelFromFile(ValidPMMLTesterRandomScores.java:142)
    at ValidPMMLTesterRandomScores.main(ValidPMMLTesterRandomScores.java:160)

pettinato
  • You should update the JPMML-Evaluator dependency to the latest stable version, which is 1.2.5. – user1808924 Oct 09 '15 at 15:40
  • As part of further debugging, I tried to upgrade to version 1.2.5, but it did not help this specific issue. – pettinato Oct 12 '15 at 21:52
  • If nothing else, then the upgraded version will produce a much more relevant exception stack trace. JPMML-Evaluator version 1.1.4 dates back to May 2014, and many things have been moved around/modified since then. – user1808924 Oct 13 '15 at 08:42
  • The upgrade of PMML consumer software does not help if the PMML document itself is invalid. However, you can use Java PMML APIs, most notably the Visitor API of the JPMML-Model library to perform on-the-fly correction of PMML documents programmatically. The pattern of this "SAS-EM-String-Normalize" problem is very easy to identify and correct (takes around ten to fifteen lines of Java code to create a reusable component). – user1808924 Oct 13 '15 at 08:50

2 Answers


According to the formal definition of the PMML built-in function "substring", it requires a string argument followed by two integer arguments. The PMML code generated by SAS EM instead invokes this function with a string argument, an integer argument, and another string argument: substring($AnyCInput, 1, "FMTWIDTH"). The Constant element holds the literal text "FMTWIDTH", not the value of the parameter with that name.

This PMML fragment can be fixed by accessing the value of the "FMTWIDTH" parameter using the FieldRef element:

<Apply function="substring">
  <FieldRef field="AnyCInput"/>
  <Constant>1</Constant>
  <FieldRef field="FMTWIDTH"/>
</Apply>
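
To make the intent of the corrected fragment concrete, here is a minimal plain-Java sketch of what trimBlanks(uppercase(substring(AnyCInput, 1, FMTWIDTH))) computes. The class and method names are hypothetical and not part of JPMML or SAS EM. It assumes that a length running past the end of the input is clamped to the input length, and it approximates trimBlanks with String.trim(), which also strips other leading and trailing whitespace.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of JPMML or SAS EM.
final class SasEmStringNormalize {

    // Mirrors trimBlanks(uppercase(substring(input, 1, fmtWidth))).
    static String normalize(int fmtWidth, String input) {
        if (fmtWidth < 0) {
            throw new IllegalArgumentException("FMTWIDTH must not be negative");
        }
        // PMML substring positions are 1-based, so start position 1 is Java index 0.
        // Assumption: a length that runs past the end of the input is clamped.
        int end = Math.min(input.length(), fmtWidth);
        String prefix = input.substring(0, end);
        // Locale.ROOT keeps uppercasing independent of the JVM default locale.
        // trim() approximates trimBlanks (it removes all characters <= U+0020).
        return prefix.toUpperCase(Locale.ROOT).trim();
    }
}
```

For example, SasEmStringNormalize.normalize(5, "  abcdefgh") keeps the first five characters, uppercases them, and trims the blanks, yielding "ABC". The length argument has to be a number for this to make sense, which is why the evaluator rejects the string "FMTWIDTH".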

In conclusion, JPMML is correct and SAS EM is wrong.

user1808924

Invalid PMML documents can be corrected on the fly by rearranging the PMML class model object. The Visitor API of the JPMML-Model library is designed exactly for this purpose:

PMML pmml = loadSasEmPMML();

Visitor invalidSubstringCorrector = new AbstractVisitor(){

    @Override
    public VisitorAction visit(Apply apply){
        if(isInvalidSubstring(apply)){
            List<Expression> expressions = apply.getExpressions();

            expressions.set(2, new FieldRef(new FieldName("FMTWIDTH")));
        }
        return super.visit(apply);
    }

    private boolean isInvalidSubstring(Apply apply){
        if(("substring").equals(apply.getFunction())){
            List<Expression> expressions = apply.getExpressions();

            Expression lengthArgument = expressions.get(2);
            if(lengthArgument instanceof Constant){
                Constant constant = (Constant)lengthArgument;
                return ("FMTWIDTH").equals(constant.getValue());
            }
        }
        return false;
    }
};

invalidSubstringCorrector.applyTo(pmml);

Currently, the method isInvalidSubstring(Apply) identifies problematic Apply elements by checking only whether the third expression element is the String constant "FMTWIDTH". To be extra sure, it would be a good idea to add assertions about the first and second expression elements as well.
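
As an illustration of such stricter checks, here is a rough sketch that applies the same correction to the raw XML using only the JDK's DOM API instead of the JPMML-Model class model. The class SubstringArgumentFixer and its methods are hypothetical names. It assumes the first argument is always a FieldRef and the second is always the constant 1, as in the SAS EM fragment from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; works on raw XML, not on the JPMML class model.
final class SubstringArgumentFixer {

    // Returns the XML text with every matching length argument replaced by a FieldRef.
    static String fix(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // "*" matches Apply elements regardless of the PMML namespace version.
            NodeList applies = doc.getElementsByTagNameNS("*", "Apply");
            for (int i = 0; i < applies.getLength(); i++) {
                Element apply = (Element) applies.item(i);
                List<Element> args = childElements(apply);
                if (isInvalidSubstring(apply, args)) {
                    Element lengthArg = args.get(2);
                    Element fieldRef =
                        doc.createElementNS(lengthArg.getNamespaceURI(), "FieldRef");
                    fieldRef.setAttribute("field", "FMTWIDTH");
                    apply.replaceChild(fieldRef, lengthArg);
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite the PMML document", e);
        }
    }

    // Stricter than checking the third argument alone: all three arguments must match.
    static boolean isInvalidSubstring(Element apply, List<Element> args) {
        return "substring".equals(apply.getAttribute("function"))
            && args.size() == 3
            && "FieldRef".equals(args.get(0).getLocalName())
            && isConstant(args.get(1), "1")
            && isConstant(args.get(2), "FMTWIDTH");
    }

    static boolean isConstant(Element element, String text) {
        return "Constant".equals(element.getLocalName())
            && text.equals(element.getTextContent().trim());
    }

    // Collects element children only, skipping whitespace text nodes.
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                result.add((Element) n);
            }
        }
        return result;
    }
}
```

Calling SubstringArgumentFixer.fix(xmlText) before handing the document to the evaluator should leave every other Apply element untouched, because an element is rewritten only when all three arguments match the known SAS EM pattern.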

user1808924