
A simple question: I'm stuck on a scenario where I want to pass several pieces of information, in addition to the pipeline itself, inside a PMML file.

Other information like:

  • Average of all columns in the dataset: avg(col1), ..., avg(coln)
  • P values of all features.
  • Correlation of all features with target.

There could be more of these, but that is the general situation. I know they could easily be sent in a separate file made for that purpose, but since they relate to the ML model, I want everything in a single file: the PMML.

The Question is:

Can we add extra information to a PMML file, information that might not be directly related to the model, so that it can be used on the receiving side?

If that is possible somehow, it would be very helpful.

JM Gelilio
Aayush Shah
  • I have found this: https://openscoring.io/blog/2015/05/15/jpmml_model_api_vendor_extensions/ where the workaround seems to be adding the data under a tag in Java. There may be better solutions, which are most welcome! – Aayush Shah Jun 20 '22 at 13:56

1 Answer

The PMML standard is extensible with custom elements. However, in this case, all your needs appear to be served by existing PMML elements.

  • Average of all columns in the dataset: avg(col1), ..., avg(coln)
  • P values of all features

You can store descriptive statistics about features using the ModelStats element.
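As a rough illustration of the ModelStats route, the sketch below builds a `ModelStats` fragment that records one mean per column, using only the Python standard library. The helper name `build_model_stats` and the column names are hypothetical. The fragment is left without the PMML namespace for brevity, and it assumes the result is later placed inside a model element (for example a `RegressionModel`) at the position the PMML schema allows, so validate the final file against the schema.

```python
import xml.etree.ElementTree as ET


def build_model_stats(column_means):
    """Build a ModelStats fragment with one UnivariateStats entry per column.

    `column_means` is a hypothetical dict mapping field name -> mean value.
    """
    stats = ET.Element("ModelStats")
    for field, mean in column_means.items():
        # UnivariateStats references the field by name.
        uni = ET.SubElement(stats, "UnivariateStats", {"field": field})
        # NumericInfo carries summary values such as the mean as attributes.
        ET.SubElement(uni, "NumericInfo", {"mean": repr(float(mean))})
    return stats


fragment = build_model_stats({"col1": 2.5, "col2": 10.0})
print(ET.tostring(fragment, encoding="unicode"))
```

A consumer can read the values back with an ordinary XPath lookup on `UnivariateStats` and `NumericInfo`.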

  • Correlation of all features with target

You can store correlation information using the ModelExplanation element, which can hold a Correlations child element.
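For values that no existing element covers, the custom-element route goes through the `Extension` element, which carries `extender`, `name` and `value` attributes. The following is a minimal sketch using only the Python standard library, not a definitive implementation. The toy PMML document, the helper names `add_extension` and `read_extensions`, the extender label `my-team` and the `training_rows` entry are all assumptions made for illustration. The sketch places the `Extension` as the first child of `Header`, which matches my reading of the schema, but the result should still be validated against the PMML XSD for your version.

```python
import xml.etree.ElementTree as ET

# Namespace of PMML 4.4; adjust to the version your exporter writes.
PMML_NS = "http://www.dmg.org/PMML-4_4"
ET.register_namespace("", PMML_NS)  # keep the default namespace on output

# Toy document for illustration only; a real file would come from your exporter.
MINIMAL_PMML = f"""<PMML xmlns="{PMML_NS}" version="4.4">
  <Header description="toy model"/>
  <DataDictionary numberOfFields="1">
    <DataField name="x" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""


def add_extension(pmml_text, name, value, extender="my-team"):
    """Return the PMML text with an extra Extension element in the Header."""
    root = ET.fromstring(pmml_text)
    header = root.find(f"{{{PMML_NS}}}Header")
    ext = ET.Element(
        f"{{{PMML_NS}}}Extension",
        {"extender": extender, "name": name, "value": str(value)},
    )
    header.insert(0, ext)  # Extension elements come first inside Header
    return ET.tostring(root, encoding="unicode")


def read_extensions(pmml_text):
    """Collect every Extension in the document as a name -> value dict."""
    root = ET.fromstring(pmml_text)
    return {e.get("name"): e.get("value")
            for e in root.iter(f"{{{PMML_NS}}}Extension")}


updated = add_extension(MINIMAL_PMML, "training_rows", 1500)
print(read_extensions(updated))  # {'training_rows': '1500'}
```

Attribute values are always strings in XML, so the receiving side has to convert them back to numbers itself.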

user1808924
  • Luckily, the things I want to store are already supported. But as I said, I might want to add more, so I would appreciate an example of adding custom elements to PMML. I think there is an element called "Extension"; does it play any role in adding custom content? I am quite unsure, so could you provide a small code example? Thank you for the response! – Aayush Shah Jun 21 '22 at 06:15