Nokogiri versions 1.10.5 and later produce different validation errors from version 1.10.4 when validating against the ONIX v2.1 and v3 XSDs. I can't find any commits in that release, or in the changed dependencies (libxml2 2.9.10 and libxslt 1.1.34), that would cause such a change in behaviour.

I have created a small example of invalid ONIX XML (ff is not an allowed list entry in <ProductForm>ff</ProductForm>):

XSD_FILE = 'reference/ONIX_BookProduct_Release2.1_reference.xsd'.freeze

VALIDATION_NODE = <<-END_XML.freeze
  <?xml version=\"1.0\" encoding=\"UTF-8\"?>
  <ONIXMessage xmlns=\"http://www.editeur.org/onix/2.1/reference\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"http://www.editeur.org/onix/2.1/reference #{XSD_FILE}\">
    <Header>
      <FromCompany>Company</FromCompany>
      <ToCompany>OtherCompany</ToCompany>
      <SentDate>20140912</SentDate>
    </Header>
    <Product>
      <RecordReference>9780954575915</RecordReference>
      <NotificationType>03</NotificationType>
      <ProductIdentifier>
        <!-- GTIN-13 -->
        <ProductIDType>03</ProductIDType>
        <IDValue>9780954575915</IDValue>
      </ProductIdentifier>
      <ProductIdentifier>
        <!-- ISBN -->
        <ProductIDType>15</ProductIDType>
        <IDValue>9780954575915</IDValue>
      </ProductIdentifier>
      <ProductIdentifier>
        <ProductIDType>01</ProductIDType>
        <IDTypeName>Publisher's Reference</IDTypeName>
        <IDValue>publishers_reference</IDValue>
      </ProductIdentifier>
      <ProductForm>ff</ProductForm>
      <ProductFormDetail>B401</ProductFormDetail>
    </Product>
  </ONIXMessage>
END_XML

I then wrote a test for it that validates against the ONIX XSD:

gem 'nokogiri', '1.10.4'
require 'nokogiri'
require 'minitest/autorun'
require_relative '../reference/constants.rb'

class Test < Minitest::Spec
  describe "XSD validation of invalid ONIX 2.1 XML using Nokogiri 1.10.4" do
    it "should produce validation errors" do
      product = Nokogiri::XML(VALIDATION_NODE)
      errors = Nokogiri::XML::Schema(File.open(XSD_FILE)).
               validate(product).
               map(&:to_s)

      # declutter the errors
      errors.map! {|e| e.gsub("{http://www.editeur.org/onix/2.1/reference}", "")}

      assert_equal [
        "26:0: ERROR: Element 'ProductForm': [facet 'enumeration'] The value 'ff' is not an element of the set {'00', 'AA', 'AB', 'AC', 'AD', 'AE', 'AF', 'AG', 'AH', 'AI', 'AJ', 'AK', 'AL', 'AZ', 'BA', 'BB', 'BC', 'BD', 'BE', 'BF', 'BG', 'BH', 'BI', 'BJ', 'BK', 'BL', 'BM', 'BN', 'BO', 'BP', 'BZ', 'CA', 'CB', 'CC', 'CD', 'CE', 'CZ', 'DA', 'DB', 'DC', 'DD', 'DE', 'DF', 'DG', 'DH', 'DI', 'DJ', 'DK', 'DL', 'DM', 'DN', 'DO', 'DZ', 'FA', 'FB', 'FC', 'FD', 'FE', 'FF', 'FZ', 'MA', 'MB', 'MC', 'MZ', 'PA', 'PB', 'PC', 'PD', 'PE', 'PF', 'PG', 'PH', 'PI', 'PJ', 'PK', 'PL', 'PM', 'PN', 'PO', 'PP', 'PQ', 'PR', 'PS', 'PT', 'PZ', 'VA', 'VB', 'VC', 'VD', 'VE', 'VF', 'VG', 'VH', 'VI', 'VJ', 'VK', 'VL', 'VM', 'VN', 'VO', 'VP', 'VZ', 'WW', 'WX', 'XA', 'XB', 'XC', 'XD', 'XE', 'XF', 'XG', 'XH', 'XI', 'XJ', 'XK', 'XL', 'XM', 'XZ', 'ZA', 'ZB', 'ZC', 'ZD', 'ZE', 'ZF', 'ZG', 'ZH', 'ZI', 'ZJ', 'ZY', 'ZZ'}.",
        "26:0: ERROR: Element 'ProductForm': 'ff' is not a valid value of the atomic type 'List7'.",
        "8:0: ERROR: Element 'Product': Missing child element(s). Expected is one of ( ProductFormDetail, ProductFormFeature, BookFormDetail, ProductPackaging, ProductFormDescription, NumberOfPieces, TradeCategory, ProductContentType, ContainedItem, ProductClassification ).",
      ],
      errors
    end
  end
end

The test passes with version 1.10.4 but fails with 1.10.5 and above, because those versions no longer return the following error:

"26:0: ERROR: Element 'ProductForm': 'ff' is not a valid value of the atomic type 'List7'."
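
To make the difference between the two runs explicit, the error arrays can be compared directly. This is only a minimal sketch: `diff_errors` is a hypothetical helper, not part of Nokogiri, and the first and third messages are abbreviated stand-ins for the full strings in the test above.

```ruby
# Hypothetical helper (not part of Nokogiri): report which messages differ
# between two arrays of validation error strings.
def diff_errors(before, after)
  { dropped: before - after, added: after - before }
end

# Stand-ins for the errors captured under 1.10.4; the first and third
# entries are abbreviated here, the second is the message quoted above.
errors_1_10_4 = [
  "26:0: ERROR: Element 'ProductForm': [facet 'enumeration'] (abbreviated)",
  "26:0: ERROR: Element 'ProductForm': 'ff' is not a valid value of the atomic type 'List7'.",
  "8:0: ERROR: Element 'Product': Missing child element(s). (abbreviated)"
]

# Under 1.10.5 and above the 'List7' message is no longer returned.
errors_1_10_5 = errors_1_10_4.reject { |e| e.include?("atomic type 'List7'") }

result = diff_errors(errors_1_10_4, errors_1_10_5)
puts "Dropped: #{result[:dropped].inspect}"
puts "Added:   #{result[:added].inspect}"
```

In a real run, the two arrays would come from executing the same validation under each gem version and saving the output.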

The issue I raised in the gem repo has a zip file of the test fixtures attached. You can run them with $ ruby -Ilib:test 1.10.4/test_case.rb and $ ruby -Ilib:test 1.10.5/test_case.rb

snowangel
  • You might want to ask this on the Nokogiri mail-list or the IRC channel. – the Tin Man Jan 05 '20 at 20:10
  • Is this really a false positive or just a different set of error messages? This might also be related to a change in libxml2. Which version do you use exactly? The one bundled with Nokogiri? – nwellnhof Jan 06 '20 at 11:29
  • @nwellnhof well, 1.10.4 produces 3 errors, and 1.10.5 produces 2 errors, which are identical to two of the three original errors. So I *think* it's a false positive... unless I'm using the term wrong. Yes, I'm using the versions of libxml2 and libxslt bundled with Nokogiri (linked to in my post) not least because ultimately we're deploying to Heroku so I want to fiddle with binaries as little as possible. I bet it is to do with libxml2 changes as their version bump that caused the Nokogiri version bump has a huge number of changes. But sadly I'm unversed in C... – snowangel Jan 06 '20 at 12:59

0 Answers