How to match an element in any namespace with Nokogiri

Question

I want to find an element like this:

xml1 = '<period>2017-02-10</period>'

or

xml2 = <<XML
<root xmlns:xbrli="http://www.w3.org/1999/xhtml">
  <xbrli:period>2017-02-10</xbrli:period>
</root>
XML

I can select the element by:

  require 'nokogiri'

  def period_from_xml(xml)
    doc = Nokogiri::XML(xml)
    # Document#namespaces returns keys that include the "xmlns:" prefix
    if doc.namespaces.key?('xmlns:xbrli')
      doc.at_css('xbrli|period')
    else
      doc.at_css('period')
    end
  end

  period_from_xml(xml1)
  # => <period>2017-02-10</period>
  period_from_xml(xml2)
  # => <xbrli:period>2017-02-10</period>

I know about Nokogiri::XML::Document#remove_namespaces!, but I don't want to use it because I need the namespaces elsewhere.

Maybe keeping both the original doc and a copy without namespaces is a good idea?

Is there an easy and simple way to handle this situation?

Please read "[mcve]". Your input XML sample needs to be better as it's missing the namespace declarations you're trying to find. — the Tin Man, Feb 13 '17 at 22:17

score 0 · Answer 1 · edited May 23 '17 at 11:54

I'd use this:

require 'nokogiri'

xml = <<EOT
<root xmlns:xbrli="http://www.w3.org/1999/xhtml">
  <period>2017-02-10</period>
  <xbrli:period>2017-02-11</xbrli:period>
</root>
EOT

doc = Nokogiri::XML(xml)

doc.search('period,xbrli|period').map(&:text) # => ["2017-02-10", "2017-02-11"]

'period,xbrli|period' in CSS means "find 'period' or 'xbrli:period'": the comma separates two alternative selectors, and a node matching either one is returned.

See "How to avoid joining all text from Nodes when scraping" also.

answered Feb 13 '17 at 22:24

the Tin Man

Sorry, my original question did not include enough information about what I wanted to know. I have edited it. – ironsand Feb 17 '17 at 03:26
