I am trying to detect and work around this bug in RSS elements. That means I have to find a wrong namespace declaration and change its value to the correct namespace. E.g.:

xmlns:media="http://search.yahoo.com/mrss" 

must be:

xmlns:media="http://search.yahoo.com/mrss/" 

How can I achieve that given an org.w3c.dom.Document?

In the meantime I found out how to get all elements of a certain namespace:

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xpath = xpf.newXPath();
        // Select every element whose namespace URI is the old one (no trailing slash)
        XPathExpression expr = xpath.compile("//*[namespace-uri()='http://search.yahoo.com/mrss']");

        Object result = expr.evaluate(d, XPathConstants.NODESET);
        if (result != null) {
            NodeList nodes = (NodeList) result;
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                this.log.warn("Found old mediaRSS namespace declaration: " + n.getTextContent());
            }
        }

So now I have to figure out how to change the namespace of a Node via JAXP.
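One way this could be approached directly on the DOM is `Document.renameNode` from DOM Level 3, which can move an element into another namespace while keeping its qualified name. The following is only a minimal sketch, assuming the document was parsed namespace-aware and the DOM implementation supports `renameNode`. The class name `MediaRssNamespaceFixer` and its methods `parse` and `fix` are made up for illustration. It also rewrites matching `xmlns` declarations, but it does not touch attributes that are themselves in the old namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class MediaRssNamespaceFixer {
    public static final String OLD_NS = "http://search.yahoo.com/mrss";
    public static final String NEW_NS = "http://search.yahoo.com/mrss/";

    // Parses XML text into a namespace-aware DOM (needed for getNamespaceURI()).
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Moves every element from OLD_NS to NEW_NS and returns how many were renamed.
    public static int fix(Document doc) {
        // Copy the live NodeList first so that renaming cannot disturb the iteration.
        NodeList all = doc.getElementsByTagName("*");
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            elements.add((Element) all.item(i));
        }

        int renamed = 0;
        for (Element e : elements) {
            // Rewrite xmlns / xmlns:prefix declarations that still carry the old URI.
            NamedNodeMap attrs = e.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr a = (Attr) attrs.item(i);
                if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(a.getNamespaceURI())
                        && OLD_NS.equals(a.getValue())) {
                    a.setValue(NEW_NS);
                }
            }
            // Changing the declaration alone is not enough: each element node
            // stores its own namespace URI, so it has to be renamed as well.
            if (OLD_NS.equals(e.getNamespaceURI())) {
                doc.renameNode(e, NEW_NS, e.getNodeName());
                renamed++;
            }
        }
        return renamed;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<rss xmlns:media=\"" + OLD_NS + "\">"
                + "<media:title>t</media:title></rss>");
        System.out.println("Renamed elements: " + fix(doc));
    }
}
```

Because the check is on the namespace URI rather than on the prefix, the sketch should not depend on the feed author having used `media:` as the prefix.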

er4z0r

2 Answers

1

You could probably do it with XSLT, with a rule like this:

<xsl:template match="media:*">
   <xsl:element name="{local-name()}" namespace="http://search.yahoo.com/mrss/">
      <xsl:apply-templates select="node()|@*"/>
   </xsl:element>
</xsl:template>

where media is bound to "http://search.yahoo.com/mrss".

You may have to tweak the syntax a little, as I'm writing this without the help of a compiler. Also, the output will probably not be very nicely formatted (namespace declarations on many elements), but it should be logically correct.

Chris Lercher
  • Thanks for your reply. However, I am accessing the document at the object level. I am also not sure whether the local prefix will always be "media:". After all, these are RSS feeds made by other people. God knows what prefix they use :-/ – er4z0r Mar 14 '10 at 22:13
  • They don't have to! You can use any prefix in the XSLT (e.g. "x:*"), all that matters is the namespace. (In other words, the prefix you use in XSLT doesn't have anything to do with the prefix in the XML file.) – Chris Lercher Mar 14 '10 at 22:19
  • @er4z0r - the namespace prefix that you declare in your XSLT (i.e. media) does not have to match the namespace prefix in the source document. As long as they both refer to the same URI, the template will match. – Mads Hansen Mar 14 '10 at 22:20
  • Just to see if I got you right: your XSLT would look for all elements whose prefix represents the "wrong" namespace and then set the namespace of these directly to the correct namespace? – er4z0r Mar 14 '10 at 22:22
  • Yes, that's right. You could also try to match "xmlns" attributes, and change them (to get a nicer XML, if you care). But you'll have to change the elements anyway in addition to that. – Chris Lercher Mar 14 '10 at 22:27
  • BTW, there may be more performant ways than XSLT, if you're already starting with a DOM document - but you didn't ask that in your original version of the question, so I answered based on the assumption that you were working on XML files. – Chris Lercher Mar 14 '10 at 22:32
  • Why would I have to change the elements anyway after changing the xmlns attribute? – er4z0r Mar 15 '10 at 00:12
  • Let's say your input is: [XML example not preserved]. Now if you change the xmlns attribute to "other", you'll end up with: [XML example not preserved]. This is because every element node is associated with a namespace, and the element b is still in namespace "example". BTW, not all XML processors will even allow changing the xmlns attribute, but if they do, you will still have to change the element namespaces, because the recursive effect of xmlns attributes is mainly a cosmetic thing that makes human-readable XML files look cleaner. – Chris Lercher Mar 15 '10 at 10:27
  • O.K. Maybe I really should have a look into XSLT and try that. – er4z0r Mar 15 '10 at 12:19
  • @chris_I: Sorry, I am trying to solve this via the DOM. Can you tell me what the equivalent XPath selector would be for your code above? – er4z0r Mar 19 '10 at 13:32
0

Just for the sake of completeness:

Java Code:

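import java.io.InputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Inside the method that produces the feed:
Document d = out.outputW3CDom(converted);
DOMSource oldDocument = new DOMSource(d);
DOMResult newDocument = new DOMResult();
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xsltsource = new StreamSource(getStream(MEDIA_RSS_TRANSFORM_XSL));
Transformer transformer = tf.newTransformer(xsltsource);
// After this call the corrected tree is available via newDocument.getNode()
transformer.transform(oldDocument, newDocument);

// Loads the stylesheet from the classpath, trying "/name" first and then "name"
private InputStream getStream(String fileName) {
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    InputStream xslStream = cl.getResourceAsStream("/" + fileName);
    if (xslStream == null) {
        xslStream = cl.getResourceAsStream(fileName);
    }
    return xslStream;
}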

Stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!--identity transform that will copy matched node/attribute to the output and apply templates for its children and attached attributes-->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="@*|*|text()" />
        </xsl:copy>
    </xsl:template>

    <!--Specialized template to match on elements with the incorrect namespace and generate a new element-->
    <xsl:template match="*[namespace-uri()='http://search.yahoo.com/mrss']">
        <xsl:element name="{local-name()}" namespace="http://search.yahoo.com/mrss/" >
            <xsl:apply-templates select="@*|*|text()" />
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
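To try the stylesheet without setting up a classpath resource, something like the following sketch might help. It inlines the same two templates as a string, runs them against a DOM, and lets you inspect the namespace of the result. The class name `MediaRssTransformCheck`, its `transform` method, and the sample feed are assumptions made for illustration; the input is assumed to be parsed namespace-aware.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical test harness, not part of the original code base.
public class MediaRssTransformCheck {
    // Same templates as the stylesheet above, inlined for a quick experiment.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='node()|@*'>"
      + "<xsl:copy><xsl:apply-templates select='@*|*|text()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"*[namespace-uri()='http://search.yahoo.com/mrss']\">"
      + "<xsl:element name='{local-name()}' namespace='http://search.yahoo.com/mrss/'>"
      + "<xsl:apply-templates select='@*|*|text()'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses the feed namespace-aware and returns the transformed document.
    public static Document transform(String feedXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document input = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(feedXml)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        DOMResult result = new DOMResult();
        transformer.transform(new DOMSource(input), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample feed that uses the old namespace URI.
        Document fixed = transform(
            "<rss xmlns:media=\"http://search.yahoo.com/mrss\">"
          + "<media:title>t</media:title></rss>");
        Node title = fixed.getDocumentElement().getFirstChild();
        System.out.println(title.getLocalName() + " is in " + title.getNamespaceURI());
    }
}
```

Note that `xsl:element` with `local-name()` drops the original prefix, so the rewritten elements are likely to carry a default namespace declaration instead of `media:`.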

Special thanks to Mads Hansen for his help with the XSLT.

er4z0r