I have a WordPress site with custom taxonomies. I send newsletters automatically with Mailchimp from each taxonomy feed. Most feeds work, but those with a quote (apostrophe) in the title are invalid.

For example, this feed, whose title is "Val d'Oise", is invalid: https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwww.verdi-immobilier.com%2Fdepartements%2F95-val-doise%2Ffeed%2F

The validator returns the error "XML parsing error: <unknown>:11:24: undefined entity". After testing, it is indeed the quote that causes the problem.

Here is the feed:

    <?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    >

<channel>
    <title>95 &#8211; Val-d&rsquo;oise &#8211; Verdi Immo</title>
    <atom:link href="https://www.verdi-immobilier.com/departements/95-val-doise/feed/" rel="self" type="application/rss+xml" />
    <link>https://www.verdi-immobilier.com</link>
    <description>Le dernier recours des propriétaires</description>
    <lastBuildDate>2019-11-01 06:24:28</lastBuildDate>
    <language>fr-FR</language>
    <sy:updatePeriod>
    hourly  </sy:updatePeriod>
    <sy:updateFrequency>
    1   </sy:updateFrequency>
    
<image>
    <url>https://www.verdi-immobilier.com/wp-content/uploads/2019/09/cropped-logo-ico-32x32.png</url>
    <title>95 &#8211; Val-d&rsquo;oise &#8211; Verdi Immo</title>
    <link>https://www.verdi-immobilier.com</link>
    <width>32</width>
    <height>32</height>
</image> 
</channel>
</rss>

The &rsquo; entity does not seem to be interpreted. Does anyone know how to fix it?

Jason Aller

2 Answers

Wrong answer: This is not a quote:

&#8211;

WordPress converts it to a dash:

https://en.wikipedia.org/wiki/Dash

I first assumed the dash was not a valid UTF-8 character and suggested trying this encoding:

<?xml version="1.0" encoding="UTF-16"?>

Edit: Right answer: You are right, the problem is the &rsquo; entity, which is not defined in XML.

Can you try replacing the ’ in the title of your post with

&#8217;

(which is valid and represents the same character)? The front end should still show ’, and hopefully the XML output will contain a validly encoded character as well.

Replace:

Val-d’oise

with:

Val-d&#8217;oise

in the post title.

It is dirty, but I hope this helps. I think WordPress had a similar bug years ago.
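
If editing each title by hand is not practical, the same substitution can be automated. The following is a minimal sketch in Python, not WordPress code: it assumes you can post-process the feed text before it is served, and the function name named_to_numeric is hypothetical. It rewrites HTML named entities as numeric character references and leaves the five entities that XML predefines untouched.

```python
import re
from html.entities import name2codepoint

# The five entities XML itself predefines; these must be left as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_numeric(text):
    """Replace HTML named entities (e.g. &rsquo;) with numeric references.

    Hypothetical helper for illustration; unknown names are left unchanged.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep XML-safe or unrecognised entities
        return f"&#{name2codepoint[name]};"

    # Only names starting with a letter match, so existing numeric
    # references such as &#8211; are not touched.
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


print(named_to_numeric("95 &#8211; Val-d&rsquo;oise &#8211; Verdi Immo"))
```

On a WordPress site the equivalent logic would have to live in PHP, for example in a filter on the feed title; this sketch only illustrates the transformation itself.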

Regards, Tom

Uklove

I don't think it's the &#8211; that's causing the problem; it's the &rsquo;. You can test this by pasting your XML text into the XML validator at https://xmlvalidation.com/

Substituting &#8217; for the &rsquo; seems to pass the test on that validator page.
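
The same check can be run locally. This is a small sketch using Python's standard xml.etree.ElementTree parser on a trimmed-down copy of the title line (not the whole feed); the helper name parse_title is hypothetical. XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so a parser with no DTD rejects &rsquo; but accepts the numeric reference.

```python
import xml.etree.ElementTree as ET

# Stand-ins for the feed's <title> line, before and after the substitution.
BROKEN = "<title>95 &#8211; Val-d&rsquo;oise &#8211; Verdi Immo</title>"
FIXED = BROKEN.replace("&rsquo;", "&#8217;")


def parse_title(xml_text):
    """Return the parsed title text, or the parser's error message."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return f"ParseError: {err}"


print(parse_title(BROKEN))  # reports a parse error for the undefined entity
print(parse_title(FIXED))   # prints the title with the typographic apostrophe
```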

I don't know how to make your WordPress feed emit &#8217; instead. That's your next project. Good luck!

richb-hanover