I have been trying to parse this URL (http://app.calvaryccm.com/mobile/android/v1/devos) using the SAX parser found here: http://android-er.blogspot.com/2010/05/simple-rss-reader-iii-show-details-once.html. I am stuck on how to handle the description tag within the XML. I have tried it with and without a CDATA section, and nothing seems to help. It's almost as if the link is being read into the description.

The first part works just fine:

[screenshot]

The problem happens when I try to access the inner page. It's almost as if the link tag is getting read before the description tag is.

[screenshot]

I can't get the description tag to display correctly. Thank you for your help!

EDIT: The full source code for this application is here: http://dl.dropbox.com/u/19136502/CCM.zip

Courtney Stephenson

1 Answer

Ouch. After about 3 hours of digging through and analyzing your source code, I've found the reason you are getting the weird result shown above.

First, look at the RSS content from the link you are parsing: http://app.calvaryccm.com/mobile/android/v1/devos

Here is part of its content:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>CCM Daily Devotions</title>
    <link>http://www.calvaryccm.com/resources/dailydevotions.aspx</link>
    <description>Calvary Chapel Melbourne's Daily Devotionals</description>
    <webMaster>webmaster@calvaryccm.com (Calvary Chapel Melbourne)</webMaster>
    <copyright>(c)2011, Calvary Chapel Melbourne. All rights reserved</copyright>
    <ttl>60</ttl>
    <item>
      <guid isPermaLink="false">b3e91cbf-bbe9-4667-bf4c-8ff831ba09f1</guid>
      <title>Teachable Moments</title>
      <description>Based on &amp;ldquo;Role Models, Part 4&amp;rdquo; by Pastor Mark Balmer; 10/8-9/11, Message #6078; Daily Devotional #6 - &amp;ldquo;Teachable Moments&amp;rdquo; Preparing the Soil (Introduction): My husband and I took seriously our understanding of God&amp;rsquo;s instructions to teach His commandments to our children. (Deuteronomy 6:7) We went to our local Christian bookstore and bought children&amp;rsquo;s Bibles, studies, coloring books, games&amp;mdash;anything that would help us to communicate biblical situations in their lives. Planting and Watering the Seed (Growth): Each parent needs to take seriously God&amp;rsquo;s commthe Crop (Action/Response): Life is God&amp;rsquo;s classroom for teachable moments. A long delay in traffic can be a frustrating irritation, or it can be an opportunity to teach our children that God&amp;rsquo;s than taught. Cultivating (Additional Reading): Psalm 78:1-8;&amp;nbsp;Psalm 145:4 klw Calvary Chapel of Melbourne; 2955 Minton Road; W. Melbourne, FL 32904; 321-952-9673 NLT = New Living Translation. </description>
      <link>http://www.calvaryccm.com/resources/dailydevotions.aspx</link>
      <pubDate>Sun, 16 Oct 2011 12:00:00 GMT</pubDate>
    </item>

Look closely at the /rss/channel/item/description element. It is full of sequences such as &amp;rsquo;, &amp;ldquo;, &amp;rdquo;, &amp;mdash; and &amp;nbsp;. These are escaped characters (left and right single quotes, ampersands, left and right double quotes, even newlines) sitting inside the XML content, and the ampersand that starts each of them is itself escaped as &amp;.

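When the XML parser reaches one of these sequences, it treats it as an entity reference and handles it separately from the surrounding text. A SAX parser may then report the text of a single element in several pieces rather than as one string, which leads to the weird result you are seeing right now.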
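One thing worth checking in the handler: SAX is allowed to deliver the text of one element through several characters() calls, and parsers often split at each entity reference. A handler that stores only the text from a single call can end up with just a fragment of the description. The following is a minimal sketch of accumulating the text until the end tag instead. The class name RssTextHandler and the helper firstItemDescription are hypothetical names for illustration; they are not taken from the linked tutorial or from your project.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of the first <item><description>.
class RssTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inItem;
    private String description;

    // Works whether or not the parser is namespace-aware.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("item".equals(name(localName, qName))) {
            inItem = true;
        }
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String element = name(localName, qName);
        if (inItem && "description".equals(element) && description == null) {
            description = text.toString().trim();
        }
        if ("item".equals(element)) {
            inItem = false;
        }
    }

    // Convenience wrapper that parses an XML string and returns the collected text.
    static String firstItemDescription(String xml) {
        try {
            RssTextHandler handler = new RssTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.description;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss><channel><description>feed</description><item>"
                + "<description>God&amp;rsquo;s classroom &amp;mdash; daily</description>"
                + "</item></channel></rss>";
        System.out.println(firstItemDescription(xml));
    }
}
```

Note that the parser only decodes &amp; here, so the collected text still contains names such as &rsquo; and needs a second unescaping step before it is displayed.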

What about a solution? My first thought is to download the content of the URL, unescape those characters (replace each escaped sequence with the character it stands for), and then parse the result.
This might work, but I am not sure it will, because the RSS text returned by the server is in a really odd, badly formatted state. So if you can contact the site's administrator, ask them to clean up the RSS content (escape special characters consistently, remove the stray newline characters, and so on) before publishing the feed.
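As a rough sketch of the unescaping step, assuming only the entity names visible in the excerpt above (plus lsquo) need handling, a small lookup table is enough. The class EntityUnescaper and its methods are hypothetical names for illustration, and the table is deliberately incomplete.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: replaces a few named HTML entities with real characters.
class EntityUnescaper {
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&ldquo;", "\u201C"); // left double quote
        ENTITIES.put("&rdquo;", "\u201D"); // right double quote
        ENTITIES.put("&lsquo;", "\u2018"); // left single quote
        ENTITIES.put("&rsquo;", "\u2019"); // right single quote
        ENTITIES.put("&mdash;", "\u2014"); // em dash
        ENTITIES.put("&nbsp;", "\u00A0");  // non-breaking space
    }

    // For text that has already been through the XML parser (contains "&rsquo;").
    static String unescape(String text) {
        String result = text;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // For the raw feed before parsing (contains "&amp;rsquo;").
    // Other "&amp;" sequences are left alone so the XML stays well-formed.
    static String fixRawFeed(String rawXml) {
        String result = rawXml;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            String doubleEscaped = "&amp;" + e.getKey().substring(1);
            result = result.replace(doubleEscaped, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(unescape("God&rsquo;s &ldquo;Role Models&rdquo;"));
        System.out.println(fixRawFeed("<description>God&amp;rsquo;s</description>"));
    }
}
```

Either method can be used: fixRawFeed on the downloaded string before it goes to the parser, or unescape on each description after parsing. Which one fits better depends on how the rest of the app reads the feed.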

Another option is to use a third-party library that handles escaping and unescaping, such as StringEscapeUtils from Apache Commons (http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/StringEscapeUtils.html) or JTidy.
However, I don't think these libraries are the best fit for your case.

That's all I can tell.

P.S. A few comments on your source code: consider making it clearer to read and easier to maintain, and reorganize its packages appropriately.

Pete Houston
  • Check out my improved RSS feed, that is what the problem was [http://app.calvaryccm.com/mobile/android/v1/devos](http://app.calvaryccm.com/mobile/android/v1/devos) – Courtney Stephenson Oct 26 '11 at 23:38