I have content:encoded text like the following from an RSS feed:

<content:encoded><![CDATA[<P><B>Wednesday, September 26, 2012</B></P>It is Apple.<P>Shops are closed.<br />Parking is not allowed here. Go left and park.<br />All theatres are opened.<br /></P><P><B>Thursday, September 27, 2012</B></P><P>Shops are open.<br />Parking is not allowed here. Go left and park.<br  />All theatres are opened.<br /></P>]]></content:encoded>

Using the below method I am able to extract the text from the HTML:

public static string StripHTML(this string htmlText)
{
    var reg = new Regex("<[^>]+>", RegexOptions.IgnoreCase);
    return HttpUtility.HtmlDecode(reg.Replace(htmlText, string.Empty));
}

But I want the text within <b></b> to be inserted into dateArray[] and the text within <p></p> to be inserted into descriptionArray[], so that I can display the dates and descriptions separately.

Thanks in advance.

Shan
    C#... you have some good HTML parsers (HtmlAgilityPack, for instance). [This](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) is what is said on Stack Overflow about using regexes to parse HTML. Have fun – Gabber Oct 02 '12 at 20:22

1 Answer

// HtmlAgilityPack: http://htmlagilitypack.codeplex.com/
// Requires "using System.Linq;" for Where/Select/ToList.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

var result = doc.DocumentNode.Descendants()
                .Where(n => n is HtmlAgilityPack.HtmlTextNode)
                .Select(n => new {
                    // Text directly inside a <b> element is treated as a date heading.
                    IsDate = n.ParentNode.Name == "b",
                    Text = n.InnerText,
                })
                .ToList();
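
To get from that flat list to the dateArray[] and descriptionArray[] asked for in the question, one option is to walk the items in document order and attach every non-date text to the most recent date. The following is only a minimal sketch under some assumptions: `TextPiece` and `RssSplitter.Split` are hypothetical names (not part of HtmlAgilityPack), each `TextPiece` stands in for one item of `result` above, and the description lines for a date are joined with "\n".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one item of `result` (IsDate + Text).
public class TextPiece
{
    public bool IsDate;
    public string Text;

    public TextPiece(bool isDate, string text)
    {
        IsDate = isDate;
        Text = text;
    }
}

public static class RssSplitter
{
    // Hypothetical helper: builds parallel arrays so that
    // descriptionArray[i] holds the lines that follow dateArray[i].
    public static void Split(IEnumerable<TextPiece> pieces,
                             out string[] dateArray,
                             out string[] descriptionArray)
    {
        var dates = new List<string>();
        var descriptions = new List<List<string>>();

        foreach (var piece in pieces)
        {
            var text = (piece.Text ?? string.Empty).Trim();
            if (text.Length == 0)
                continue; // skip whitespace-only text nodes

            if (piece.IsDate)
            {
                // A new date starts a new, initially empty description.
                dates.Add(text);
                descriptions.Add(new List<string>());
            }
            else if (descriptions.Count > 0)
            {
                // Attach the line to the most recent date.
                descriptions[descriptions.Count - 1].Add(text);
            }
            // Text that appears before the first date is dropped.
        }

        dateArray = dates.ToArray();
        descriptionArray = descriptions
            .Select(lines => string.Join("\n", lines.ToArray()))
            .ToArray();
    }
}
```

With the list above, this could be called as `RssSplitter.Split(result.Select(r => new TextPiece(r.IsDate, r.Text)), out dateArray, out descriptionArray);`. Keeping the two arrays index-aligned means `dateArray[i]` and `descriptionArray[i]` can be bound to the same row when displaying.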
L.B
  • I get this error: "The type 'System.Xml.XPath.IXPathNavigable' is defined in an assembly that is not referenced. You must add a reference to assembly 'System.Xml.XPath, Version=2.0.5.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'" – Shan Oct 03 '12 at 06:09
  • I got it. I had to add a reference to System.Xml.XPath from the SDK folder. – Shan Oct 03 '12 at 06:54