I have some HTML content like this:

<body>
    <div>
        WINDOW<br/>
        DOOR<br/>
    </div>
</body>

I want to extract all of the text content inside the div tag. For this sample, I would like to get the text WINDOW\nDOOR.

So I wrote the code below.

NSString *html = ...;
TFHpple *parser = [[TFHpple alloc] initWithHTMLData:[html dataUsingEncoding:NSUTF8StringEncoding]];
TFHppleElement *div = [parser searchWithXPathQuery:@"//div"][0];
NSString *text = [div text];

It does not work as I expected. The resulting text is only WINDOW; DOOR is missing.

After struggling with it for a while, I wrote a little more code.

NSString *html = ...;
TFHpple *parser = [[TFHpple alloc] initWithHTMLData:[html dataUsingEncoding:NSUTF8StringEncoding]];
TFHppleElement *div = [parser searchWithXPathQuery:@"//div"][0];
NSString *text = [div raw];
text = [self stringByStrippingHTML:text];

Here I take the raw HTML of the div and strip all HTML tags from it, which gives the result I expected. But this approach seems a little ugly.

So my question is: is there an existing method that returns all of the text content within an HTML tag?

Thanks for your help.

Cao Dongping

1 Answer

Try this: https://github.com/topfunky/hpple

Hpple: A nice Objective-C wrapper on the XPathQuery library for parsing HTML.
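
For the specific case in the question, one possible approach with hpple is to walk the element's children instead of calling text. As far as I can tell, text only returns the content of the first text child, which would explain why only WINDOW comes back. Below is a minimal sketch, not a definitive implementation. AppendText and TextOfFirstDiv are hypothetical helper names, not part of hpple, and the sketch assumes that each br should become a newline and that the whitespace around each text node can be trimmed.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper (not part of hpple): recursively appends the content of
// every descendant text node to `output`, writing "\n" for each <br> element.
static void AppendText(TFHppleElement *element, NSMutableString *output) {
    NSCharacterSet *blank = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    for (TFHppleElement *child in element.children) {
        if ([child isTextNode]) {
            // Assumption: indentation and line breaks from the markup are not wanted.
            [output appendString:[child.content stringByTrimmingCharactersInSet:blank]];
        } else if ([child.tagName isEqualToString:@"br"]) {
            [output appendString:@"\n"];
        } else {
            // Descend into nested elements such as <span> or <b>.
            AppendText(child, output);
        }
    }
}

// Hypothetical helper: returns the collected text of the first <div>, or nil
// if the document contains no <div>.
static NSString *TextOfFirstDiv(NSString *html) {
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];
    TFHppleElement *div = [[parser searchWithXPathQuery:@"//div"] firstObject];
    if (div == nil) {
        return nil;
    }
    NSMutableString *buffer = [NSMutableString string];
    AppendText(div, buffer);
    // Drop the newline produced by a trailing <br/>.
    return [buffer stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

Trimming every text node keeps the sketch short, but it would also remove meaningful spaces next to inline tags (for example in "a <b>b</b>"), so it may need adjusting for other markup. Using firstObject instead of [0] avoids an exception when the query matches nothing.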

Apple Kid