What is the right way to trim characters in foundCharacters of NSXMLParser?

I've found many examples that do it like this:

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
        NSString *characters = [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        // Append to previous characters of the same element
}

Yet a space followed by a non-ASCII character is trimmed when I run the above. Example: "this é" will become "thisé".

hpique

2 Answers

Yet a space followed by a non-ASCII character is trimmed when I run the above. Example: "this é" will become "thisé".

Maybe that's just a coincidence.

See the discussion of parser:foundCharacters: in the documentation:

The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.

So if the parser delivers "this " (with a trailing space you can't see) and "é" as two separate strings, and you trim the whitespace from each one before appending, you will end up with "thisé".
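To make the effect concrete, here is a minimal sketch that needs no parser at all. It assumes the text arrives as two chunks, "  this " and "é\n"; the chunk boundaries and the helper names JoinTrimmingEachChunk and JoinThenTrim are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: trims every chunk on its own and then joins them,
// which mimics trimming inside parser:foundCharacters:.
static NSString *JoinTrimmingEachChunk(NSArray<NSString *> *chunks) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableString *result = [NSMutableString string];
    for (NSString *chunk in chunks) {
        [result appendString:[chunk stringByTrimmingCharactersInSet:whitespace]];
    }
    return result;
}

// Hypothetical helper: joins the chunks first and trims once at the end,
// so whitespace between chunks survives.
static NSString *JoinThenTrim(NSArray<NSString *> *chunks) {
    NSString *joined = [chunks componentsJoinedByString:@""];
    return [joined stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

With those two chunks, the first helper loses the space between the words, while the second only removes the leading and trailing whitespace.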

I've never used NSXMLParser, but maybe you should trim the string in parser:didEndElement:namespaceURI:qualifiedName:. From what I understand, the element will be completed then.

But I'm just guessing.
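A rough sketch of that idea, assuming ARC and a delegate that only cares about the text of leaf elements. The class name TextCollector and its currentText and values properties are hypothetical; only the delegate method signatures come from NSXMLParserDelegate.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that collects the trimmed text of each element.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *currentText;
@property (nonatomic, strong) NSMutableArray<NSString *> *values;
@end

@implementation TextCollector

- (instancetype)init {
    if ((self = [super init])) {
        _values = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start a fresh buffer whenever an element opens.
    self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Append the chunk untouched; it may be only part of the element's text.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    // The element is complete now, so trimming can no longer eat inner spaces.
    NSString *trimmed = [self.currentText stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    if (trimmed.length > 0) {
        [self.values addObject:trimmed];
    }
    self.currentText = [NSMutableString string];
}

@end
```

This simplification resets the buffer on every start tag, so it does not handle mixed content (text interleaved with child elements); a real delegate would probably need a stack of buffers for that.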

Matthias Bauch
  • I can confirm the explanation of the problem -- the parser:foundCharacters: method might run several times while parsing one element -- and the suggested solution. In my case I was losing the spaces around an HTML entity in the middle of an element, so "This & That" became "This&That". I moved the trimming to parser:didEndElement:namespaceURI:qualifiedName:. That method doesn't give direct access to the element string, but you can create an instance variable for it, update the variable from parser:foundCharacters: and access it in parser:didEndElement:namespaceURI:qualifiedName:. – arlomedia Mar 08 '13 at 00:34

This is what I've used in the past to trim whitespace without any problems.

- (NSString *)trim:(NSString *)inStr {
    return [inStr stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
}

The only difference is that it uses a slightly different character set: whitespaceCharacterSet does not include newline characters, whereas whitespaceAndNewlineCharacterSet does.
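For comparison, a small sketch of how the two character sets behave on the same input. The function names TrimSpacesOnly and TrimSpacesAndNewlines are hypothetical wrappers, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper: trims spaces and tabs but leaves newlines alone.
static NSString *TrimSpacesOnly(NSString *s) {
    return [s stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceCharacterSet]];
}

// Hypothetical wrapper: trims spaces, tabs, and newlines.
static NSString *TrimSpacesAndNewlines(NSString *s) {
    return [s stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

Given @" text \n", the first wrapper stops trimming at the trailing newline and returns @"text \n", while the second returns @"text". Both remove leading and trailing spaces, so the choice of set only changes how newlines are treated.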

Jack Cox