I have a homework assignment that requires reading some RSS feeds, building user profiles, and so on.

My problem is that when I use XMLParser from Foundation, I get the error "The operation couldn’t be completed. (NSXMLParserErrorDomain error 9.)"

I checked the documentation, and error 9 appears to be invalidCharacterError. I don't think my code is the problem, since it works well for other feed URLs. What should I do to overcome this?

Here is the URL: http://halley.exp.sis.pitt.edu/comet/utils/_rss.jsp?v=bookmark&user_id=3600

P.S. This feed contains CDATA, so I commented out the title and description cases. It should still display the date, but it still shows that error. So my guess is that the parser encounters an invalid character while parsing the XML and reports the error. Is there any way to fix it? I have to use this URL.

Here is the relevant code:

func parseFeed(url: String, completionHandler: (([RSSItem]) -> Void)?)
{
    self.parserCompletionHandler = completionHandler

    let request = URLRequest(url: URL(string: url)!)
    let urlSession = URLSession.shared
    let task = urlSession.dataTask(with: request) { (data, response, error) in
        guard let data = data else {
            if let error = error {
                print(error.localizedDescription)
            }

            return
        }

        /// parse our xml data
        let parser = XMLParser(data: data)
        parser.delegate = self
        parser.parse()
    }

    task.resume()
}

// MARK: - XML Parser Delegate

func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:])
{
    currentElement = elementName
    if currentElement == "item" {
        currentTitle = ""
        currentDescription = ""
        currentPubDate = ""
    }
}

func parser(_ parser: XMLParser, foundCharacters string: String)
{
    switch currentElement {
//        case "title": currentTitle += string
//        case "description" : currentDescription += string
        case "pubDate" : currentPubDate += string
        default: break
    }
}

func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?)
{
    if elementName == "item" {
        let rssItem = RSSItem(title: currentTitle, description: currentDescription, pubDate: currentPubDate)
        self.rssItems.append(rssItem)
    }
}

func parserDidEndDocument(_ parser: XMLParser) {
    parserCompletionHandler?(rssItems)
}

func parser(_ parser: XMLParser, parseErrorOccurred parseError: Error)
{
    print(parseError.localizedDescription)
}
Geshode
Xiang Liu
  • I know there seems to be a solution at https://stackoverflow.com/questions/3352027/iphone-nsxmlparser-error-9, but it's in Objective-C, which I don't know. So I hope there is a Swift solution. – Xiang Liu Sep 25 '18 at 02:32

1 Answer

I found an invalid byte, 0xFC, inside one of the CDATA sections in the response from the URL you showed.

This byte is not valid in UTF-8, yet the document declares encoding="UTF-8".
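
To check a feed for this kind of problem yourself, something along these lines may help. It is only a sketch: `firstInvalidUTF8Offset` is a hypothetical helper name, not part of Foundation, and it relies on the lossy `String(decoding:as:)` initializer substituting U+FFFD for invalid sequences.

```swift
import Foundation

/// Hypothetical helper: byte offset of the first sequence that is not valid UTF-8,
/// or nil when the whole buffer decodes cleanly.
/// Caveat: if the data already contains a genuine U+FFFD before the bad byte,
/// the reported offset points at that character instead.
func firstInvalidUTF8Offset(in data: Data) -> Int? {
    // Strict decoding returns nil when the bytes are not valid UTF-8.
    guard String(data: data, encoding: .utf8) == nil else { return nil }
    // Lossy decoding replaces each invalid sequence with U+FFFD.
    let repaired = String(decoding: data, as: UTF8.self)
    guard let index = repaired.unicodeScalars.firstIndex(of: "\u{FFFD}") else { return nil }
    // Everything before the first replacement is unchanged, so its UTF-8 length is the offset.
    return repaired.utf8.distance(from: repaired.utf8.startIndex, to: index)
}

// "Müller" encoded as ISO Latin 1: the second byte is 0xFC.
let sample = Data([0x4D, 0xFC, 0x6C, 0x6C, 0x65, 0x72])
if let offset = firstInvalidUTF8Offset(in: sample) {
    let byte = sample[sample.startIndex + offset]
    print("invalid byte 0x" + String(byte, radix: 16) + " at offset \(offset)")
}
```

Running this on the downloaded `data` before handing it to XMLParser shows where the parser is likely to stop.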

You should tell the engineer responsible for the server that the XML of the RSS feed is invalid.

If you need to work with this sort of ill-formed XML, you need to convert it to valid UTF-8 data first.

0xFC represents ü in ISO Latin 1, so you can write something like this:

func parseFeed(url: String, completionHandler: (([RSSItem]) -> Void)?)
{
    self.parserCompletionHandler = completionHandler

    let request = URLRequest(url: URL(string: url)!)
    let urlSession = URLSession.shared
    let task = urlSession.dataTask(with: request) { (data, response, error) in
        guard var data = data else { //###<-- `var` here
            if let error = error {
                print(error.localizedDescription)
            }

            return
        }

        //### When the input `data` cannot be decoded as a UTF-8 String,
        if String(data: data, encoding: .utf8) == nil {
            //Interpret the data as an ISO-LATIN-1 String,
            let isoLatin1 = String(data: data, encoding: .isoLatin1)!
            //And re-encode it as a valid UTF-8
            data = isoLatin1.data(using: .utf8)!
        }

        /// parse our xml data
        let parser = XMLParser(data: data)
        parser.delegate = self
        parser.parse()
    }

    task.resume()
}

If you need to handle other encodings, the problem becomes far more difficult, because it is hard to detect the text encoding reliably.
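
If you do have a short list of likely encodings, one rough approach is to try them in order and keep the first that decodes. The sketch below assumes exactly that; `reencodedAsUTF8` and its `candidates` parameter are hypothetical names, and the order matters because ISO Latin 1 assigns a character to every byte value, so it belongs last. This is guessing, not detection: a wrong candidate that happens to decode will silently produce wrong text.

```swift
import Foundation

/// Hypothetical helper: returns the input re-encoded as UTF-8, using the first
/// candidate encoding that can decode it. Returns nil if none of them can.
func reencodedAsUTF8(_ data: Data,
                     candidates: [String.Encoding] = [.utf8, .isoLatin1]) -> Data? {
    for encoding in candidates {
        // String(data:encoding:) returns nil when the bytes do not fit the encoding.
        if let text = String(data: data, encoding: encoding) {
            return text.data(using: .utf8)
        }
    }
    return nil
}

// 0x4D 0xFC is "Mü" in ISO Latin 1; as UTF-8, ü becomes the two bytes 0xC3 0xBC.
let repaired = reencodedAsUTF8(Data([0x4D, 0xFC]))
```

Note that this only makes sense when the XML declaration says encoding="UTF-8", as it does for this feed; otherwise the declaration and the re-encoded bytes would disagree.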


You may also need to implement func parser(_ parser: XMLParser, foundCDATA CDATABlock: Data), but that is a separate issue.
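
For reference, a delegate that handles CDATA might look roughly like the following. This is a minimal sketch, not the asker's class: `TitleCollector` and its properties are hypothetical, it only collects `title` elements, and it assumes the CDATA block is delivered as UTF-8 bytes.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // XMLParser lives in this module on Linux
#endif

/// Hypothetical delegate that collects the text of every <title>, CDATA included.
final class TitleCollector: NSObject, XMLParserDelegate {
    private(set) var titles: [String] = []
    private var currentElement = ""
    private var currentTitle = ""

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
        currentElement = elementName
        if elementName == "title" { currentTitle = "" }
    }

    // Plain character data inside the element.
    func parser(_ parser: XMLParser, foundCharacters string: String) {
        if currentElement == "title" { currentTitle += string }
    }

    // CDATA arrives as raw bytes; decode them and append like ordinary text.
    func parser(_ parser: XMLParser, foundCDATA CDATABlock: Data) {
        guard currentElement == "title",
              let text = String(data: CDATABlock, encoding: .utf8) else { return }
        currentTitle += text
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "title" { titles.append(currentTitle) }
        // Reset so whitespace between elements is not appended to the last one.
        currentElement = ""
    }
}

let xml = "<rss><item><title><![CDATA[Hello & welcome]]></title></item></rss>"
let collector = TitleCollector()
let cdataParser = XMLParser(data: Data(xml.utf8))
cdataParser.delegate = collector
cdataParser.parse()
```

The same `foundCDATA` method could be added next to the existing `foundCharacters` in the code above, appending to `currentTitle` or `currentDescription` depending on `currentElement`.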

OOPer
  • I actually tried foundCDATA and tried to convert the data to a String, but it is still not working. – Xiang Liu Sep 25 '18 at 04:01
  • As I wrote, implementing `parser(_:foundCDATA:)` is a separate issue from **NSXMLParserErrorDomain error 9**. Please do not just say _not working_; describe the result you get with my code above. – OOPer Sep 25 '18 at 04:03
  • I am very sorry. I have tried your solution and it works now. I deleted the 'data.dump()' line because it said there is no dump, and it still works. Would you mind telling me why you added 'data.dump()'? Again, thank you for your solution, and please excuse my behavior. – Xiang Liu Sep 25 '18 at 04:20
  • Sorry, that is a remnant of the old code I used to find the cause of your issue. Deleting the line is the right fix. As you see, answers may contain mistakes; do not hesitate to point them out, but it is far better to include a description of what _not working_ means. – OOPer Sep 25 '18 at 04:25