Parsing HTML in Swift with Kanna

Question

I'm trying to parse an HTML page in my Swift project using Kanna. I used the question "What is the best practice to parse html in swift?" as a guide.

This is the code I'm using to parse the html:

if let doc = Kanna.HTML(html: myHTMLString, encoding: String.Encoding.utf8) {
    let bodyNode = doc.body

    if let inputNodes = bodyNode?.xpath("//a/@href[ends-with(.,'.txt')]") {
        for node in inputNodes {
            print(node.content)
        }
    }
}

I don't have any experience with this, but I believe I have to change the .xpath("//a/@href[ends-with(.,'.txt')]") call to get the specific information I need.

This is the HTML I'm trying to parse:

view-source:https://en.wikipedia.org/wiki/List_of_inorganic_compounds

What I want from this line is the title, "Aluminium antimonide", and the chemical formula, "AlSb".

Can anybody tell me what to write in .xpath(...), or explain how it works?

score 1 · Answer 1 · answered Mar 27 '17 at 17:30

Swift 3

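To get all items with a loop:

// Each compound is an <li> inside the main content div
for item in doc.xpath("//div[@class='mw-content-ltr']/ul/li") {
    // The title attribute of the link; nil if the item has no <a> child
    print(item.at_xpath("a")?["title"] ?? "no title")
    // The whole text of the item; further processing may be needed here
    print(item.text ?? "")
}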

Or to access a specific item:

print(doc.xpath("//div[@class='mw-content-ltr']/ul/li")[0].at_xpath("a")?["title"])
print(doc.xpath("//div[@class='mw-content-ltr']/ul/li")[0].text)
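Because item.text returns the whole text of a list item, a further step is needed to separate the name from the formula. Below is a minimal sketch of one way to do that. It assumes the text has the form name, en dash, formula (for example "Aluminium antimonide", an en dash, then "AlSb"); that separator is an assumption about the page's markup, so check the page source first. splitCompound is a hypothetical helper written for this illustration, not part of Kanna.

```swift
import Foundation

// Hypothetical helper (not part of Kanna): splits the text of one list item
// into a compound name and a formula.
// Assumption: name and formula are separated by an en dash (U+2013).
func splitCompound(_ text: String) -> (name: String, formula: String)? {
    let parts = text.components(separatedBy: "\u{2013}")
    // Without a separator there is nothing to split
    guard parts.count >= 2 else { return nil }
    let name = parts[0].trimmingCharacters(in: .whitespacesAndNewlines)
    let formula = parts[1].trimmingCharacters(in: .whitespacesAndNewlines)
    guard !name.isEmpty, !formula.isEmpty else { return nil }
    return (name, formula)
}

// A hard-coded string stands in for item.text here
if let compound = splitCompound("Aluminium antimonide \u{2013} AlSb") {
    print(compound.name)    // Aluminium antimonide
    print(compound.formula) // AlSb
}
```

Inside the loop above, the helper could be called on the unwrapped item.text; items whose text does not contain the separator simply yield nil and can be skipped.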

You can check XPath tutorials and documentation for more.
