
Trying to retrieve: "17,02" from the HTML below:

<div class="overflow-auto">
    <table class="w-100 tl mb4 mt3 f6" cellspacing="0">
        <thead>
            <tr>
                <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Kvalitet</th>
                <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Pris inkl. mva.</th>
                <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Endring</th>
                <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Gjeldene fra</th>
            </tr>
        </thead>
        <tbody class="lh-copy">
            <tr>
                <td class="pv3 pr3 bb b--black-20"><img src="./assets/95 Miles.png" alt="95 Miles"></td>
                <td class="pv3 pr3 bb b--black-20">Kr 17,02</td>
                <td class="pv3 pr3 bb b--black-20">5 øre</td>
                <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
            </tr>
        </tbody>
        <tbody class="lh-copy">
            <tr>
                <td class="pv3 pr3 bb b--black-20"><img src="./assets/D Miles.png" alt="D Miles"></td>
                <td class="pv3 pr3 bb b--black-20">Kr 15,80</td>
                <td class="pv3 pr3 bb b--black-20">5 øre</td>
                <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
            </tr>
        </tbody>
        <tbody class="lh-copy">
            <tr>
                <td class="pv3 pr3 bb b--black-20"><img src="./assets/95 Miles Plus.png" alt="95 Miles"></td>
                <td class="pv3 pr3 bb b--black-20">Kr 18,01</td>
                <td class="pv3 pr3 bb b--black-20">5 øre</td>
                <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
            </tr>
        </tbody>
        <tbody class="lh-copy">
            <tr>
                <td class="pv3 pr3 bb b--black-20"><img src="./assets/D Miles Plus.png" alt="D Miles"></td>
                <td class="pv3 pr3 bb b--black-20">Kr 16,79</td>
                <td class="pv3 pr3 bb b--black-20">5 øre</td>
                <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
            </tr>
        </tbody>
    </table>
</div>

I've tried using this code in Swift:

let titles = try doc.getElementsByClass("pv3 pr3 bb b--black-20").array()

But when I try to print it out I get nil back. Does anyone have any solutions or ideas?

Derek 朕會功夫
Shift

1 Answer

To select an element that belongs to at least one of many classes, separate these classes with a comma:

let tds: [Element] = try doc.select(".pv3, .pr3, .bb, .b--black-20").array()
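Note that the comma form above is an OR, so it will likely also match the th header cells, which share classes such as bb and pr3. As far as I can tell, the call in the question comes back empty because getElementsByClass looks up a single class name, and no element has a class literally named "pv3 pr3 bb b--black-20". A minimal sketch of the AND form, where the class selectors are chained without spaces and restricted to td (the helper name `cellTexts` is made up for illustration):

```swift
import SwiftSoup

// Returns the text of every <td> that carries all four classes.
// `cellTexts` is a hypothetical helper name, not part of SwiftSoup.
func cellTexts(in html: String) throws -> [String] {
    let doc: Document = try SwiftSoup.parse(html)
    // No spaces between the class selectors: each one narrows the same element.
    let cells: Elements = try doc.select("td.pv3.pr3.bb.b--black-20")
    return try cells.array().map { try $0.text() }
}
```

For the table in the question this should return the text of every body cell and skip the header row.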

Use this to select the second td:

let doc: Document = try SwiftSoup.parse(html)
let td: Element = try doc.select("tbody tr td").array()[1]
let text: String = try td.text()

The selector "tbody tr td" matches every td inside a tr inside a tbody. Since the cell we want is the second match, we convert the result to an array and take the second element with the subscript [1].
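One caveat: the subscript [1] traps at runtime if the page ever has fewer than two matching cells. A hedged sketch of the same lookup with a bounds check, returning nil instead of crashing (`secondCellText` is a hypothetical helper name):

```swift
import SwiftSoup

// Returns the text of the second matching <td>, or nil if there are fewer than two.
// `secondCellText` is a hypothetical helper name, not part of SwiftSoup.
func secondCellText(in html: String) throws -> String? {
    let doc: Document = try SwiftSoup.parse(html)
    let cells: [Element] = try doc.select("tbody tr td").array()
    // Guard the index so a changed or empty page does not crash the app.
    guard cells.count > 1 else { return nil }
    return try cells[1].text()
}
```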

If you're sure you want just the second td in your HTML document, the selector can be shortened:

let td: Element = try doc.select("td").array()[1]

If you want every td in your table whose text starts with "Kr ":

let tds: [Element] = try doc.select("tr td").array().filter { try $0.text().starts(with: "Kr ")}
let labels: [String] = try tds.map {try $0.text()}
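If you would rather pick the price column by position than by its text, a positional selector may work. This sketch assumes SwiftSoup supports the `:eq(n)` pseudo-selector the way jsoup does, with a zero-based sibling index, so `td:eq(1)` is the second cell of each row (`secondColumnTexts` is a hypothetical helper name):

```swift
import SwiftSoup

// Returns the text of the second cell of every body row.
// `secondColumnTexts` is a hypothetical helper name, not part of SwiftSoup.
func secondColumnTexts(in html: String) throws -> [String] {
    let doc: Document = try SwiftSoup.parse(html)
    // Assumption: `:eq(1)` selects elements whose zero-based sibling index is 1.
    let cells: Elements = try doc.select("tbody tr td:eq(1)")
    return try cells.array().map { try $0.text() }
}
```

This keeps working if the "Kr " prefix ever changes, but breaks if the column order does, so which variant is safer depends on the page.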

If you want the text of these tds without the "Kr " prefix:

let tds: [Element] = try doc.select("td").array().filter { try $0.text().starts(with: "Kr ")}
let titlesWithoutKr: [String] = try tds.map {try String($0.text().dropFirst(3))}
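The filter and the dropFirst(3) above both depend on the same prefix, once as a string and once as a length. A small sketch that ties the two together, using only the standard library (`stripPrefix` and `sampleTexts` are made-up names for illustration):

```swift
// Strips `prefix` from the start of `text`; returns nil when the prefix is absent.
// `stripPrefix` is a hypothetical helper name.
func stripPrefix(_ prefix: String, from text: String) -> String? {
    guard text.hasPrefix(prefix) else { return nil }
    // Drop exactly as many characters as the prefix has, instead of a hard-coded 3.
    return String(text.dropFirst(prefix.count))
}

// Cell texts as they would come out of one row of the table.
let sampleTexts = ["", "Kr 17,02", "5 øre", "24.08.2018"]
// compactMap filters and strips in one pass.
let prices = sampleTexts.compactMap { stripPrefix("Kr ", from: $0) }
print(prices)  // ["17,02"]
```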

Here is the final code:

do {
    let html: String =  """
                        <div class="overflow-auto">
                            <table class="w-100 tl mb4 mt3 f6" cellspacing="0">
                                <thead>
                                    <tr>
                                        <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Kvalitet</th>
                                        <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Pris inkl. mva.</th>
                                        <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Endring</th>
                                        <th class="fw6 bb b--black-20 tl pb3 pr3 bg-white tl">Gjeldene fra</th>
                                    </tr>
                                </thead>
                                <tbody class="lh-copy">
                                    <tr>
                                        <td class="pv3 pr3 bb b--black-20"><img src="./assets/95 Miles.png" alt="95 Miles"></td>
                                        <td class="pv3 pr3 bb b--black-20">Kr 17,02</td>
                                        <td class="pv3 pr3 bb b--black-20">5 øre</td>
                                        <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
                                    </tr>
                                </tbody>
                                <tbody class="lh-copy">
                                    <tr>
                                        <td class="pv3 pr3 bb b--black-20"><img src="./assets/D Miles.png" alt="D Miles"></td>
                                        <td class="pv3 pr3 bb b--black-20">Kr 15,80</td>
                                        <td class="pv3 pr3 bb b--black-20">5 øre</td>
                                        <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
                                    </tr>
                                </tbody>
                                <tbody class="lh-copy">
                                    <tr>
                                        <td class="pv3 pr3 bb b--black-20"><img src="./assets/95 Miles Plus.png" alt="95 Miles"></td>
                                        <td class="pv3 pr3 bb b--black-20">Kr 18,01</td>
                                        <td class="pv3 pr3 bb b--black-20">5 øre</td>
                                        <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
                                    </tr>
                                </tbody>
                                <tbody class="lh-copy">
                                    <tr>
                                        <td class="pv3 pr3 bb b--black-20"><img src="./assets/D Miles Plus.png" alt="D Miles"></td>
                                        <td class="pv3 pr3 bb b--black-20">Kr 16,79</td>
                                        <td class="pv3 pr3 bb b--black-20">5 øre</td>
                                        <td class="pv3 pr3 bb b--black-20">24.08.2018</td>
                                    </tr>
                                </tbody>
                            </table>
                        </div>
                        """
    let doc: Document = try SwiftSoup.parse(html)
    let tds: [Element] = try doc.select("td").array().filter { try $0.text().starts(with: "Kr ")}
    let titlesWithoutKr: [String] = try tds.map {try String($0.text().dropFirst(3))}
    print(titlesWithoutKr)
} catch Exception.Error( _, let message) {
    print(message)
} catch {
    print("error")
}

And it prints ["17,02", "15,80", "18,01", "16,79"].

For more on how to use SwiftSoup, consult the SwiftSoup documentation.

ielyamani