I want to get data from a news website that has no RSS feed. I only need the article titles, and I am using this code:

// Requires `import Foundation`.
if let url = URL(string: "http://www.gulf-times.com/stories/c/192/0/Sport") {
    let task = URLSession.shared.dataTask(with: url) { data, response, error in
        // Stop if the request failed or returned no body.
        guard error == nil, let data = data else {
            print(error?.localizedDescription ?? "no data")
            return
        }
        // Decode the raw bytes into a string before printing.
        let urlContent = String(data: data, encoding: .ascii)
        print(urlContent ?? "could not decode response")
    }
    task.resume()
}

The problem is that I am unable to extract the title values:

[Image: screenshot of the page source with the title values highlighted]

iDeveloper
  • What is the value of `urlContent`? – Vipin Kumar Nov 20 '17 at 07:52
  • I didn't get you; what are you asking for? – iDeveloper Nov 20 '17 at 07:54
  • I'm not a Swift developer, but I think this is a generic problem rather than a language-specific one. If you can show me the value that `urlContent` holds, I think a solution can be provided. – Vipin Kumar Nov 20 '17 at 07:56
  • When I print the value of `urlContent`, it gives this: https://justpaste.it/1dpot – iDeveloper Nov 20 '17 at 08:01
  • Rather than going after the titles highlighted in your image, which would need a JavaScript parser, can you not use an HTML tool to extract the same titles from the HTML? They all seem to be embedded lower down in an element with class `bord-192`. Does that combination only apply to titles on that page? – Damien_The_Unbeliever Nov 20 '17 at 08:15
  • @Damien_The_Unbeliever, can only an HTML extractor help me? When I check the format, it looks like - ` – iDeveloper Nov 20 '17 at 08:23
  • It looks like there are plenty of [options for HTML parsing in swift](https://stackoverflow.com/q/31080818/15498). I personally would prefer to use such a tool and write a nice, concise XPath expression rather than starting (as you seem to here) by manually pulling apart mixed HTML and JavaScript. – Damien_The_Unbeliever Nov 20 '17 at 11:10

2 Answers


If you are comfortable with regular expressions, use the following pattern:

/title = "(.*)"/g

This will match every title assignment in the page.

Modified:

In Swift, use it as shown below, where `contentOfPage` is the downloaded page content as a `String`:

let matched = matches(for: "title = \"(.*)\"", in: contentOfPage)

The `matches` function:

func matches(for regex: String, in text: String) -> [String] {

    do {
        let regex = try NSRegularExpression(pattern: regex)
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range, in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

This gives the following result:

[  
   "title = \"Al-Khelaifi voted Asian tennis Chairman\"",
   "title = \"Dimitrov downs Goffin for ATP Tour Finals crown\"",
   "title = \"Coach Lehmann calls for Australia to get behind Ashes selection\"",
   "title = \"Federer expects great things from returning trio\"",
   "title = \"Whateley wins Air Maroc League second stage\"",
   "title = \"Qatar-based Frijns finishes strong as Oliphant wins\"",
   "title = \"Sutton faces tough road ahead to get Chinese on track\"",
   "title = \"Qatar, Japan sign deal to import, export race horses\"",
   "title = \"Islanders deny Lightning comeback for third straight win\"",
   "title = \"Curry leads Golden Warriors fightback after Sixers blitz\"",
   "title = \"Fleetwood claims European Order of Merit as Rose falters\"",
   "title = \"Challengers win thriller against City Exchange\""
]
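
Each entry above is the whole match, including the `title = ` prefix and the quotes. If only the title text is wanted, one option is to read capture group 1 of each match instead. The following is a minimal sketch rather than part of the original answer: `captures(for:in:)` is a hypothetical helper name, `samplePage` is made-up input that imitates the assumed page format, and the pattern uses `[^"]*` instead of `.*` so that a match cannot run past the closing quote.

```swift
import Foundation

// Hypothetical helper: returns the text of capture group 1 for every match.
func captures(for pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return []
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match -> String? in
        // range(at: 1) is the first capture group; skip the match if it is absent.
        guard let range = Range(match.range(at: 1), in: text) else { return nil }
        return String(text[range])
    }
}

// Made-up sample input (an assumption about how the page source looks).
let samplePage = """
var title = "Al-Khelaifi voted Asian tennis Chairman";
var brief = "Some summary";
var title = "Federer expects great things from returning trio";
"""

let titles = captures(for: "title = \"([^\"]*)\"", in: samplePage)
print(titles)
```

With the real page, `samplePage` would be replaced by the decoded response string.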
Vipin Kumar

The response from the URL is a string, so you can simply parse that string into substrings.

For example, each title is the substring between `var title =` and `var brief =`.

String methods such as `split` or `components(separatedBy:)` can do this.
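
The splitting approach described above could look roughly like this. It is a sketch under assumptions: `titles(in:)` is a hypothetical helper, `samplePage` is made-up input, and the code assumes every `var title = "..."` is followed by a `var brief =` line and that the value is wrapped in double quotes and ends with a semicolon.

```swift
import Foundation

// Hypothetical helper: collects the text between "var title = " and "var brief =".
func titles(in page: String) -> [String] {
    // Every chunk after the first begins right after a "var title = " marker.
    let chunks = page.components(separatedBy: "var title = ").dropFirst()
    // Characters to strip from both ends: whitespace, quotes, and semicolons.
    let junk = CharacterSet.whitespacesAndNewlines.union(CharacterSet(charactersIn: "\";"))
    return chunks.compactMap { chunk -> String? in
        // Keep only the part before the next "var brief =" marker.
        guard let raw = chunk.components(separatedBy: "var brief =").first else {
            return nil
        }
        let title = raw.trimmingCharacters(in: junk)
        return title.isEmpty ? nil : title
    }
}

// Made-up sample input (an assumption about how the page source looks).
let samplePage = """
var title = "Dimitrov downs Goffin for ATP Tour Finals crown";
var brief = "Summary one";
var title = "Whateley wins Air Maroc League second stage";
var brief = "Summary two";
"""

let found = titles(in: samplePage)
print(found)
```

Note that trimming quotes this way would also strip a quote that legitimately ends a title, so a regular expression or an HTML parser may be more robust for real pages.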

William Hu