I need to parse the HTML of a website.

But the website's content is only loaded when JavaScript is enabled.

So when I NSLog the downloaded string, the HTML is not what I want.

The wrong HTML

</script><noscript><div class="error-container"><div class="wrapper"><header><a href="/"><img src="/resource/img/logo.png"><div class="title">Adobe Color CC</div></a></header><section><ul id="no-js" class="wrap"><li><h1>JavaScript Disabled</h1><p>Adobe Color CC requires JavaScript in order to load properly. Please enable JavaScript in your browser and reload the page.</p></li><li><h1>JavaScript est désactivé</h1><p>Pour pouvoir se charger correctement, Adobe Color CC requiert JavaScript. Veuillez activer JavaScript dans votre navigateur et recharger la page.</p></li><li><h1>JavaScript deaktiviert</h1><p>JavaScript ist erforderlich, damit Adobe Color CC ordnungsgemäß geladen wird. Aktivieren Sie JavaScript im Browser und laden Sie die Seite neu.</p></li><li><h1>JavaScript が無効です</h1><p>Adobe Color CC で正しく読み込みを行うには、JavaScript が必要です。ご使用のブラウザーで JavaScript を有効にして、ページを再読み込みしてください。

The method I use to get the content

NSURL *htmlUrl = [NSURL URLWithString:@"https://color.adobe.com/explore/newest/?time=all"];
NSError *error = nil;
// Synchronously download the page and decode it as UTF-8.
NSString *htmlString = [NSString stringWithContentsOfURL:htmlUrl encoding:NSUTF8StringEncoding error:&error];
NSLog(@"%@ (error: %@)", htmlString, error);

What should I do?

Cœur
Zigii Wong

1 Answer

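stringWithContentsOfURL: is not designed for this. It is meant for fetching static data, such as JSON or XML. It returns the response body exactly as the server sent it and never runs the page's JavaScript, which is why you only get the <noscript> fallback.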

I would suggest using other networking classes instead, which give you more control and let you make the request asynchronously. The answers to "Making stringWithContentsOfURL asynchronous - Is it safe?" should provide more detail.
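As a rough illustration of the asynchronous approach, here is a minimal sketch using NSURLSession. The helper name FetchHTML and its completion block are hypothetical (they are not from the question), and the sketch assumes the page is UTF-8, as the question's code does. Note that this still returns the server's raw HTML. It does not execute JavaScript, so for this particular site it would log the same <noscript> markup.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): downloads the raw HTML at
// `url` without blocking the calling thread and passes the decoded string, or
// the error, to `completion`.
static void FetchHTML(NSURL *url, void (^completion)(NSString *html, NSError *error)) {
    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithURL:url
                                    completionHandler:^(NSData *data,
                                                        NSURLResponse *response,
                                                        NSError *error) {
            if (error != nil) {
                completion(nil, error);
                return;
            }
            // Assumption: the body is UTF-8. `html` is nil if decoding fails.
            NSString *html = [[NSString alloc] initWithData:data
                                                   encoding:NSUTF8StringEncoding];
            completion(html, nil);
        }];
    // Data tasks are created suspended and must be started explicitly.
    [task resume];
}
```

It could be called as FetchHTML(htmlUrl, ^(NSString *html, NSError *error) { NSLog(@"%@", html); });. With the shared session the completion block runs on a background queue, so any UI update would need to be dispatched back to the main queue.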

Rick