Crawling and Scraping iTunes App Store

Question

I noticed that iTunes preview pages can be crawled and scraped over the http:// protocol. However, many of the links try to open in iTunes rather than in the browser. For example, when you go to the iBooks page, it immediately tries to open a URL with the itms:// protocol.

Are there any other methods of crawling the App Store or is this the only way?

Can the itms:// protocol links themselves be crawled somehow?

score 23 · Answer 1 · answered Oct 01 '10 at 23:20

I would take a good look at the iTunes Search API and the iTunes Enterprise Partner Feed.

Search API - http://www.apple.com/itunes/affiliates/resources/blog/introduction---search-api.html
Enterprise Partner Feed - http://www.apple.com/itunes/affiliates/resources/documentation/itunes-enterprise-partner-feed.html

You might get most or all of the information you need in JSON format.
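
As a rough illustration of what a Search API request could look like, here is a minimal Python sketch. The endpoint and the `term`, `entity`, `country` and `limit` parameters are what I understand the Search API to accept, and the 200 cap matches the limit mentioned in the comments; the helper names and the sample payload are made up for this example, so check them against the current documentation before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumed Search API endpoint; verify against Apple's documentation.
SEARCH_ENDPOINT = "https://itunes.apple.com/search"
MAX_LIMIT = 200  # the API returns at most 200 results per request


def build_search_url(term, entity="software", country="us", limit=50):
    """Build a search URL (hypothetical helper, not part of any Apple SDK)."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError("limit must be between 1 and %d" % MAX_LIMIT)
    query = urlencode(
        {"term": term, "entity": entity, "country": country, "limit": limit}
    )
    return SEARCH_ENDPOINT + "?" + query


def extract_results(payload):
    """Return the list under "results" from a JSON response body."""
    return json.loads(payload).get("results", [])


if __name__ == "__main__":
    print(build_search_url("angry birds", limit=10))
    # Made-up payload shaped like a response, used here instead of a live call.
    sample = '{"resultCount": 1, "results": [{"trackId": 123456789, "trackName": "Example App"}]}'
    for item in extract_results(sample):
        print(item["trackId"], item["trackName"])
```

To fetch for real, pass the URL to something like `urllib.request.urlopen` and feed the response body to `extract_results`.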

If you can't get the information you need with the API, I would be interested to hear what it is :)

philipp

Does the Search API only allow searching for songs? – Saqib Saud Nov 06 '12 at 10:37

No. The Search API lets you search all the content in the iTunes Store. There are examples of this. – philipp Nov 15 '12 at 22:20

But it only returns a maximum of 200 results. – Iulian Onofrei Jul 10 '14 at 10:19

DiscDev · Answer 2 · 2014-09-30T21:28:02.837

5

As philipp mentioned, the iTunes Search API is an easy way to retrieve data about your App Store listings in JSON format.

Simply query the lookup endpoint with your app ID (you can find the app ID by viewing your app's web listing at itunes.apple.com), e.g.:

http://itunes.apple.com/lookup?id=INSERT_YOUR_APP_ID_HERE

Then parse the resulting JSON to your heart's content.
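
A minimal Python sketch of that lookup-and-parse step might look like the following. It assumes the response is a JSON object with a `results` array, which is how I understand the lookup endpoint to respond; the helper names, the app ID, and the sample payload fields are placeholders invented for illustration.

```python
import json
from urllib.parse import urlencode

# Same lookup URL as above, written with https (an assumption on my part).
LOOKUP_ENDPOINT = "https://itunes.apple.com/lookup"


def build_lookup_url(app_id):
    """Build the lookup URL for one app ID (hypothetical helper)."""
    return LOOKUP_ENDPOINT + "?" + urlencode({"id": app_id})


def first_result(payload):
    """Return the first entry of "results", or None if the list is empty."""
    results = json.loads(payload).get("results", [])
    return results[0] if results else None


if __name__ == "__main__":
    print(build_lookup_url(123456789))  # placeholder ID, not a real app
    # Made-up payload standing in for a live response.
    sample = '{"resultCount": 1, "results": [{"trackId": 123456789, "trackName": "Example App", "version": "1.0"}]}'
    app = first_result(sample)
    if app is not None:
        print(app["trackName"], app["version"])
```

An unknown ID should come back with an empty `results` list, which is why `first_result` returns `None` in that case rather than raising.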

answered Dec 06 '13 at 16:02

DiscDev

score 4 · Answer 3 · answered Jul 08 '10 at 15:04

The only difference between http:// links and itms:// links is that you need to set your User-Agent to an iTunes user-agent, and depending on the version you may also have to include a verification code based on some not-so-secret algorithm.

For example, this is the code (in PHP) for iTunes 9:

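# Some magic. Generates a seed we use for X-Apple-Validation. Adapted from LWP::UserAgent::iTMS_Client.
function comp_seed($url, $user_agent) {
    # Two random 16-bit values as four hex digits each. rand() is inclusive, so the upper bound is 0xFFFF.
    $random  = sprintf( "%04X%04X", rand(0,0xFFFF), rand(0,0xFFFF) );
    # Fixed key (raw bytes) that gets mixed into the digest.
    $static  = base64_decode("ROkjAaKid4EUF5kGtTNn3Q==");
    # Trailing path segment of the URL (with its leading slash), or '?' if the pattern does not match.
    $url_end = ( preg_match("|.*/.*/.*(/.+)$|",$url,$matches)) ? $matches[1] : '?';
    $digest  = md5(join("",array($url_end, $user_agent, $static, $random)) );
    return $random . '-' . strtoupper($digest);
}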
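
For anyone not using PHP, here is a rough Python port of the same function, plus a sketch of how the result might be attached to a request. The `X-Apple-Validation` header name comes from the comment in the code above; the user-agent string, the `to_http` and `build_headers` helpers, and the idea of simply swapping the `itms://` scheme for `http://` are my assumptions for illustration, not something confirmed against a live server.

```python
import base64
import hashlib
import random
import re

# Fixed key from the PHP version above, decoded to raw bytes.
STATIC = base64.b64decode("ROkjAaKid4EUF5kGtTNn3Q==")


def url_tail(url):
    """Trailing path segment (with leading slash), or '?' if no match."""
    match = re.search(r".*/.*/.*(/.+)$", url)
    return match.group(1) if match else "?"


def comp_seed(url, user_agent):
    """Python port of the PHP comp_seed function above."""
    rand_part = "%04X%04X" % (random.randrange(0x10000), random.randrange(0x10000))
    data = url_tail(url).encode() + user_agent.encode() + STATIC + rand_part.encode()
    return rand_part + "-" + hashlib.md5(data).hexdigest().upper()


def to_http(url):
    """Swap itms:// for http:// (assumption: only the scheme differs)."""
    prefix = "itms://"
    return "http://" + url[len(prefix):] if url.startswith(prefix) else url


def build_headers(url, user_agent):
    """Headers a request would carry; send them with any HTTP client."""
    return {
        "User-Agent": user_agent,
        "X-Apple-Validation": comp_seed(url, user_agent),
    }


if __name__ == "__main__":
    # Hypothetical user-agent string and URL, shown only as placeholders.
    ua = "iTunes/9.0 (Macintosh; U; Intel Mac OS X 10_6)"
    url = to_http("itms://itunes.apple.com/us/app/id123456789")
    print(url)
    print(build_headers(url, ua))
```

The seed has a random part, so the header value changes on every call.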

However, if you are only scraping, iTunes preview should work for your purposes; the iBooks page you linked to has more than enough information to scrape.

score 1 · Answer 4 · answered Apr 21 '12 at 06:30

We tried scraping it ourselves about a year ago and it just became too much of a headache. Philipp's suggestion is a good one, as the enterprise feed from Apple (you need to apply for it with a legitimate use) has a good amount of the useful info that you might be after when scraping.

There are a few companies that offer data as a service too. abto and AppMonsta are two I heard of when I was looking. I can't seem to find abto anymore, but http://appmonsta.com seems to still be around. The Search API looks OK (I never experimented with it) but limited.

Good luck!
