I am trying to get the number of followers of an Instagram page programmatically. Either the exact number (e.g. 521356) or an abbreviated one (521K) would do.

I tried WebClient.DownloadData to download the entire page, but I couldn't find the number of followers in the result:

System.Net.WebClient wc = new System.Net.WebClient();
byte[] raw = wc.DownloadData("https://www.instagram.com/gallery.delband/");

string webData = System.Text.Encoding.UTF8.GetString(raw);
textBox1.Text = webData;

I would like to be able to get the number of followers, but I can't find that data using the web browser method.

2 Answers

WebClient just makes a simple HTTP request, which returns very little for many sites these days. You basically get a page that tells the browser "Great, now fetch that JavaScript bundle over there to get started". To get the information you are after, you'll need something more advanced, such as CefSharp, that actually loads the page and executes its scripts. Preferably use CefSharp.OffScreen so that no browser window is shown. Then you can parse the information you want out of the rendered page.

Karl-Johan Sjögren

The problem is that you cannot get the Instagram page as you see it in the browser without executing JavaScript, and System.Net.WebClient does not execute JavaScript.

But if you analyse the HTML source of the page, you'll see that the follower count is included in a <meta> tag with name="description":

<meta content="88.5k Followers, 1,412 Following, 785 Posts - See Instagram photos and videos from گالری نقره عیار ۹۲۵‌‌ ترکیه (@gallery.delband)" name="description" />

To grab this information from the source, use a regex:

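// Requires: using System.Text.RegularExpressions;
var pattern = @"<meta content=""([0-9k KMm.,]+) Followers, .*"" name=""description"" />";
var match = Regex.Match(webData, pattern);
// Groups[1] is the parenthesized part, e.g. "88.5k"; fall back to an empty string if nothing matched.
var followers = match.Success ? match.Groups[1].Value : string.Empty;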

The pattern means: find a string that starts with <meta content=", followed by a run of the characters 0-9, k, K, M, m, ',', '.' or ' ' (the actual follower count), followed by the text " Followers, ", then any text, and ending with " name="description" />. Because the variable part is parenthesized, the regex engine returns its value as capture group 1.
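
If you need a number rather than the text "88.5k", the captured value can be converted afterwards. The following is only a minimal sketch. The class FollowerCount and its methods ExtractToken and ParseCount are hypothetical names made up for this example. It assumes the description tag keeps the English "N Followers" wording shown above, that ',' is a thousands separator and '.' the decimal point, and it relaxes the pattern so that the attribute order inside the <meta> tag does not matter. Instagram may change this markup, or return a login page instead, at any time.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class FollowerCount
{
    // Finds a <meta> tag that has name="description" (in either attribute order)
    // and captures the token directly in front of " Followers" in its content.
    private static readonly Regex DescriptionMeta = new Regex(
        @"<meta\b(?=[^>]*\bname=""description"")[^>]*\bcontent=""([0-9.,]+\s?[kKmM]?) Followers");

    // Returns the raw token (e.g. "88.5k"), or null if the tag was not found.
    public static string ExtractToken(string html)
    {
        Match match = DescriptionMeta.Match(html ?? string.Empty);
        return match.Success ? match.Groups[1].Value : null;
    }

    // Converts "521356", "1,412", "521K" or "88.5k" to a number.
    // Abbreviated values are approximate by nature. Returns null if the token cannot be parsed.
    public static long? ParseCount(string token)
    {
        if (string.IsNullOrWhiteSpace(token)) return null;

        string text = token.Trim();
        long multiplier = 1;
        char suffix = char.ToLowerInvariant(text[text.Length - 1]);
        if (suffix == 'k') multiplier = 1000;
        else if (suffix == 'm') multiplier = 1000000;
        if (multiplier != 1) text = text.Substring(0, text.Length - 1).Trim();

        // Assumption: English formatting, so ',' only groups thousands.
        text = text.Replace(",", "");

        decimal value;
        if (!decimal.TryParse(text, NumberStyles.AllowDecimalPoint,
                              CultureInfo.InvariantCulture, out value))
            return null;

        return (long)Math.Round(value * multiplier);
    }
}
```

With the webData string from the question, FollowerCount.ParseCount(FollowerCount.ExtractToken(webData)) would give 88500 for the tag shown above, or null when the tag is missing. ParseCount accepts null, so the two calls can be chained directly.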

Jann Westermann