Scraping a certain number of posts from Instagram

Question

I'm using the method from the post linked below to scrape Instagram profiles. Can I change the number of images I retrieve? In the JSON response I saw the 'has_next_page' parameter, but I'm not sure how to use it. Thanks in advance. Post link: What is the new instagram json endpoint?

Used code:

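import json
import re
import requests
from bs4 import BeautifulSoup
r = requests.get('https://www.instagram.com/' + profile + '/')
soup = BeautifulSoup(r.content, 'html.parser')
# Find the inline script that assigns window._sharedData
scripts = soup.find_all('script', type="text/javascript",
                        text=re.compile('window._sharedData'))
# Strip the assignment prefix and the trailing semicolon, leaving plain JSON
stringified_json = scripts[0].get_text().replace('window._sharedData = ', '')[:-1]
data = json.loads(stringified_json)['entry_data']['ProfilePage'][0]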

Instagram have an API, you should use it instead of trying to scrape their website. — Daniel Roseman, Feb 04 '19 at 14:30

score 0 · Answer 1 · answered Feb 04 '19 at 14:35

You can find the Instagram API here: https://www.instagram.com/developer/ I think the documentation is pretty neat. You just have to register to get an access token.

Kata

Thanks, but the API has limitations and I have only a few days to get the dataset (I need it for a project). – Francesco Feb 04 '19 at 15:00

score 0 · Answer 2 · edited Dec 12 '21 at 16:40

The problem is that your code scrapes the profile page as it is initially served, so you only get the images that have already been loaded. That's why you can't simply ask for a larger number of images.
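To see what the 'has_next_page' flag from the question tells you, here is a minimal sketch that reads it from the parsed profile data. The key path (graphql, user, edge_owner_to_timeline_media, page_info, edges, display_url) is an assumption based on how the embedded JSON looked around the time of the question; it is not confirmed by the question itself and may have changed. The function name and the sample dictionary are made up for illustration.

```python
def summarize_timeline(profile_page):
    """Return (image_urls, has_next_page, end_cursor) for a ProfilePage dict.

    ASSUMPTION: the key names below reflect the layout of the embedded
    JSON as seen around early 2019; adjust them to what you actually get.
    """
    media = profile_page["graphql"]["user"]["edge_owner_to_timeline_media"]
    page_info = media.get("page_info", {})
    # Only the posts already embedded in the page are listed under "edges"
    urls = [edge["node"]["display_url"] for edge in media.get("edges", [])]
    return urls, page_info.get("has_next_page", False), page_info.get("end_cursor")


# Hypothetical, hand-written sample standing in for the real `data` dict
sample = {
    "graphql": {
        "user": {
            "edge_owner_to_timeline_media": {
                "page_info": {"has_next_page": True, "end_cursor": "abc"},
                "edges": [
                    {"node": {"display_url": "https://example.com/1.jpg"}},
                    {"node": {"display_url": "https://example.com/2.jpg"}},
                ],
            }
        }
    }
}

urls, has_next, cursor = summarize_timeline(sample)
print(len(urls), has_next, cursor)
```

If has_next_page is true, the profile has more posts than the ones embedded in the page; the flag only reports that they exist and does not fetch them.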

I'd recommend one of the following:

1. Use Instagram's API, which comes with ready-made methods to do exactly what you seem to want to achieve (don't reinvent the wheel).

2. If you would rather do most of the work yourself (say, as an exercise), I'd recommend Selenium, a browser automation tool. BeautifulSoup, which your code uses, is great for extracting data from HTML, but here you also need to scroll the page so that more pictures get loaded. That way you can get as many pictures as you like.
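The scrolling idea from option 2 could look roughly like the sketch below. It is not a tested Instagram scraper: the function name, the "article img" selector, the pause length and the round limit are all assumptions, and Instagram's markup and login prompts may require changes. The function only needs a Selenium-style driver object that offers execute_script and find_elements.

```python
import time


def scroll_and_collect(driver, wanted, pause=2.0, max_rounds=50,
                       selector="article img"):
    """Collect up to `wanted` image URLs, scrolling to load more.

    ASSUMPTIONS: `selector` is a guess at the page markup, and `pause`
    is a guess at how long new posts take to load.
    """
    seen = []  # URLs in first-seen order, without duplicates
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # "css selector" is the string value behind Selenium's By.CSS_SELECTOR
        for img in driver.find_elements("css selector", selector):
            src = img.get_attribute("src")
            if src and src not in seen:
                seen.append(src)
        if len(seen) >= wanted:
            break
        # Jump to the bottom so the page requests the next batch of posts
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # the page did not grow, so nothing more was loaded
        last_height = new_height
    return seen[:wanted]
```

To try it, create a driver (for example with webdriver.Chrome() from the selenium package), open the profile URL with driver.get(...), call scroll_and_collect(driver, wanted=50), and call driver.quit() when you are done.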

If you need an example, you can check out something similar I wrote for Twitter here
