I am using instaloader to scrape Instagram posts as part of a study project.
To avoid getting blocked by Instagram, I use time.sleep to pause for between 1 and 20 seconds after each download. This works well.
I don't want to go through all posts each time I scrape, so I want the loop to run only 5 times, which should give me 5 posts. However, I can't get it to do that.
I wrote the following function to try to scrape the profile and return the first 5 posts:
# Imports and Instaloader instance
import time
from random import randint

import instaloader
from instaloader import Profile

L = instaloader.Instaloader()

# Random sleep time in seconds (drawn once, when the script starts)
vent = randint(1, 20)

# Function:
def get2posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    POSTS = profile.get_posts()
    for post in POSTS:
        for i in range(2):
            L.download_post(post, profile_name)
            time.sleep(vent)
        break
    print('scrape done')
This code returns the same post several times, though, and I can't figure out how to get it to return the first 5 posts of an account.
The working function, which harvests all posts of a profile, is:
# The original function (without range)
def get_posts(profile_name):
    profile = Profile.from_username(L.context, profile_name)
    POSTS = profile.get_posts()
    for post in POSTS:
        L.download_post(post, profile_name)
        time.sleep(vent)
    print('I am done')
Hope you can help :)
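For what it's worth, the repetition seems to come from the inner range loop: it downloads the same post on every pass, and the break then leaves the outer loop after the first post. One possible way around this, sketched here under the assumption that profile.get_posts() behaves like an ordinary Python iterator, is to cap the iteration with itertools.islice. The helper name download_first_posts and its parameters (limit, min_wait, max_wait, sleep) are hypothetical and chosen only for illustration. The sketch also draws a fresh random delay for every post rather than reusing one value.

```python
import itertools
import random
import time


def download_first_posts(posts, download, limit=5,
                         min_wait=1, max_wait=20, sleep=time.sleep):
    """Download at most `limit` items from `posts`, pausing after each one.

    `posts` is any iterable of posts, `download` is a callable that takes
    one post. Both are passed in so the helper does not depend on a
    specific library. Returns the number of posts handled.
    """
    count = 0
    # islice stops after `limit` items, or earlier if `posts` runs out.
    for post in itertools.islice(posts, limit):
        download(post)
        count += 1
        # Draw a new random delay for every post.
        sleep(random.randint(min_wait, max_wait))
    return count


# Hypothetical usage with the objects from the question (needs network access):
# profile = Profile.from_username(L.context, profile_name)
# download_first_posts(
#     profile.get_posts(),
#     lambda post: L.download_post(post, profile_name),
#     limit=5,
# )
```

Passing the download and sleep callables in as arguments keeps the loop logic separate from instaloader, so it can be tried out with plain lists before touching Instagram.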