What am I trying to do? I am using SerpApi's Google Scholar Author API to fetch all articles for an author, and I expected every article title to be stored in the array $title. Instead, the result is limited to the first 100 articles.

Any help would be really appreciated. Code:

title.py

from serpapi import GoogleSearch
import sys

id = sys.argv[1]
key = sys.argv[2]
params = {
  "engine": "google_scholar_author",
  "author_id": id,
  "api_key": key,
  "sort":"pubdate",
  "num":10000
}

search = GoogleSearch(params)
results = search.get_dict()
articles = results["articles"]

res = [ sub['title'] for sub in articles ]

print(res)

Controller

$title = shell_exec("python publicationScripts/title.py $gscID $key");
dd($title);

Output

The output shows only 100 articles, but the author has more than 200.
  • Hey Ana :) This is expected as you haven't applied pagination. In your example, you're iterating over `articles` and extracting `title`. Here's how you would do pagination in Python: https://serpapi.com/blog/scrape-all-google-scholar-profile-author-results-with-python-and-serpapi/#author_article_results. Additionally, [`num`](https://serpapi.com/google-scholar-author-api#api-parameters-pagination-num) has a maximum of 100 results per page. It cannot display more than that. It's a Google restriction. – Dmitriy Zub Jan 10 '23 at 13:38

1 Answer

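This is expected, because the script does not apply pagination. It makes a single request, iterates over the articles in that one response, and extracts each title. Here's how you would do pagination in Python: https://serpapi.com/blog/scrape-all-google-scholar-profile-author-results-with-python-and-serpapi/#author_article_results

Additionally, num has a maximum of 100 results per page, so "num": 10000 cannot return more than that. It's a Google restriction.

Thank you to Dmitriy for the clarification. You can get more than 100 results by using pagination.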
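As a rough sketch of what pagination could look like for title.py: the loop below requests one page at a time and advances an offset until a page comes back short or empty. It assumes the google_scholar_author engine accepts a start offset alongside num (check the SerpApi docs for your version). The helper names collect_titles and make_serpapi_fetcher are made up for this example and are not part of the serpapi package.

```python
def collect_titles(fetch_page, page_size=100):
    """Collect article titles page by page.

    fetch_page(start, num) must return a dict shaped like the API
    response, i.e. with an optional "articles" list of dicts.
    """
    titles = []
    start = 0
    while True:
        results = fetch_page(start, page_size)
        articles = results.get("articles", [])
        titles.extend(article["title"] for article in articles)
        # A short or empty page means there is nothing left to fetch.
        if len(articles) < page_size:
            break
        start += page_size
    return titles


def make_serpapi_fetcher(author_id, api_key):
    """Build a fetch_page function backed by SerpApi (hypothetical helper)."""
    # Imported here so collect_titles can be tested without the package.
    from serpapi import GoogleSearch

    def fetch_page(start, num):
        params = {
            "engine": "google_scholar_author",
            "author_id": author_id,
            "api_key": api_key,
            "sort": "pubdate",
            "start": start,  # assumed offset parameter, see lead-in
            "num": num,      # at most 100 per page
        }
        return GoogleSearch(params).get_dict()

    return fetch_page


# Possible use inside title.py (requires `import sys`):
#   fetch = make_serpapi_fetcher(sys.argv[1], sys.argv[2])
#   print(collect_titles(fetch))
```

Each page is a separate API request, so an author with a little over 200 articles would take three requests here. The linked blog post follows the next link in serpapi_pagination instead of counting results, which is an equally reasonable stop condition.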

Mark