I am building a simple media tracker in Python for a small organization that I work for (I tried this in a previous question but have restarted from scratch). It works by searching keywords on the News feature of Google and returning all of the links that appear on the results page. My problem is that I am trying to narrow the search to articles between specific dates by changing the URL that is searched. The URL my code builds does show articles matching my keywords within the requested dates when I open it, but when I scrape that same URL and build a list of the links, the links are not within the requested dates. Why might this be happening, and how can I fix it?
Below is sample code. When I click on the link "https://www.google.com/search?q=Center+for+Community+Alternatives&tbs=cdr:1,cd_min:5/2/2019,cd_max:5/10/2019&tbm=nws" it gives me articles between my dates; however, the generated list does not reflect the articles that appear on that page.
import re
import requests
from bs4 import BeautifulSoup
search = 'Center+for+Community+Alternatives'
start_date = '5/2/2019'
end_date = '5/10/2019'
# The link to be searched initially
link4 = 'https://www.google.com/search?q=' + search + '&tbs=cdr:1,cd_min:' + start_date + ',cd_max:' + end_date + '&tbm=nws'
# Searching the link
page = requests.get(link4)
# Gathering the page content from the link
soup = BeautifulSoup(page.content, "html.parser")
# Finding all links
links = soup.findAll("a")
# Fixing up the links and putting them into a list
finished_list = []
for link in soup.find_all("a", href=re.compile(r"(?<=/url\?q=)(htt.*://.*)")):
    # Strip Google's "/url?q=" redirect prefix from the href
    target = link["href"].replace("/url?q=", "")
    # Keep only the first URL if several are joined by ":"
    target = re.split(":(?=http)", target)[0]
    # Drop the trailing "&sa=..." tracking parameters
    finished_list.append(target.split("&sa", 1)[0])
# The list of links
print(finished_list)
# The original link searched
print(link4)
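For reference, here is a minimal sketch of how the same search URL could be built with `urllib.parse.urlencode` instead of string concatenation, so that the spaces, slashes, colons, and commas in the query and in the `tbs` date filter are percent-encoded consistently. The helper name `build_news_url` is hypothetical and only for illustration; the parameter names (`q`, `tbs`, `tbm`) are the ones already present in my link above. I do not know whether the encoding is related to the problem, so this only shows how the URL is put together.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_news_url(query, start_date, end_date):
    """Build a Google search URL restricted to a date range (hypothetical helper)."""
    params = {
        "q": query,
        # Custom date range filter, same format as in the hand-built link
        "tbs": f"cdr:1,cd_min:{start_date},cd_max:{end_date}",
        # Restrict the search to the News tab
        "tbm": "nws",
    }
    return "https://www.google.com/search?" + urlencode(params)


url = build_news_url("Center for Community Alternatives", "5/2/2019", "5/10/2019")
print(url)

# Decode the query string again to confirm the date filter round-trips unchanged
decoded = parse_qs(urlsplit(url).query)
print(decoded["tbs"][0])
```

Decoding the query string with `parse_qs` is just a sanity check that the date range sent to Google is the one I intended.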
Note: I have tried using the googlenews module; however, it does NOT give me the search results that I am looking for, because the News feature of Google search is different from Google News.