I'm trying to retrieve pageview data for a Wikipedia page, but the request fails for this one page while it works for others. I get this error:
File "<unknown>", line 1
article =='L'amica_geniale_ (serie_di_romanzi )'
^
SyntaxError: invalid syntax
But there is no whitespace in the title. The page is: https://it.wikipedia.org/wiki/L%27amica_geniale_(serie_di_romanzi)
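The traceback reads like a Python expression that was assembled by pasting the title between single quotes, so the apostrophe in the title would end the string early. The sketch below reproduces that kind of failure with the standard library only. It assumes the expression is built with string formatting; that code is not shown in this post, and `parses` is a hypothetical helper used purely for illustration.

```python
import ast

title = "L'amica_geniale_(serie_di_romanzi)"

def parses(expr):
    """Return True if expr is a syntactically valid Python expression."""
    try:
        ast.parse(expr, mode="eval")
        return True
    except SyntaxError:
        return False

# Wrapping the title in single quotes: the apostrophe closes the string early
bad = "article == '{}'".format(title)

# Using repr() lets Python choose quotes that do not clash with the apostrophe
good = "article == {!r}".format(title)

print(bad, parses(bad))
print(good, parses(good))
```

If the expression ends up in something like `DataFrame.query`, referencing a local variable (`article == @title`) avoids quoting the title at all.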
The code is:
import pandas as pd
import requests

start_date = "2005/01/01"
headers = {
    'User-Agent': 'Mozilla/5.0'
}

def wikimedia_request(page_name, start_date, end_date=None):
    # The API expects dates as YYYYMMDD, so drop the slashes
    sdate = ''.join(start_date.split("/"))
    # Assumption: default the end date to today when none is given
    if end_date is None:
        end_date = pd.Timestamp.today().strftime("%Y/%m/%d")
    edate = ''.join(end_date.split("/"))
    r = requests.get(
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/all-agents/{}/daily/{}/{}".format(page_name, sdate, edate),
        headers=headers
    )
    r.raise_for_status()  # raises exception when not a 2xx response
    result = r.json()
    df = pd.DataFrame(result['items'])
    # Timestamps come back as YYYYMMDDHH; strip the trailing hour digits
    df['timestamp'] = [i[:-2] for i in df.timestamp]
    df['timestamp'] = pd.to_datetime(df['timestamp'])
    df.set_index('timestamp', inplace=True)
    return df[['article', 'views']]

df = wikimedia_request("Random", start_date)
names = ["L'amica geniale"]
dfs = pd.concat([wikimedia_request(x, start_date) for x in names])
The code works for every page except this one. I suspect it has something to do with the apostrophe in the title.
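One way to test the apostrophe theory is to percent-encode the title before it goes into the URL. This is a minimal sketch, not a confirmed fix: `encode_title` and `build_url` are hypothetical helpers, and the project is made a parameter only because the page linked above lives on it.wikipedia.org, while the request above targets en.wikipedia.org. Whether either detail causes the failure is an assumption.

```python
from urllib.parse import quote

BASE = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        "{project}/all-access/all-agents/{title}/daily/{start}/{end}")

def encode_title(title):
    """Turn a human-readable page title into a single URL path segment."""
    # Wikipedia titles use underscores where the display name has spaces
    title = title.replace(" ", "_")
    # safe="" also escapes "/", so the title cannot split into two segments
    return quote(title, safe="")

def build_url(project, title, start, end):
    """Assemble the per-article pageviews URL for an encoded title."""
    return BASE.format(project=project, title=encode_title(title),
                       start=start, end=end)

print(encode_title("L'amica geniale (serie di romanzi)"))
# prints: L%27amica_geniale_%28serie_di_romanzi%29
print(build_url("it.wikipedia.org", "L'amica geniale (serie di romanzi)",
                "20200101", "20200131"))
```

The resulting string can be passed to `requests.get` in place of the formatted URL inside `wikimedia_request`.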