I'm trying to learn the basics of Python, and I came across this exercise in a book on web scraping. I tried to replicate the code, but I'm getting this error: "urllib.error.HTTPError: HTTP Error 406: Not Acceptable".
Is there anything wrong with the code?
I'm using Anaconda/VS Code on Windows 10.
Here's my code:
from urllib import request
from bs4 import BeautifulSoup
page_url = 'https://alansimpson.me/python/scrape_sample.html'
rawpage = request.urlopen(page_url)
soup = BeautifulSoup(rawpage, 'html5lib')
content = soup.article
links_list = []
for link in content.find_all('a'):
    try:
        url = link.get('href')
        img = link.img.get('src')
        text = link.span.text
        links_list.append({'url': url, 'img': img, 'text': text})
    except AttributeError:
        pass
And this is the error I'm getting:
Traceback (most recent call last):
  File "c:\Users\srika\OneDrive\AIO_Python\scraper.py", line 6, in <module>
    rawpage = request.urlopen(page_url)
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 406: Not Acceptable
I tried to install 'urllib', but it is already installed. I also tried catching the exception 'urllib.error.HTTPError', but neither of these worked.
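To show what catching the exception looks like, here is a minimal sketch (not my exact code). The helper names `build_request`, `describe_http_error`, and `fetch_page` are hypothetical and made up for illustration. The optional `user_agent` argument is an untested assumption: urllib sends a default `Python-urllib/x.y` User-Agent, and some servers may respond differently to it, but nothing in the traceback confirms that this is the cause here.

```python
from urllib import error, request


def build_request(url, user_agent=None):
    """Return a Request, optionally carrying a custom User-Agent header."""
    # Assumption: a browser-like User-Agent might change the server's response.
    headers = {"User-Agent": user_agent} if user_agent else {}
    return request.Request(url, headers=headers)


def describe_http_error(exc):
    """Summarise an HTTPError as 'HTTP <code>: <reason>'."""
    return f"HTTP {exc.code}: {exc.reason}"


def fetch_page(url, user_agent=None):
    """Return the response body as bytes, or None on an HTTP error."""
    try:
        with request.urlopen(build_request(url, user_agent)) as response:
            return response.read()
    except error.HTTPError as exc:
        # Catching the exception only reports the failure; it does not
        # change what the server sends back.
        print(describe_http_error(exc))
        return None
```

With this, `fetch_page(page_url)` would take the place of `request.urlopen(page_url)`, and the returned bytes could be passed to `BeautifulSoup` as before. Passing something like `user_agent='Mozilla/5.0'` is only an experiment to see whether the status code changes.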
How do I solve this? Please help!