I am stuck here: this code gives me HTTPError: Forbidden on line 4. When I try it with other websites it works, but with this website it doesn't. Why?

from bs4 import BeautifulSoup as bs
from urllib.request import urlopen
import urllib.request

sauce=urllib.request.urlopen("https://socialblade.com/youtube/top/50").read()
soup=bs(sauce,'lxml')  # parse the downloaded HTML with BeautifulSoup, not urlopen
print(soup)
  • See [How to Web Scrape using Beautiful Soup in Python without running into HTTP error 403](https://medium.com/@raiyanquaium/how-to-web-scrape-using-beautiful-soup-in-python-without-running-into-http-error-403-554875e5abed) and [HTTPError: HTTP Error 403: Forbidden](https://stackoverflow.com/questions/13055208/httperror-http-error-403-forbidden) – Shivam Jha Oct 07 '20 at 16:16

1 Answer

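Specify a User-Agent HTTP header to get a correct response from the server. By default urllib identifies itself as Python-urllib, and some sites answer such requests with 403 Forbidden. For example: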

import urllib.request
from bs4 import BeautifulSoup as bs

url = "https://socialblade.com/youtube/top/50"
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}

req = urllib.request.Request(url, headers=headers)
response = urllib.request.urlopen(req)
soup = bs(response.read(), 'html.parser')
print(soup.prettify())

Prints:

<!DOCTYPE html>
<head>
 <title>
  Top 50 YouTubers sorted by SB Score - Socialblade YouTube Stats | YouTube Statistics
 </title>

...
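
If the server might still refuse the request, it can help to catch the error instead of letting the script crash. The following is only a minimal sketch of that idea, not part of the original answer: the helper names `build_request` and `fetch_html` and the 10-second timeout are assumptions made for illustration, and the User-Agent string is the one used above.

```python
import urllib.error
import urllib.request

# Same browser-like User-Agent as in the answer above.
USER_AGENT = ("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) "
              "Gecko/20100101 Firefox/80.0")


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: wrap the URL in a Request carrying a User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the page body as bytes, or None on an HTTP error."""
    req = build_request(url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # A 403 ends up here; report the status instead of raising.
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None
```

The returned bytes can then be passed to `bs(html, 'html.parser')` exactly as in the answer; a `None` result means the server rejected the request even with the header set.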
Andrej Kesely