Hi, I am trying to scrape the website https://www.dawn.com/pakistan, but BeautifulSoup's find() and find_all() methods return empty lists. I have tried the html.parser, html5lib, and lxml parsers, still with no luck. The classes I am trying to scrape are present in the page source as well as in the soup object, but the search doesn't seem to work. Any help will be appreciated, thanks!
Code:
from bs4 import BeautifulSoup
import lxml
import html5lib
import urllib.request
url1 = 'https://www.dawn.com/pakistan'
req = urllib.request.Request(
    url1,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)
url1UrlContent = urllib.request.urlopen(req).read()
soup1 = BeautifulSoup(url1UrlContent, 'lxml')
url1Section1 = soup1.find_all('h2', class_='story__title-size-five-text-black-font--playfair-display')
print(url1Section1)
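For reference, here is a small offline sketch of how BeautifulSoup matches class_, which may help narrow the problem down. The HTML snippet is made up for illustration: it assumes the real headline element carries several space-separated classes (story__title, size-five, and so on) rather than one hyphen-joined class name. I have not confirmed that this is what the live page contains.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an <h2> with four separate, space-separated classes.
# This is an assumption about the page, not a copy of its real source.
html = (
    '<h2 class="story__title size-five text-black font--playfair-display">'
    '<a href="/news/1">Headline</a></h2>'
)
soup = BeautifulSoup(html, 'html.parser')

# A single hyphen-joined string is compared against each individual class
# (and against the full attribute value), so it matches neither here.
hyphenated = soup.find_all(
    'h2', class_='story__title-size-five-text-black-font--playfair-display'
)

# Searching for one of the individual classes does match.
single = soup.find_all('h2', class_='story__title')

# A CSS selector can require several classes at once, in any order.
selected = soup.select('h2.story__title.font--playfair-display')

print(hyphenated)
print(single)
print([h.get_text(strip=True) for h in selected])
```

If the live page really does use space-separated classes, searching by one class or using select() with a dotted selector would be the thing to try. If it does not, the empty result has some other cause.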