A more concise answer, adapted to Python 3.x and using requests and bs4. The original question actually asks two things, though. First, how to obtain the HTML:
import requests
html = requests.get("http://www.infolanka.com/miyuru_gee/art/art.html").content
Second, how to obtain the list of artist names:
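import bs4
# Name the parser explicitly; otherwise bs4 picks one and emits a warning.
soup = bs4.BeautifulSoup(html, "html.parser")
artist_list = []
for link in soup.find_all("a"):
    # Artist links sit directly inside <dt> elements; skip all other links.
    if link.parent.name == "dt":
        artist_list.append(link.contents[0])
print(artist_list)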
Output:
['Aathma Liyanage',
'Abewardhana Balasuriya',
'Aelian Thilakeratne',
'Ahamed Mohideen',
'Ajantha Nakandala',
'Ajith Ambalangoda',
'Ajith Ariayaratne',
'Ajith Muthukumarana',
'Ajith Paranawithana',
...]
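
The same filtering can probably be written more compactly with a CSS selector. The sketch below is only an illustration: `sample_html` and `extract_artists` are made-up names, and the markup is a small stand-in that assumes the real page wraps each artist link directly in a `<dt>` element, as the loop above implies.

```python
import bs4

# Hypothetical sample markup standing in for the downloaded page.
sample_html = """
<dl>
  <dt><a href="a.html">Aathma Liyanage</a></dt>
  <dt><a href="b.html">Abewardhana Balasuriya</a></dt>
</dl>
<p><a href="index.html">Home</a></p>
"""

def extract_artists(html):
    """Return the text of every <a> that is a direct child of a <dt>."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    # "dt > a" matches only links whose immediate parent is a <dt>.
    return [a.get_text(strip=True) for a in soup.select("dt > a")]

artists = extract_artists(sample_html)
print(artists)
```

With the html from the first step, `extract_artists(html)` should give the same kind of list as the loop, provided the page structure matches the assumption above.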