I'm trying to scrape a table in Python using BeautifulSoup.
I've managed to scrape all the other data, but for some reason when I use .text on the fixture cell, it prints both the cell's own text and the text inside its nested span tag. The span text is not needed there.
I've tried .string and .text.text, but neither works.
Can anyone spot the problem here?
Here is my code:
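import urllib2
import MySQLdb
from bs4 import BeautifulSoup  # assuming BeautifulSoup 4

soup = BeautifulSoup(urllib2.urlopen('http://www.livefootballontv.com/').read())

# Open the connection once instead of once per row
db = MySQLdb.connect("127.0.0.1", "root", "", "footballapp")
cursor = db.cursor()
sql = "INSERT INTO TVGuide(DATE, FIXTURE, COMPETITION, KICKOFF, CHANNELS) VALUES (%s,%s,%s,%s,%s)"

# Each <ul> inside div#tv-guide is one fixture; its <li> children are the columns
for row in soup('div', {'id': 'tv-guide'})[0]('ul'):
    tds = row('li')
    print tds[0].string, tds[1].text, tds[1].span.string, tds[2].string, tds[3].img['alt'], '\n'
    # tds[1].text is the problem value: it also contains the nested <span> text
    results = (str(tds[0].string), str(tds[1].text), str(tds[1].span.string), str(tds[2].string), str(tds[3].img['alt']))
    try:
        cursor.execute(sql, results)
        db.commit()
    except MySQLdb.Error:
        # Undo the failed insert only
        db.rollback()

db.close()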
This is the output I get:
Sunday 22 June 2014 USA vs PortugalBrasil World Cup 2014 Group G Brasil World Cup 2014 Group G 11:00pm BBC1
Tuesday 24 June 2014 Costa Rica vs EnglandBrasil World Cup 2014 Group D Brasil World Cup 2014 Group D 5:00pm ITV
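To narrow it down, here is a minimal sketch that seems to reproduce the behaviour without hitting the site. The HTML string is hypothetical: it is inferred from the output above, and the real page's markup may differ. It assumes BeautifulSoup 4 and the built-in 'html.parser'. It compares .text, .string, and the cell's first direct child (contents[0]), which may be one way to get the fixture name on its own.

```python
from bs4 import BeautifulSoup

# Hypothetical markup inferred from the printed output; the real page may differ.
html = (
    '<ul>'
    '<li>Sunday 22 June 2014</li>'
    '<li>USA vs Portugal<span>Brasil World Cup 2014 Group G</span></li>'
    '</ul>'
)

soup = BeautifulSoup(html, 'html.parser')
cell = soup('li')[1]

# .text joins every text node under the tag, including the nested <span>.
all_text = cell.text

# .string is None here because the tag has more than one child.
single = cell.string

# The first direct child is the text node that precedes the <span>.
fixture = cell.contents[0].strip()

# The competition is still available separately from the <span>.
competition = cell.span.string

print(all_text)     # USA vs PortugalBrasil World Cup 2014 Group G
print(single)       # None
print(fixture)      # USA vs Portugal
print(competition)  # Brasil World Cup 2014 Group G
```

If the real cells have the same shape, tds[1].contents[0] could stand in for tds[1].text in the loop. That only holds as long as the fixture name is the first child of the li, which I have not verified against the live page.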