I am new to bs4 and I am trying to extract a table of prices.

The main problem is that in the HTML page the table is not a table element but a div. I have tried looking it up by class and by id, but I cannot get the prices out.

This is what I have tried:

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "http://www.valoreazioni.com/indici/ftse-mib_ftsemib_mi"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html5lib")

Here are the filters I have tried, without success, to get the table of prices:

# First attempt
table = soup.find('div', {'id': 'maidMoneyTable'})
# Second attempt
# table = soup.find(id='maidMoneyTable')

route = pd.read_html(str(table), flavor='html5lib')
print(route)

In both cases the result is a "no tables were found" error.

Can anyone tell me how I can obtain the desired table?

JamesHudson81

1 Answer

Scrape the data from the page with BeautifulSoup, store it temporarily in a sqlite3 table, then use pandas' ability to read SQL query results to load it from sqlite3 into a DataFrame.

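>>> import sqlite3
>>> import requests
>>> import bs4
>>> import pandas as pd
>>> # Temporary in-memory table. The schema is assumed: column names are taken from the output below.
>>> conn = sqlite3.connect(':memory:')
>>> c = conn.cursor()
>>> result = c.execute('''CREATE TABLE market (url TEXT, symbol TEXT, name TEXT, item_1 REAL, item_2 REAL, item_3 REAL, item_4 TEXT)''')
>>> page = requests.get('http://www.valoreazioni.com/indici/ftse-mib_ftsemib_mi').content
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> # find() returns a single Tag; find_all() returns a ResultSet, which cannot be searched further
>>> maidMoneyTable = soup.find(id='maidMoneyTable')
>>> table_rows = maidMoneyTable.find_all('li', attrs={'class': 'order'})
>>> for row in table_rows:
...     link = row.find('a')
...     # One record per row: the link target followed by the text of each cell
...     data = [link.attrs['href']] + [_.text for _ in link.find_all('li')]
...     result = c.execute('''INSERT INTO market VALUES (?,?,?,?,?,?,?)''', data)
... 
>>> df = pd.read_sql_query('SELECT * FROM market', conn)
>>> df.head()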
                                                 url   symbol  \
0      http://www.valoreazioni.com/titoli/a2a-a2a-mi   A2A.MI   
1  http://www.valoreazioni.com/titoli/anima-holdi...  ANIM.MI   
2  http://www.valoreazioni.com/titoli/atlantia-at...   ATL.MI   
3  http://www.valoreazioni.com/titoli/azimut-hold...   AZM.MI   
4  http://www.valoreazioni.com/titoli/banca-medio...  BMED.MI   

                name  item_1  item_2  item_3   item_4  
0            A2A SpA    1.50   1.503   0.003  +0.200%  
1  ANIMA HOLDING SPA    6.26   6.210  -0.040   -0.64%  
2           ATLANTIA   25.96  25.640  -0.240   -0.93%  
3     AZIMUT HOLDING   17.94  17.930   0.060   +0.34%  
4   BANCA MEDIOLANUM    7.43   7.290  -0.150   -2.02%  
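
If the sqlite3 step is not needed, the same rows can probably be collected into a list and passed straight to pandas. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML is hypothetical markup standing in for the real page (a div with id 'maidMoneyTable' containing li.order rows, each wrapping a link with six li cells), the second sample row is invented for illustration, and rows_to_frame is a helper name made up here. It uses the built-in 'html.parser' so that it runs without lxml.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the structure assumed above; the real
# page may differ, and the second row is invented for illustration.
SAMPLE_HTML = """
<div id="maidMoneyTable">
  <ul>
    <li class="order"><a href="http://www.valoreazioni.com/titoli/a2a-a2a-mi"><ul>
      <li>A2A.MI</li><li>A2A SpA</li><li>1.50</li>
      <li>1.503</li><li>0.003</li><li>+0.200%</li>
    </ul></a></li>
    <li class="order"><a href="http://example.com/titoli/demo"><ul>
      <li>DEMO.MI</li><li>Demo SpA</li><li>2.00</li>
      <li>2.10</li><li>0.10</li><li>+5.00%</li>
    </ul></a></li>
  </ul>
</div>
"""

# Column names follow the output shown in the answer.
COLUMNS = ["url", "symbol", "name", "item_1", "item_2", "item_3", "item_4"]


def rows_to_frame(html):
    """Build a DataFrame from the li.order rows inside #maidMoneyTable."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(id="maidMoneyTable")
    if container is None:
        raise ValueError("no element with id 'maidMoneyTable' in the document")

    records = []
    for row in container.find_all("li", class_="order"):
        link = row.find("a")
        if link is None:
            continue  # skip rows that do not wrap their cells in a link
        cells = [li.get_text(strip=True) for li in link.find_all("li")]
        records.append([link.get("href")] + cells)

    return pd.DataFrame(records, columns=COLUMNS)


df = rows_to_frame(SAMPLE_HTML)
print(df)
```

All values come back as strings here; converting the numeric columns (for example with pd.to_numeric) is left as a separate step.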
Bill Bell