Is there a way to get the data out of a div container with ScraperWiki? I've got some HTML that looks something like this:

<div id="karte_data_aktuelle_temperatur___CHA" class="karte_text_hidden">
    <span style="font-size: 10px;">9.0</span>
    <br/>
</div>

and I would like to scrape both the ...CHA part of the id and the value 9.0. The value (9.0) isn't a problem, since CSS selectors can get it, but how can I get the ...CHA part?

TankorSmash
CanadaRunner

1 Answer

I realize that this uses BeautifulSoup rather than ScraperWiki, but check it out anyway.

html = r"""<div id="karte_data_aktuelle_temperatur___CHA" class="karte_text_hidden">
    <span style="font-size: 10px;">9.0</span>
    <br/>
</div>"""


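from bs4 import BeautifulSoup

# Name the parser explicitly; bs4 otherwise guesses one and emits a warning
soup = BeautifulSoup(html, 'html.parser')
elem = soup.find('div')  # first <div> in the document

print(elem['id'], 'is the id')  # the full id, ending in ___CHA
print(elem.text, 'is the value')  # 9.0, plus surrounding whitespace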
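
To get only the CHA part rather than the whole id, one option is to split the id at its last `___`. This is a minimal sketch in plain Python that assumes every id ends in three underscores followed by the station code, as in the sample above; `parse_reading` is a made-up helper name, not something BeautifulSoup or ScraperWiki provides. It takes the strings you would get from `elem['id']` and `elem.text`.

```python
# Hypothetical helper, not part of BeautifulSoup or ScraperWiki.
# Assumes ids look like 'karte_data_aktuelle_temperatur___CHA'.
def parse_reading(div_id, text, sep='___'):
    """Return (station_code, value) from a div id and its text."""
    # rpartition splits at the LAST occurrence of the separator
    _, found, code = div_id.rpartition(sep)
    if not found:
        raise ValueError('no %r in id %r' % (sep, div_id))
    # strip() drops the newlines and spaces around the number
    return code, float(text.strip())


code, value = parse_reading('karte_data_aktuelle_temperatur___CHA', '\n9.0\n\n')
print(code, value)  # CHA 9.0
```

Splitting from the right keeps this working even if the first part of the id contains `___` somewhere else.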
TankorSmash
  • Thank you, it works really well for the id, but now I can't embed the code for the value. Unfortunately I don't know BeautifulSoup. Where do I have to call which function to get the value? – CanadaRunner Apr 14 '13 at 17:35
  • @CanadaRunner You mean getting the value of `9.0`? Easy! `elem.text` or append a `strip()` call there and you'll get rid of the whitespace. – TankorSmash Apr 14 '13 at 19:01
  • Thanks for the fast answer! The problem is that it works perfectly for the initial HTML string. I then wrote a loop with the help of the documentation [link](https://scraperwiki.com/docs/python/python_intro_tutorial/) to scrape all of the data, but there it doesn't work anymore! I changed every parameter in the CSS selector, but no success! Here is the entire site: [link](http://www.meteoschweiz.admin.ch/web/de/wetter/aktuelles_wetter.par0001.html) Can you help me again? – CanadaRunner Apr 14 '13 at 21:31
  • @CanadaRunner I'd love to be more help, but the question is a little broad. Like I said, no experience with `scraperwiki`. A good trick in general is to identify what elements you're looking for in a browser, then the specific class and id, if you need them. Sorry I couldn't've been more help. Maybe make a new question with a specific question? – TankorSmash Apr 14 '13 at 21:34
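
For the loop over a whole page, the "find the class and id first" advice can be sketched roughly like this. To keep it runnable without extra installs it uses the standard library's `html.parser` instead of BeautifulSoup or ScraperWiki. It assumes every value sits in a div with class `karte_text_hidden` and an id ending in `___` plus the station code, which is only known from the single sample in the question. `HiddenDivParser`, the second station `XYZ` and its value are invented for illustration, and the real page may be structured differently.

```python
from html.parser import HTMLParser


class HiddenDivParser(HTMLParser):
    """Collect {div id: text} for every div that has the given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.readings = {}
        self._current_id = None  # id of the matching div being read
        self._depth = 0          # divs nested inside the matching div

    def handle_starttag(self, tag, attrs):
        if tag != 'div':
            return
        if self._current_id is not None:
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        if self.css_class in classes and attrs.get('id'):
            self._current_id = attrs['id']
            self.readings[self._current_id] = ''

    def handle_endtag(self, tag):
        if tag != 'div' or self._current_id is None:
            return
        if self._depth:
            self._depth -= 1
        else:
            self._current_id = None  # left the matching div

    def handle_data(self, data):
        if self._current_id is not None:
            self.readings[self._current_id] += data.strip()


# Made-up sample: the first div is from the question, the rest is invented.
sample = """
<div id="karte_data_aktuelle_temperatur___CHA" class="karte_text_hidden">
    <span style="font-size: 10px;">9.0</span><br/>
</div>
<div id="karte_data_aktuelle_temperatur___XYZ" class="karte_text_hidden">
    <span style="font-size: 10px;">7.5</span><br/>
</div>
<div id="other" class="visible">ignore me</div>
"""

parser = HiddenDivParser('karte_text_hidden')
parser.feed(sample)
for div_id, text in parser.readings.items():
    # Everything after the last '___' is taken as the station code
    print(div_id.rpartition('___')[2], text)
```

If the loop finds nothing on the live page, it is worth checking in the browser whether the divs are present in the raw HTML or only added later by JavaScript.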